Mortality Predictions on Admission as a Context for Organizing Care Activities

Mark E. Cowen, MD, SM; Robert L. Strawderman, ScD; Jennifer L. Czerwinski, BA; Mary Jo Smith, RN, MS; Lakshmi K. Halasyamani, MD


Journal of Hospital Medicine. 2013;8(5):229-235. 


The prediction rule was derived from data on all inpatients (n = 56,003) 18 to 99 years old from St. Joseph Mercy Hospital, Ann Arbor from 2008 to 2009. This is a community-based, tertiary-care center. We reference derivation cases as D1, validation cases from the same hospital in the following year (2010) as V1, and data from a second hospital in 2010 as V2. The V2 hospital belonged to the same parent health corporation and shared some physician specialists with D1 and V1 but had separate medical and nursing staff.

The primary outcome predicted is 30-day mortality from the time of admission. We chose 30-day rather than in-hospital mortality to address concerns about potential confounding between the duration of hospital stay and the likelihood of dying in the hospital.[23] Risk factors were considered for inclusion in the prediction rule based on their prevalence, conceptual relevance, and univariable association with death (details provided in the Supporting information, Appendix I and II, in the online version of this article). The types of risk factors considered were patient diagnoses as of the time of admission, obtained from hospital administrative data and grouped by the 2011 Clinical Classifications Software (accessed June 6, 2012); administrative data from previous hospitalizations within the health system in the preceding 12 months; and the worst value of clinical laboratory blood tests obtained within 30 days prior to the time of admission. When a given patient had missing values for the laboratory tests of interest, we imputed a "normal" value, assuming the clinician had not ordered these tests because he or she expected the patient would have normal results. The imputed normal values were derived from available results from patients discharged alive with short hospital stays (≤3 days) in 2007 to 2008. The datasets were built and analyzed using SAS versions 9.1 and 9.2 (SAS Institute, Inc., Cary, NC) and R (R Foundation for Statistical Computing, Vienna, Austria).
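The imputation step described above can be illustrated with a minimal Python sketch. It is not the authors' SAS or R code. The laboratory names, the example values, and the use of the median as the summary of the reference cohort are assumptions made for illustration; the text states only that the normal values were derived from available results in short-stay patients discharged alive.

```python
from statistics import median

def reference_normals(reference_rows, lab_names):
    """Derive one 'normal' value per laboratory test from a reference cohort.

    Assumption: the median of the available (non-missing) results is used.
    """
    normals = {}
    for lab in lab_names:
        observed = [row[lab] for row in reference_rows if row.get(lab) is not None]
        normals[lab] = median(observed)
    return normals

def impute_missing(patient, normals):
    """Return a copy of the patient record with missing labs set to 'normal'."""
    filled = dict(patient)
    for lab, value in normals.items():
        if filled.get(lab) is None:
            filled[lab] = value
    return filled

# Hypothetical reference cohort: discharged alive, hospital stay <= 3 days.
reference = [
    {"creatinine": 0.8, "albumin": 4.1},
    {"creatinine": 1.0, "albumin": None},
    {"creatinine": 0.9, "albumin": 3.9},
]
normals = reference_normals(reference, ["creatinine", "albumin"])

# A new admission with no creatinine result on file; albumin was measured.
patient = impute_missing({"creatinine": None, "albumin": 2.7}, normals)
```

Measured values are left untouched; only the tests that were never ordered receive the reference value.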

Prediction Rule Derivation Using D1 Dataset

Random forest procedures with a variety of variable importance measures were used with D1 data to reduce the number of potential predictor variables.[24] Model-based recursive partitioning, a technique that combines features of multivariable logistic regression and classification and regression trees, was then used to develop the multivariable prediction model.[25,26] Model building was done in R, employing functions provided as part of the randomForest and party packages. The final prediction rule consisted of 4 multivariable logistic regression models, each specific to 1 of 4 population subgroups: females with previous hospitalizations, females without previous hospitalizations, males with previous hospitalizations, and males without previous hospitalizations. Each logistic regression model contains exactly the same predictor variables; however, the regression coefficients are subgroup specific. Therefore, the predicted probability of 30-day mortality for a patient with a given set of predictor values depends on the subgroup to which the patient belongs.
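The structure of the final rule (the same predictors in every subgroup, with subgroup-specific coefficients) can be sketched in Python as follows. The predictor names and all coefficient values are hypothetical and chosen only to show the mechanics; they are not the published scoring weights.

```python
import math

# Hypothetical coefficients for illustration only.
# Keys: (sex, has_previous_hospitalization).
COEFFICIENTS = {
    ("F", True):  {"intercept": -4.0, "age_decades": 0.30, "low_albumin": 0.9},
    ("F", False): {"intercept": -4.6, "age_decades": 0.35, "low_albumin": 1.1},
    ("M", True):  {"intercept": -3.8, "age_decades": 0.28, "low_albumin": 0.8},
    ("M", False): {"intercept": -4.4, "age_decades": 0.33, "low_albumin": 1.0},
}

def predict_mortality(sex, previous_hospitalization, predictors):
    """Predicted probability of 30-day mortality from the subgroup's model."""
    weights = COEFFICIENTS[(sex, previous_hospitalization)]
    linear = weights["intercept"] + sum(
        weights[name] * value for name, value in predictors.items()
    )
    # Inverse logit converts the linear predictor to a probability.
    return 1.0 / (1.0 + math.exp(-linear))

# The same predictor values yield different risks in different subgroups.
values = {"age_decades": 7, "low_albumin": 1}
risk_female_prior = predict_mortality("F", True, values)
risk_male_no_prior = predict_mortality("M", False, values)
```

Selecting the coefficient set by subgroup first, and only then computing the linear predictor, mirrors the way a recursive-partitioning split sits on top of the logistic models.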

Validation, Discrimination, Calibration

The prediction rule was validated by generating a predicted probability of 30-day mortality for each patient in V1 and V2, using their observed risk factor information combined with the scoring weights (ie, regression coefficients) derived from D1, then comparing predicted vs actual outcomes. Discriminatory accuracy is reported as the area under the receiver operating characteristic (ROC) curve, which can range from 0.5, indicating pure chance, to 1.0, indicating perfect prediction.[27] Values above 0.8 are often interpreted as indicating strong predictive relationships, values between 0.7 and 0.79 as modest, and values between 0.6 and 0.69 as weak.[28] Model calibration was tested in all datasets across 20 intervals representing the spectrum of mortality risk, by assessing whether or not the 95% confidence limits for the actual proportion of patients dying encompassed the mean predicted mortality for the interval. These 20 intervals were defined using 5-percentile increments of the probability of dying for D1. The use of intervals based on D1 percentiles ensures similarity in the level of predicted risk within an interval for V1 and V2, while allowing the proportion of patients contained within that interval to vary across hospitals.
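The discrimination and calibration checks described above can be sketched in Python as follows. Two choices here are assumptions: the Wilson score interval is used for the 95% confidence limits (the text does not name a method), and the percentile cut points use a simple rank rule. The function names are illustrative.

```python
import math
from bisect import bisect_right

def auc(probabilities, died):
    """Area under the ROC curve: the chance that a randomly chosen death
    received a higher prediction than a randomly chosen survivor
    (ties count one half)."""
    deaths = [p for p, d in zip(probabilities, died) if d]
    survivors = [p for p, d in zip(probabilities, died) if not d]
    wins = sum((p > q) + 0.5 * (p == q) for p in deaths for q in survivors)
    return wins / (len(deaths) * len(survivors))

def percentile_cutpoints(d1_probabilities, n_intervals=20):
    """Interior cut points at 5-percentile increments of D1 predicted risk."""
    ordered = sorted(d1_probabilities)
    n = len(ordered)
    return [ordered[(k * n) // n_intervals] for k in range(1, n_intervals)]

def wilson_interval(events, n, z=1.96):
    """95% Wilson score interval for a proportion (assumed CI method)."""
    p = events / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

def calibration_table(probabilities, died, cutpoints):
    """Per interval: mean predicted risk vs the CI of the observed death rate."""
    groups = {}
    for p, d in zip(probabilities, died):
        groups.setdefault(bisect_right(cutpoints, p), []).append((p, d))
    rows = []
    for interval in sorted(groups):
        members = groups[interval]
        n = len(members)
        mean_predicted = sum(p for p, _ in members) / n
        events = sum(1 for _, d in members if d)
        lower, upper = wilson_interval(events, n)
        rows.append({
            "interval": interval,
            "n": n,
            "mean_predicted": mean_predicted,
            "observed": events / n,
            "calibrated": lower <= mean_predicted <= upper,
        })
    return rows

# Toy data: four patients, two of whom died.
example_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

Because the cut points come from D1 only, a validation hospital with a sicker population simply places more patients in the upper intervals; the risk level that defines each interval does not move.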

Relationships With Other Adverse Events

We then used each patient's calculated probability of 30-day mortality to predict the occurrence of other adverse events. We first derived scoring weights (ie, regression parameter estimates) from logistic regression models designed to relate each secondary outcome to the predicted 30-day mortality using D1 data. These scoring weights were then applied to the V1 and V2 patients' predicted 30-day mortality to generate their predicted probabilities for: in-hospital death, a stay in an intensive care unit (ICU) at some point during the hospitalization, the occurrence of a condition not present on admission (a "complication," see the Supporting information, Appendix I, in the online version of this article), palliative care status at the time of discharge (International Classification of Diseases, 9th Revision code V66.7), 30-day readmission, and death within 180 days (determined for the first hospitalization of the patient in the calendar year, using hospital administrative data and the Social Security Death Index). Additionally, for V1 patients but not V2 patients, for whom the data were unavailable, we predicted the occurrence of an unplanned transfer to an ICU within the first 24 hours for those not initially admitted to an ICU, and resuscitative efforts for cardiopulmonary arrests ("code blue," as determined from hospital paging records and resuscitation documentation, with the realization that some resuscitations within the ICUs might be undercaptured by this approach). Predicted vs actual outcomes were assessed using SAS version 9.2 by examining the areas under the ROC curves generated by PROC LOGISTIC.
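The two-step approach described above (derive weights on D1, then apply them unchanged to validation patients) can be sketched in Python as follows. The sketch assumes that the predicted mortality enters the secondary model untransformed as a single covariate, which the text does not specify, and it uses a hand-written Newton-Raphson fit rather than SAS PROC LOGISTIC. The data are invented.

```python
import math

def fit_logistic(x, y, iterations=25):
    """Newton-Raphson fit of logit P(y = 1) = a + b * x; returns (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w                # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

def apply_weights(a, b, predicted_mortality):
    """Probability of the secondary outcome given predicted 30-day mortality."""
    return 1.0 / (1.0 + math.exp(-(a + b * predicted_mortality)))

# Hypothetical D1 data: predicted mortality and whether an ICU stay occurred.
d1_predicted = [0.01, 0.02, 0.05, 0.10, 0.20, 0.40, 0.60, 0.80]
d1_icu_stay = [0, 0, 1, 0, 0, 1, 1, 1]
a, b = fit_logistic(d1_predicted, d1_icu_stay)

# The D1 weights are applied, without refitting, to V1 predictions.
v1_predicted = [0.03, 0.25, 0.70]
v1_icu_probability = [apply_weights(a, b, p) for p in v1_predicted]
```

Keeping the weights fixed after the D1 fit is what makes the V1 and V2 comparisons a validation rather than a second round of model fitting.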

Implications for Care Redesign

To illustrate how the mortality prediction provides a context for organizing the work of multiple health professionals, we created 5 risk strata[10] based on quintiles of D1 mortality risk. To display the time frame in which the peak risk of death occurs, we plotted the unadjusted hazard function for each stratum using SAS PROC LIFETEST.
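The stratification and hazard display described above can be sketched in Python as follows. The hazard estimate is a simple actuarial (life-table) calculation with fixed-width intervals. This is an assumption, as are the interval width and the example follow-up times; the text states only that PROC LIFETEST was used.

```python
from bisect import bisect_right

def quintile_cutpoints(d1_probabilities):
    """Four cut points that split D1 predicted risk into quintiles."""
    ordered = sorted(d1_probabilities)
    n = len(ordered)
    return [ordered[(k * n) // 5] for k in range(1, 5)]

def stratum(probability, cutpoints):
    """Risk stratum from 1 (lowest risk) to 5 (highest risk)."""
    return bisect_right(cutpoints, probability) + 1

def life_table_hazard(times, died, width, n_intervals):
    """Unadjusted actuarial hazard per interval for one stratum.

    hazard = deaths / (width * (at_risk - censored / 2 - deaths / 2))
    """
    hazards = []
    at_risk = len(times)
    for i in range(n_intervals):
        lo, hi = i * width, (i + 1) * width
        deaths = sum(1 for t, d in zip(times, died) if lo <= t < hi and d)
        censored = sum(1 for t, d in zip(times, died) if lo <= t < hi and not d)
        denom = width * (at_risk - censored / 2 - deaths / 2)
        hazards.append(deaths / denom if denom > 0 else 0.0)
        at_risk -= deaths + censored
    return hazards

# Hypothetical D1 risks define the strata.
cutpoints = quintile_cutpoints([i / 100 for i in range(100)])

# Hypothetical follow-up (days) for the ten patients of one stratum.
times = [2, 5, 12, 20, 40, 40, 40, 40, 40, 40]
died = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
hazards = life_table_hazard(times, died, width=10, n_intervals=3)
```

Plotting the `hazards` list for each of the 5 strata on a common time axis shows when, after admission, the risk of death peaks in each group.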