Clinical Examination for the Prediction of Mortality in the Critically Ill

The Simple Intensive Care Studies-I

Bart Hiemstra, MD; Ruben J. Eck, MD; Renske Wiersema, BSc; Thomas Kaufmann, MD; Geert Koster, MD; Thomas W.L. Scheeren, MD, PhD; Harold Snieder, PhD; Anders Perner, MD, PhD; Ville Pettilä, MD, PhD; Jørn Wetterslev, MD, PhD; Frederik Keus, MD, PhD; Iwan C.C. van der Horst, MD, PhD; SICS Study Group


Crit Care Med. 2019;47(10):1301-1309. 

Materials and Methods

Design, Setting, and Patients

The prospective, observational, single-center Simple Intensive Care Studies-I (SICS-I) was conducted following a prewritten protocol and statistical analysis plan (SAP; see Supplement 1, Supplemental Digital Content 1, or NCT02912624). All consecutive patients admitted to the ICU of the University Medical Center Groningen were eligible for inclusion. Adult patients who had an unplanned ICU admission and were expected to stay for at least 24 hours were included. Patients were excluded if their ICU admission was planned preoperatively, if acquiring research data interfered with clinical care due to continuous resuscitation efforts (e.g., mechanical circulatory support), or if informed consent was not provided. In unresponsive patients, informed consent was first obtained from the legal representatives and later from the patient if the patient recovered consciousness. If the patient died before consent was obtained, the study data were used and the legal representatives were informed about the study. The local institutional review board approved the study (M15.168207).

All included patients underwent clinical examination followed by CCUS within the first 24 hours of their ICU admission (eTable 2, Supplemental Digital Content 2). Researchers conducted the clinical and CCUS examinations, and their findings were not revealed to caregivers.

Clinical Examination

All clinical examinations were standardized, and cutoff values for abnormal clinical signs were predefined in the protocol (NCT02912624). A total of 19 clinical signs per patient were recorded (eTable 1, Supplemental Digital Content 2). Respiratory rate, heart rate and rhythm, arterial blood pressures, and central venous pressures were recorded from the bedside monitor. Patients were auscultated for the presence of cardiac murmurs and crepitations. Clinical signs reflecting organ perfusion were obtained from the three organs readily accessible to clinical examination: cerebral (mental status), renal (urine output), and skin perfusion (CRT, central-to-peripheral temperature difference [ΔTc-p], and skin mottling). Mental status was assessed according to the categories "Alert," "responsive to Voice," "responsive to Pain," and "Unresponsive" and was scored irrespective of sedation use. Urine output was scored 1 and 6 hours prior to the clinical examination, adjusted for body weight, and considered decreased if less than 0.5 mL/kg/hr. CRT was the time for skin color to fully return after applying firm pressure at the sternum, index finger, and knee for 15 seconds and was considered prolonged if greater than 4.5 seconds.[16] ΔTc-p was the difference between central temperature measured by a bladder thermistor catheter and peripheral temperature measured by a skin probe on the big toe and dorsum of the foot and was considered abnormal if greater than 7°C.[17,18] The degree of skin mottling was rated at the knee according to a score from 0 to 5, where 0–1 was regarded as mild, 2–3 as moderate, and 4–5 as severe mottling.[19]
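The predefined cutoffs for the perfusion signs can be illustrated with a minimal Python sketch. It is not part of the study protocol or its software; the function and parameter names are hypothetical, and only the thresholds (0.5 mL/kg/hr, 4.5 seconds, 7°C, and the mottling score bands) are taken from the text above.

```python
def urine_output_decreased(volume_ml: float, weight_kg: float, hours: float) -> bool:
    """Decreased if weight-adjusted urine output is below 0.5 mL/kg/hr."""
    if weight_kg <= 0 or hours <= 0:
        raise ValueError("weight_kg and hours must be positive")
    return volume_ml / weight_kg / hours < 0.5


def crt_prolonged(crt_seconds: float) -> bool:
    """Prolonged if capillary refill takes longer than 4.5 seconds."""
    return crt_seconds > 4.5


def delta_tcp_abnormal(central_celsius: float, peripheral_celsius: float) -> bool:
    """Abnormal if the central-to-peripheral difference exceeds 7 degrees Celsius."""
    return central_celsius - peripheral_celsius > 7.0


def mottling_category(score: int) -> str:
    """Map the 0 to 5 mottling score to the mild/moderate/severe bands."""
    if score not in range(6):
        raise ValueError("mottling score must be an integer from 0 to 5")
    if score <= 1:
        return "mild"
    if score <= 3:
        return "moderate"
    return "severe"
```

For example, a hypothetical 80-kg patient producing 200 mL of urine over 6 hours has an output of about 0.42 mL/kg/hr, which this sketch flags as decreased.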

Outcome Definition

The primary outcome (dependent) variable was 90-day all-cause mortality obtained through the municipal record database. Sensitivity analyses were conducted using all-cause mortality at 7- and 30-day follow-ups.

Sample Size and Missing Data

The sample size was based on the estimation that half of the number of acute ICU admissions per year (N = 1,500) would fulfill the inclusion criteria. The potentially detectable difference was calculated using skin mottling as an example for the case that inclusion exceeded 1,000 patients: a mortality difference of 9% for skin mottling could be detected with 84% power and a maximal type 1 error risk of 0.015.[20] Missing values were considered missing at random because missingness depended on other observed patient characteristics (such as age and mechanical ventilation) and Little's test was significant.[21] Multiple imputation (20 imputed datasets) was conducted for missing data, and parameter estimates and standard errors were combined using Rubin's formula.[22,23]
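The pooling step can be sketched as follows. This is an illustrative Python version of the standard Rubin combination (pooled estimate as the mean across imputations; total variance as the within-imputation variance plus (1 + 1/m) times the between-imputation variance); the study itself used Stata, and the function name and inputs here are hypothetical.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float], std_errors: Sequence[float]) -> Tuple[float, float]:
    """Combine one coefficient across m imputed datasets.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two imputations with matching lengths")
    pooled = mean(estimates)                          # average of the m estimates
    within = mean([se ** 2 for se in std_errors])     # mean within-imputation variance
    between = variance(estimates)                     # sample variance across imputations
    total = within + (1 + 1 / m) * between            # total variance
    return pooled, math.sqrt(total)
```

With 20 imputations, as in this study, `estimates` and `std_errors` would each hold 20 values per model coefficient. The sketch omits the degrees-of-freedom adjustment that is usually applied when constructing confidence intervals from pooled results.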

Analytical Approach

The aims of our primary analyses were twofold: first, a multivariable logistic regression analysis was conducted to identify the clinical examination findings that independently predict mortality at 90-day follow-up and, second, the discriminative performance of this model was compared with that of the Simplified Acute Physiology Score-II (SAPS-II), Acute Physiology and Chronic Health Evaluation-IV (APACHE-IV), and Sequential Organ Failure Assessment (SOFA). Analyses were conducted with Stata Version 15.1 (StataCorp, College Station, TX, USA) on the imputed dataset following our published SAP (Supplement 1, Supplemental Digital Content 1).

Model Development and Validation. Unadjusted and age- and sex-adjusted regression analyses were conducted on 19 clinical signs. A p-value threshold of less than 0.25 was used for inclusion in the multivariable model, which was constructed using forward stepwise regression by adding blocks of variables. The multivariable model was adjusted for age (covariate) and norepinephrine infusion rate (mediator) on the pathophysiological rationale that norepinephrine alters most clinical signs. The final model was internally validated with bootstrap sampling. For bootstrap sampling, 1,075 patients were repeatedly drawn with replacement from the imputed dataset (for a more in-depth explanation, see eFigure 1, Supplemental Digital Content 2). In total, 100 bootstrap samples were drawn, and the final model was reconstructed in each sample. Each variable from the final model was considered internally validated if it was significant in at least 80 of the 100 bootstrapped models.[20,24] Calibration of the multivariable models was checked with calibration plots and Hosmer–Lemeshow tests. Discrimination of the final model was evaluated with receiver operating characteristic (ROC) curves.[25] Dominance analysis was used to determine the relative importance of independent variables in each multivariable model.[26] Our multivariable model was compared with the SAPS-II (reference model), APACHE-IV, and SOFA scores by 1) analyzing differences between the areas under the ROC curves (AUC) using the method proposed by DeLong et al[27] and 2) constructing reclassification tables and calculating the net reclassification improvement.[28]
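The bootstrap validation rule (significant in at least 80 of 100 resampled models) can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's Stata code: `refit_significant` is a hypothetical callback standing in for "reconstruct the final model in this sample and return the names of the significant variables," and `toy_refit` is a deliberately crude placeholder rather than a logistic regression.

```python
import random
from typing import Callable, Dict, Iterable, List, Sequence, Tuple


def bootstrap_validate(
    patients: Sequence[dict],
    refit_significant: Callable[[List[dict]], Iterable[str]],
    n_boot: int = 100,
    min_hits: int = 80,
    seed: int = 0,
) -> Dict[str, Tuple[int, bool]]:
    """Count how often each variable is significant across bootstrap refits."""
    n = len(patients)
    if n == 0:
        raise ValueError("patients must not be empty")
    rng = random.Random(seed)
    hits: Dict[str, int] = {}
    for _ in range(n_boot):
        # Draw n patients with replacement from the original cohort.
        sample = [patients[rng.randrange(n)] for _ in range(n)]
        for name in set(refit_significant(sample)):
            hits[name] = hits.get(name, 0) + 1
    # A variable counts as internally validated if it reaches min_hits.
    return {name: (count, count >= min_hits) for name, count in hits.items()}


def toy_refit(sample: List[dict]) -> Iterable[str]:
    """Hypothetical stand-in for model refitting (NOT a regression)."""
    names = {"age"}  # pretend age is significant in every refit
    mottled = [p["died"] for p in sample if p["mottling"]]
    clear = [p["died"] for p in sample if not p["mottling"]]
    # Crude placeholder rule: higher raw mortality among mottled patients.
    if mottled and clear and sum(mottled) / len(mottled) > sum(clear) / len(clear):
        names.add("mottling")
    return names


# Synthetic cohort for illustration only; no study data are used.
cohort = [{"mottling": i % 3 == 0, "died": i % 4 == 0} for i in range(60)]
result = bootstrap_validate(cohort, toy_refit)
```

In a real analysis the callback would refit the multivariable model and apply the study's significance threshold; the resampling loop and the 80-of-100 rule are the only parts taken from the text.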

Sensitivity and Subgroup Analyses. In sensitivity analyses, we assessed whether the statistically significant predictors of 90-day mortality were also predictive of 7- and 30-day mortalities. Time dependency was also investigated by conducting a multivariable Cox regression analysis on 90-day mortality.

Two planned subgroup analyses were conducted, in which only the clinical examination findings that were statistically significant in the primary analysis were evaluated. First, patients were stratified by vasopressor use. Second, patients were stratified by underlying pathology that could influence the clinical measurements, that is, acute liver failure or post-orthotopic liver transplantation (OLT), heart failure, septic shock, cardiac arrest, and CNS pathology.

Statistical Significance. The SICS-I was designed to address multiple hypotheses on six different outcomes, and therefore, the mortality outcome was adjusted for multiple hypothesis testing.[29] Supplement 1 (Supplemental Digital Content 1) contains the details of our SAP, but in short, a p-value of less than 0.015 indicated statistical significance and p-values between 0.015 and 0.05 indicated suggestive significance with an increased family-wise error rate.[20,30] For our secondary (subgroup) analyses, a p-value of less than 0.05 indicated statistical significance because of their hypothesis-generating purpose. Accordingly, primary analyses are presented with 98.5% CIs and secondary (subgroup) analyses with 95% CIs.
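The thresholds and matching confidence levels can be illustrated with a short Python sketch. The handling of p-values exactly at 0.015 or 0.05 and the use of a Wald interval on the log-odds scale are assumptions made for illustration; the text does not specify either, and the function names are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Tuple

PRIMARY_ALPHA = 0.015    # primary analyses, adjusted for multiple outcomes
SECONDARY_ALPHA = 0.05   # secondary (subgroup) analyses


def classify_p(p: float, primary: bool = True) -> str:
    """Label a p-value under the thresholds described in the text."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if primary:
        if p < PRIMARY_ALPHA:
            return "significant"
        if p < SECONDARY_ALPHA:
            return "suggestive"  # boundary handling is an assumption
        return "not significant"
    return "significant" if p < SECONDARY_ALPHA else "not significant"


def wald_ci_odds_ratio(beta: float, se: float, alpha: float) -> Tuple[float, float]:
    """Two-sided (1 - alpha) Wald interval for an odds ratio (assumed method)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 2.43 for alpha = 0.015
    return math.exp(beta - z * se), math.exp(beta + z * se)
```

Calling `wald_ci_odds_ratio(beta, se, PRIMARY_ALPHA)` yields a 98.5% interval and `wald_ci_odds_ratio(beta, se, SECONDARY_ALPHA)` a 95% interval, mirroring how the primary and secondary analyses are reported.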

Amendments to the SAP. For our primary analysis, multivariable logistic regression analyses were used instead of Cox regression because the outcome (90-day mortality) was fixed, time to event was considered less relevant, and our statistical methods would be more in line with those in the literature. Findings of the Cox regression analyses are reported in Supplement 1 (Supplemental Digital Content 1) and Supplement 2 (Supplemental Digital Content 2).

We originally intended to conduct a multivariable regression analysis of clinical examination findings adjusted for the SAPS-II. Because the SAPS-II itself contains various clinical examination findings, we realized that such a model would have little clinical relevance and instead used this score as the reference model. We compared the performance of our clinical examination model with that of the SAPS-II, APACHE-IV, and SOFA.