A Comparative Analysis of Sepsis Identification Methods in an Electronic Database

Alistair E. W. Johnson, DPhil; Jerome Aboab, MD, PhD; Jesse D. Raffa, PhD; Tom J. Pollard, PhD; Rodrigo O. Deliberato, MD, PhD; Leo A. Celi, MD, MPH; David J. Stone, MD


Crit Care Med. 2018;46(4):494-499. 



Large-scale identification of sepsis in EHRs currently rests, in most cases, on administrative coding performed primarily for billing. In contrast, the Sepsis-3 criteria assemble contributory data elements based on physiology, via the SOFA score, and on clinical practice, via the definition of suspicion of infection.

We calculated the AUROC of SOFA for both the primary outcome (hospital mortality) and the secondary outcome (composite outcome of ICU LOS greater than or equal to 3 d or hospital mortality). Our reported AUROC of 0.74 for SOFA against in-hospital mortality is similar to that of Seymour et al (0.74)[4] and Raith et al (0.75),[5] though it is worth noting that our AUROC of SOFA against the secondary outcome (0.69) was lower than that of Raith et al (0.74).[5] These results give confidence in our replication of the Sepsis-3 criteria.
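As an illustration of the quantity being compared here, the AUROC can be read as the probability that a randomly chosen patient who died has a higher SOFA score than a randomly chosen survivor, with ties counted as one half. The sketch below computes it directly from that definition. It is not the code used in the study; the function name `auroc` and the toy scores and outcomes are hypothetical.

```python
def auroc(scores, labels):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    scores: numeric severity scores (for example, SOFA), one per patient.
    labels: 1 if the outcome occurred (for example, hospital death), else 0.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, not drawn from the study cohort.
sofa = [2, 3, 5, 8, 1, 9, 4, 6]
died = [0, 0, 1, 1, 0, 1, 0, 0]
print(round(auroc(sofa, died), 3))  # 0.933
```

The pairwise form is quadratic in cohort size and is shown only for clarity; a rank-based implementation would be preferable for cohorts of the size reported here.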

We found important disparities in the identification of sepsis using the various approaches. When examining different methodologies for retrospectively identifying patients with sepsis, cohort sizes varied from small (explicit: 1,062 patients, 9.0% of the entire cohort) to almost half of all patients (Sepsis-3: 5,784 patients, 49.1% of the entire cohort). Among purely administrative definitions (Angus, Martin, explicit), we found similar disparities (Figure 1). Iwashyna et al[15] assessed variance in cohort sizes for only these administrative criteria and further performed an expert chart review of a subset of these records. The authors found that the explicit criteria identified a pure cohort (100% positive predictive value) but missed the vast majority of septic patients (9.2% sensitivity). Iwashyna et al[15] also found that the Angus methodology identified a larger population of septic patients (50.3% sensitivity) but at a cost of fidelity (70.7% PPV). Our results appear consistent with the conclusions of Iwashyna et al[15] in that mortality rate ran roughly in reverse order of cohort size, and we are able to extend their results to the CMS, CDC, and Sepsis-3 criteria. This could imply that the more restrictive cohorts represent sicker, higher risk segments of the population. Sepsis-3 identified the largest cohort in our study, and this cohort mostly encapsulated those identified by other criteria (Figure 2). Only 4.8% of patients were identified by the approaches of Angus and Martin but not by Sepsis-3 (Figure 2A), and only 2.2% were identified only by the methodologies from CMS and CDC (Figure 2B). We posit that Sepsis-3, in general, identifies a larger and likely less "pure" cohort of septic patients, but one that still remains at higher risk of mortality (14.5% vs 7.3%) and higher risk of composite mortality/excess LOS (50.0% vs 21.9%).
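The trade-off described above between cohort size and cohort purity can be made concrete with sensitivity and positive predictive value computed against a chart-review reference standard. The following is a minimal sketch under that assumption; the function name `sensitivity_and_ppv` and the patient identifiers are hypothetical and do not reproduce the data of Iwashyna et al or of this study.

```python
def sensitivity_and_ppv(flagged, truly_septic):
    """Return (sensitivity, PPV) of a sepsis definition.

    flagged:      patient IDs identified by the definition under test.
    truly_septic: patient IDs judged septic by the reference standard
                  (assumed here to be expert chart review).
    """
    flagged, truly_septic = set(flagged), set(truly_septic)
    true_positives = len(flagged & truly_septic)
    sensitivity = true_positives / len(truly_septic) if truly_septic else float("nan")
    ppv = true_positives / len(flagged) if flagged else float("nan")
    return sensitivity, ppv


# Hypothetical toy cohort: ten patients are septic on chart review.
chart_review = set(range(1, 11))
narrow_definition = {1}                        # restrictive, "pure" cohort
broad_definition = {1, 2, 3, 4, 5, 11, 12}     # larger, less pure cohort

print(sensitivity_and_ppv(narrow_definition, chart_review))  # (0.1, 1.0)
print(sensitivity_and_ppv(broad_definition, chart_review))
```

In this toy example the narrow definition is perfectly pure but misses most septic patients, while the broad definition finds half of them at the cost of two false positives, which mirrors the pattern reported for the explicit and Angus criteria.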

Another advantage of the Sepsis-3 criterion is the temporal context it provides. All other criteria used billing (ICD-9) codes which are typically assigned on hospital discharge and are not time stamped within the stay. Consequently, these administrative criteria are only capable of identifying sepsis for entire hospitalizations and cannot be used to assess the time course of the disease. In contrast, the algorithm for Sepsis-3 requires the delineation of a time point at which the patient may be septic (suspected of infection with associated organ failure). This time point could be useful in retrospective assessment of the trajectory of the patient's illness. It is worth noting that this defined onset time may occur later in the course of the illness than optimally desirable for clinical detection,[6] and alternative criteria may be necessary depending on the desired application.
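To illustrate how a time point can be derived retrospectively, the sketch below pairs antibiotic administrations with culture orders and returns the earliest qualifying event. The pairing windows (a culture within 24 hours after an antibiotic, or an antibiotic within 72 hours after a culture) are an assumption based on how suspicion of infection is commonly operationalized for Sepsis-3; they are not stated in this section and are exposed as parameters. The function name `suspicion_time` and the example timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def suspicion_time(antibiotic_times, culture_times,
                   culture_after_abx=timedelta(hours=24),
                   abx_after_culture=timedelta(hours=72)):
    """Return the earliest time of suspected infection, or None.

    A pair (antibiotic, culture) qualifies if the culture follows the
    antibiotic within `culture_after_abx`, or the antibiotic follows the
    culture within `abx_after_culture`. Both windows are assumptions.
    The suspicion time for a pair is whichever event came first.
    """
    candidates = []
    for abx in antibiotic_times:
        for cx in culture_times:
            if abx <= cx <= abx + culture_after_abx:
                candidates.append(abx)   # antibiotic first, culture soon after
            elif cx <= abx <= cx + abx_after_culture:
                candidates.append(cx)    # culture first, antibiotic soon after
    return min(candidates) if candidates else None


# Hypothetical example: culture drawn two hours before the first antibiotic.
cultures = [datetime(2020, 1, 1, 8, 0)]
antibiotics = [datetime(2020, 1, 1, 10, 0)]
print(suspicion_time(antibiotics, cultures))  # 2020-01-01 08:00:00
```

Because both inputs are clinician actions, the returned time reflects when infection was acted upon rather than when it began, which is consistent with the caveat that the defined onset may lag the true course of illness.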

Lastly, Sepsis-3 is also advantageous as it better aligns with the contemporary understanding of the pathophysiology of sepsis. Angus et al[16] have proposed a framework to assess sepsis criteria, and Seymour et al[13] provided a case study using this framework. Briefly, the Sepsis-3 criteria for sepsis demonstrate content validity (agreement with contemporary understanding of sepsis), construct validity (agreement with similar previously used definitions), and criterion validity (identification of a cohort at risk of death). We provide a more detailed assessment in the Supplemental Material (Supplemental Digital Content 1, http://links.lww.com/CCM/D208), and we refer the interested reader there.

Overall, Sepsis-3 appears to present usable and viable criteria for retrospectively identifying septic patients in EHRs for the three reasons discussed: 1) it is consistent with other criteria, 2) it is timely, and 3) it satisfies many forms of validity. However, there are some limitations to the Sepsis-3 criteria. Both the Sepsis-3 and the CDC criteria rely on treatments as surrogates for organ failure. More importantly for Sepsis-3, the retrospective definition of suspicion of infection is entirely dependent on the actions of the clinician. As a result, the test lacks "meta-reliability," that is, it is susceptible to changes unrelated to the biology of the patient.

Organ failure as captured by SOFA (and used by Sepsis-3) also has limitations. The neurologic component uses the Glasgow Coma Scale, which has known issues regarding interrater reliability and use/scoring in intubated and/or sedated patients.[17] The respiratory component requires an arterial blood gas and uses a low PaO2/FIO2 ratio as a marker of severity of illness. This measurement requires a known, specialized source of oxygen for accurate measurement of FIO2 and thus is variably accurate across different treatment regimens.[18] Finally, the cardiovascular component is primarily determined by the type and rate of vasopressor administration, not by the degree of organ failure; thus scoring is susceptible to the clinician's propensity for certain interventions. For example, the cardiovascular component of SOFA is scored as 2 if a patient is administered low-dose dopamine, though this is infrequently done in contemporary clinical practice. All of these issues are rooted in the inherent difficulty of quantifying the level of organ dysfunction when patients are intensively treated. In the absence of advances in direct quantification of organ function, carefully conceived simplifications of current criteria could improve robustness to variation in clinical practice and may improve construct validity. For example, instead of quantifying the level of organ dysfunction based on the type and dose of vasopressor (as is done in SOFA), criteria could be simplified to use of any vasopressor (such as in the CDC definition).
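The contrast between dose-graded scoring and a simplified criterion can be sketched as follows. The thresholds follow the commonly published SOFA cardiovascular table (doses in mcg/kg/min, mean arterial pressure in mm Hg) and are an assumption here, since this section cites only the low-dose dopamine rule. The binary flag is a simplified stand-in for a criterion of the CDC type, not the CDC definition itself, and its list of agents is hypothetical.

```python
def sofa_cardiovascular(map_mmhg, dopamine=0.0, dobutamine=0.0,
                        epinephrine=0.0, norepinephrine=0.0):
    """Cardiovascular SOFA component (0 to 4), graded by drug type and dose.

    Doses are in mcg/kg/min. Thresholds are taken from the commonly
    published SOFA table and should be checked against local definitions.
    """
    if dopamine > 15 or epinephrine > 0.1 or norepinephrine > 0.1:
        return 4
    if dopamine > 5 or epinephrine > 0 or norepinephrine > 0:
        return 3
    if dopamine > 0 or dobutamine > 0:
        return 2   # includes low-dose dopamine, as noted in the text
    if map_mmhg < 70:
        return 1
    return 0


def any_vasopressor(dopamine=0.0, epinephrine=0.0, norepinephrine=0.0):
    """Simplified criterion: True if any listed vasopressor is running."""
    return dopamine > 0 or epinephrine > 0 or norepinephrine > 0


# A patient on low-dose dopamine with a normal mean arterial pressure.
print(sofa_cardiovascular(80, dopamine=3))  # 2
print(any_vasopressor(dopamine=3))          # True
```

The graded score changes with the prescriber's choice of agent and rate, whereas the binary flag changes only when a vasopressor is started or stopped, which is the sense in which the simplification could be more robust to variation in practice.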

Our study has several limitations. First, our results are limited to a single tertiary medical center. Second, we excluded patients suspected of infection more than 24 hours before or after ICU admission, and our results are limited to patients admitted to the ICU with sepsis. Third, we did not address the use of the SIRS criteria for sepsis identification, as these criteria are not intended to independently identify septic patients.[19] Finally, knowledge of sepsis continues to develop, and the evaluation in this work depends on universal agreement about what sepsis is, how it is defined clinically, and precisely how the applicable terminologies are documented; it is vulnerable wherever that agreement is lacking.