Katharine Henry, PhD
Postdoctoral Fellow
Johns Hopkins University, United States
Disclosure information not submitted.
Hossein Soleimani, PhD
Data Scientist
University of California, San Francisco, United States
Disclosure information not submitted.
Nishi Rawat, MD, MBA
Assistant Professor
Johns Hopkins University, United States
Disclosure information not submitted.
Mustapha Saheed, MD
Assistant Professor
Johns Hopkins University School of Medicine, United States
Disclosure information not submitted.
Edward Chen, MD
Assistant Professor
Johns Hopkins University School of Medicine, United States
Disclosure information not submitted.
Albert Wu, MD
Professor
Johns Hopkins University, United States
Disclosure information not submitted.
Suchi Saria, PhD
Associate Professor
Johns Hopkins University, United States
Disclosure information not submitted.
Title: Assessing Clinical Use and Performance of a Machine Learning Sepsis Alert for Sex and Racial Bias
INTRODUCTION/HYPOTHESIS: Recent studies have demonstrated the importance of evaluating clinical decision support tools for sex and racial bias. We compared the predictive performance of, and the clinical response to, a deployed sepsis early warning system, the Targeted Real-time Early Warning System (TREWS), across sex and race groups.
Methods: We used electronic health records (EHRs) from adult patients presenting to the emergency department of, or admitted to, one of five hospitals (two academic, three community) in the Maryland/DC area between the deployment of TREWS at each site (deployments began in April 2018) and the end of September 2021. When TREWS suspects that a patient has sepsis, it raises an alert in the EHR system based on its pre-specified configuration. Within racial and sex groups, we estimated the sensitivity, the positive predictive value (PPV), and the percentage of alerts for which a clinician confirmed suspected sepsis in the EHR (confirmation rate). We used the patient’s sex and race as documented in the EHR and coded race as Asian (A), Black (B), white (W), or other. We identified sepsis cases using Electronic Sepsis Phenotyping, a refinement of the Sepsis-3 criteria.
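The three per-group metrics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's analysis code: the `Encounter` record and its field names are hypothetical, one record is assumed per encounter, and the confirmation rate is assumed to be computed over alerts that a clinician actually evaluated.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Encounter:
    """Hypothetical per-encounter record; field names are illustrative."""
    group: str                  # documented sex or race category
    alerted: bool               # an alert was raised during the encounter
    sepsis: bool                # encounter met the sepsis case definition
    confirmed: Optional[bool]   # clinician response; None if not evaluated


def rate(numerator: int, denominator: int) -> Optional[float]:
    """Return a proportion, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def group_metrics(encounters):
    """Compute sensitivity, PPV, and confirmation rate within each group."""
    counts = defaultdict(lambda: defaultdict(int))
    for e in encounters:
        c = counts[e.group]
        if e.sepsis:
            c["sepsis"] += 1
            if e.alerted:
                c["sepsis_alerted"] += 1
        if e.alerted:
            c["alerts"] += 1
            if e.sepsis:
                c["alerts_sepsis"] += 1
            if e.confirmed is not None:  # assumed: only evaluated alerts count
                c["evaluated"] += 1
                if e.confirmed:
                    c["confirmed"] += 1
    return {
        g: {
            "sensitivity": rate(c["sepsis_alerted"], c["sepsis"]),
            "ppv": rate(c["alerts_sepsis"], c["alerts"]),
            "confirmation_rate": rate(c["confirmed"], c["evaluated"]),
        }
        for g, c in counts.items()
    }
```

Calling `group_metrics` on a list of such records returns one dictionary of the three rates per group; a rate is `None` when its denominator is empty (for example, a group with no alerts).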
Results: Of 590,736 patient encounters monitored by TREWS during the study, 44,547 (8%) triggered an alert and 13,680 (2%) had sepsis. Overall, the sensitivity was 82.0%, the PPV was 26.7%, and among the alerts that were evaluated (89.3%), the confirmation rate was 36.0%. Male and female patients had comparable sensitivity (M:81.8% vs F:82.3%; p=0.455) and confirmation rates (M:35.9% vs F:36.2%; p=0.613); PPV was slightly higher among male patients (M:27.3% vs F:26.0%; p=0.002). PPVs were similar between racial groups (W:26.9% vs B:26.2%; p=0.127 and W:26.9% vs A:27.9%; p=0.383). Compared with white patients, Black patients had higher sensitivity (W:80.9% vs B:84.7%; p<0.001) but a lower confirmation rate (W:37.4% vs B:32.7%; p<0.001), whereas Asian patients had comparable sensitivity (W:80.9% vs A:83.2%; p=0.176) and a higher confirmation rate (W:37.4% vs A:42.1%; p=0.001).
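The abstract does not state which statistical test produced these p-values. As one plausible illustration only, the sketch below compares two group proportions with a pooled two-proportion z-test; the function name is hypothetical, and any counts passed to it would have to come from the underlying data, which are not reported here.

```python
import math


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int):
    """Two-sided pooled z-test for the difference between two proportions.

    x1/n1 and x2/n2 are the event counts and totals in each group.
    This is an assumed choice of test, not necessarily the one used
    in the study.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

With large groups, even a small absolute difference in rates can yield a small p-value, which is why a statistically significant difference need not be a clinically meaningful one.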
Conclusions: Our analysis did not reveal substantive sex bias, but it did identify racial differences in performance and clinical response: despite similar PPV, confirmation rates ranged from 33% among Black patients to 42% among Asian patients. Additional work is needed to assess the cause and clinical significance of these differences.