Itsuki Osawa, MD
Clinical Fellow
The University of Tokyo Hospital, Japan
Disclosure information not submitted.
Honoka Ito, RN
University student
Nursing Course, College of Nursing, School of Medicine and Health Sciences, University of Tsukuba, Ibaraki, Japan
Disclosure information not submitted.
Title: Optimal sedation strategy for ventilated patients with sepsis using deep reinforcement learning
Introduction: Mechanically ventilated patients with sepsis require continuous sedation. Although guidelines recommend against excessive use of sedatives, individualizing the choice and dosage of sedatives to maintain light sedation is often challenging. Reinforcement learning (RL), a subfield of artificial intelligence, can extract and optimize implicit sequential strategies from observational cohorts. We aimed to optimize the strategy for administering sedatives to mechanically ventilated patients with sepsis using an RL algorithm.
Methods: This was a retrospective cohort study using the Medical Information Mart for Intensive Care IV (MIMIC-IV) dataset. We identified adult patients who met the Sepsis-3 criteria and were mechanically ventilated with sedatives during their ICU stays. After splitting the entire dataset into training (80%) and test (20%) datasets, we applied an offline deep RL algorithm to the training dataset to optimize the selection and dosage schedule of sedatives, including propofol, midazolam, and dexmedetomidine. The primary outcomes, which also served as penalties for the RL algorithm, were deviation from light sedation (defined as Richmond Agitation-Sedation Scale [RASS] ≥1 or RASS ≤−3) and onset of delirium (diagnosed with the Confusion Assessment Method for the Intensive Care Unit [CAM-ICU]). We compared the outcomes under the treatment administered by clinicians and under the RL-based protocol on the test dataset. The RL-based protocol was interpreted using Shapley additive explanations (SHAP).
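The penalty structure described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the penalty weights are hypothetical (the abstract does not report the values used), and only the RASS thresholds and the use of CAM-ICU come from the text.

```python
from typing import Optional

# Hypothetical penalty weights; the abstract does not report the values used.
DEVIATION_PENALTY = 1.0
DELIRIUM_PENALTY = 1.0


def is_light_sedation(rass: int) -> bool:
    """Return True when RASS is inside the light-sedation range.

    The abstract defines deviation as RASS >= 1 or RASS <= -3,
    so light sedation is the complement: RASS of -2, -1, or 0.
    """
    return -3 < rass < 1


def step_reward(rass: Optional[int],
                cam_icu_positive: bool,
                delirium_before: bool) -> float:
    """Reward for one time interval (0 is best, negative is penalized).

    rass:             RASS recorded in the interval, or None if missing.
    cam_icu_positive: whether CAM-ICU was positive in the interval.
    delirium_before:  whether delirium had already been recorded earlier,
                      so that only the onset is penalized (an assumption).
    """
    reward = 0.0
    if rass is not None and not is_light_sedation(rass):
        reward -= DEVIATION_PENALTY
    if cam_icu_positive and not delirium_before:
        reward -= DELIRIUM_PENALTY
    return reward
```

Under these assumptions, an interval with RASS −1 and no delirium receives a reward of 0, while an interval with RASS −4 is penalized once and an interval with RASS +1 and a first positive CAM-ICU is penalized twice.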
Results: We identified 11,465 ICU stays among 10,414 patients, which we divided into 88,332 4-hour time intervals. Patients were given more than one sedative in 7.4% of the intervals, only propofol in 51%, only midazolam in 24%, and only dexmedetomidine in 2.5%. The RL-based protocol used midazolam less frequently and reduced deviations from light sedation and onsets of delirium by 75% compared to the treatment administered by clinicians. Using SHAP, we found that the heart rate and Sequential Organ Failure Assessment (SOFA) score contributed more to predicting the doses of sedatives than other factors used in the RL algorithm.
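The 4-hour discretization and the sedative categories reported in the Results could be derived along the following lines. This is a sketch under stated assumptions, not the authors' pipeline: the function names, the dose dictionary layout, and the "none" category are hypothetical, and intervals are assumed to be counted from ICU admission.

```python
from datetime import datetime, timedelta

INTERVAL = timedelta(hours=4)  # interval length stated in the abstract
SEDATIVES = ("propofol", "midazolam", "dexmedetomidine")


def interval_index(icu_admit: datetime, event_time: datetime) -> int:
    """Zero-based index of the 4-hour interval containing event_time.

    Assumes intervals start at ICU admission (not stated in the abstract).
    """
    return (event_time - icu_admit) // INTERVAL


def sedative_category(doses: dict) -> str:
    """Classify one interval by which sedatives were given.

    doses maps a sedative name to the total dose in the interval;
    a missing key or a zero dose means the drug was not given.
    """
    given = [name for name in SEDATIVES if doses.get(name, 0.0) > 0]
    if not given:
        return "none"  # hypothetical category, not reported in the abstract
    if len(given) == 1:
        return f"{given[0]} only"
    return "multiple"
```

For example, an event recorded 9.5 hours after ICU admission falls in interval 2, and an interval with both propofol and midazolam doses is classified as "multiple".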
Conclusions: Our RL model could help clinicians maintain light sedation and prevent delirium by adjusting the administration of sedatives for mechanically ventilated patients with sepsis.