Hugues Yver, MD
The Children's Hospital of Philadelphia
Philadelphia, Pennsylvania
Disclosure information not submitted.
Charlotte Woods-Hill, MD, MSHP
The Children's Hospital of Philadelphia
Philadelphia, Pennsylvania
Disclosure information not submitted.
Michael O. Harhay, PhD, MPH
Assistant Professor
University of Pennsylvania, United States
Disclosure information not submitted.
Nadir Yehya, MD, MSCE
Children's Hospital of Philadelphia
Cherry Hill, NJ
Disclosure information not submitted.
Title: Statistical Power in Pediatric Critical Care Trials
Introduction: Randomized controlled trials (RCTs) in critically ill children are commonly negative. However, the degree to which trial design contributes to negative trials is unexplored. We aimed to appraise elements of statistical power and design characteristics of published pediatric critical care RCTs.
Methods: We analyzed pediatric critical care RCTs with a clinically relevant primary outcome that were published in 18 high-impact general, critical care, or pediatric critical care journals between 2000 and 2020. We compared hypothesized versus actual baseline event rates and hypothesized versus actual effect sizes to determine whether RCTs were systematically underpowered.
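As an illustration of why a mismatch between hypothesized and actual baseline event rates matters, the following is a minimal sketch of an approximate power calculation for a binary outcome. It is not the analysis used in this study: the function name `power_two_proportions`, the normal-approximation z-test, and all numbers (a 30% planned versus 20% observed control event rate, a one-third relative reduction, 300 patients per arm) are hypothetical assumptions chosen only for demonstration.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, p_treatment, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Uses the normal approximation and ignores the negligible probability
    of rejecting in the wrong direction. Hypothetical helper, for
    illustration only.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Standard error under the null hypothesis (pooled proportion).
    p_bar = (p_control + p_treatment) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_arm)
    # Standard error under the alternative hypothesis (unpooled).
    se_alt = sqrt(
        p_control * (1 - p_control) / n_per_arm
        + p_treatment * (1 - p_treatment) / n_per_arm
    )
    delta = abs(p_control - p_treatment)
    return z.cdf((delta - z_crit * se_null) / se_alt)


# Hypothetical design: control event rate 30%, one-third relative reduction.
planned_power = power_two_proportions(0.30, 0.20, n_per_arm=300)

# Same relative reduction, but the control group does better than projected.
actual_power = power_two_proportions(0.20, 0.20 * 2 / 3, n_per_arm=300)

print(f"Power at the planned baseline rate:  {planned_power:.2f}")
print(f"Power at the observed baseline rate: {actual_power:.2f}")
```

Under these assumed inputs, holding the relative effect and the sample size fixed, a lower-than-projected control event rate shrinks the absolute difference between arms and therefore the power.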
Results: We identified 65 RCTs, of which 27 (42%) demonstrated a statistically significant result (as defined by the trial) supporting the intervention, with one trial demonstrating a significant result against the intervention. Forty RCTs (62%) had a binary primary outcome, including mortality in 5 RCTs, while 25 RCTs (38%) had a continuous outcome. Power calculations were reported in 54 RCTs (83%); of all 65 RCTs, 51 (78%) projected a sample size, 48 (74%) projected an effect size, and 39 (60%) provided a baseline event rate. Reporting of the elements of power improved between 2000 and 2020, with > 85% of trials after 2015 reporting all of the above elements. The hypothesized baseline event rates were misaligned with actual control group event rates in a way that reduced power in 91% of the negative RCTs and 56% of the positive RCTs. The actual effect sizes were approximately two-fold smaller than hypothesized, with the hypothesized effect size an overestimate in 100% of the negative RCTs and 12.5% of the positive RCTs.
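To illustrate what a two-fold overestimate of effect size implies for a continuous outcome, the sketch below applies the standard normal-approximation sample size formula for comparing two means, n per arm = 2(z_{1-alpha/2} + z_{power})^2 / d^2, where d is the standardized effect size. The function name `n_per_arm_continuous` and the effect sizes of 0.5 and 0.25 are hypothetical and are not taken from the trials reviewed here.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm_continuous(effect_size, power=0.80, alpha=0.05):
    """Patients per arm needed to detect a standardized mean difference.

    Normal-approximation formula for a two-sided, two-sample comparison
    with equal allocation. Hypothetical helper, for illustration only.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Hypothetical design assumption versus an effect half as large.
n_hypothesized = n_per_arm_continuous(0.50)
n_halved = n_per_arm_continuous(0.25)

print(f"Per-arm sample size for d = 0.50: {n_hypothesized}")
print(f"Per-arm sample size for d = 0.25: {n_halved}")
print(f"Ratio: {n_halved / n_hypothesized:.1f}")
```

Because the required sample size scales with the inverse square of the effect size, a trial sized for an effect twice as large as the true effect would, under these assumptions, enroll roughly one quarter of the patients it needs.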
Conclusions: Pediatric critical care RCTs were frequently negative and apparently powered to identify unrealistic effect sizes. Baseline event rates were frequently better than projected, which also decreased power. These results suggest that pediatric critical care RCTs require methodological reassessment from design to execution. To more accurately predict baseline event rates, we suggest restricting eligibility criteria, using more precise epidemiologic data, and assessing a broader range of free-day models. To more accurately predict hypothesized effect size, we suggest greater use of pilot trials and observational studies, and pre-planned Bayesian analyses.