S117: Acute Setting Predictive Analytics Sharon E. Davis, MS


1 Calibration Drift Among Regression and Machine Learning Models for Hospital Mortality
S117: Acute Setting Predictive Analytics
Sharon E. Davis, MS, PhD Candidate
Department of Biomedical Informatics, Vanderbilt University

2 Disclosure
My spouse and I have no relevant relationships with commercial interests to disclose.
AMIA | amia.org

3 Motivation
Evolving role of clinical prediction models
  EHR-based models have access to more data and can support more complexity in real time
  Move from classification to individual-level predicted probabilities
  Increasing focus on calibration
Discrimination
  Ability to separate cases and non-cases
  Supports risk stratification
Calibration
  Alignment between predicted and observed probabilities
  More difficult to assess; metrics are still evolving
Calibration hierarchy and associated metrics [1]
  Strong: agreement within each covariate pattern
  Moderate: agreement of prediction and outcome rate among similar patients
  Weak: no systematic over/underprediction or over/underfitting
  Mean: agreement on average
[1] Van Calster et al. "A calibration hierarchy for risk models was defined: from utopia to empirical data." JCE 2016.
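The distinction between discrimination and calibration can be illustrated with a small sketch. It is not taken from the presentation: the toy outcomes, the predicted probabilities, and the function names `auc` and `mean_calibration_gap` are all hypothetical. The point is only that a rank-preserving change to the predictions leaves discrimination untouched while calibration (here, the crudest "mean" level) degrades.

```python
def auc(y_true, y_prob):
    """Share of case/non-case pairs in which the case has the higher
    predicted probability (ties count as half)."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_cases = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for pc in cases:
        for pn in non_cases:
            if pc > pn:
                wins += 1.0
            elif pc == pn:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


def mean_calibration_gap(y_true, y_prob):
    """Mean predicted probability minus observed event rate (ideal: 0)."""
    return (sum(y_prob) - sum(y_true)) / len(y_true)


# Hypothetical toy data: 1 = event occurred, 0 = no event.
outcomes = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]
predicted = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.70, 0.40, 0.80, 0.90]

# A strictly increasing transformation keeps the ranking of patients
# but inflates every predicted probability.
inflated = [p ** 0.5 for p in predicted]

auc_original = auc(outcomes, predicted)
auc_inflated = auc(outcomes, inflated)
gap_original = mean_calibration_gap(outcomes, predicted)
gap_inflated = mean_calibration_gap(outcomes, inflated)
```

Comparing `auc_original` with `auc_inflated`, and `gap_original` with `gap_inflated`, shows why a model can keep its risk-stratification ability while its individual-level probabilities become unreliable.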

4 Motivation
Drifting performance over time, particularly in terms of calibration
Driving forces
  Population shifts in outcome rate and case mix
  Clinical practice and documentation changes (predictor-outcome associations)
Limited understanding
  Focus on logistic models, despite proliferation of more complex modeling methods and some evidence that methods impact drift
  Focus on crude measures of calibration, despite evidence that metrics impact understanding
  Limited exploration of drivers of calibration drift
Davis SE, Lasko TA, Chen G, Siew ED, Matheny ME. "Calibration Drift in Regression and Machine Learning Models for Acute Kidney Injury." JAMIA. 24(6):

5 Objectives
Compare performance over time of prediction models developed using common frequentist statistical methods and machine learning techniques
  Expanded set of modeling methods
  Calibration measured at multiple levels of stringency
Link shifts in patient populations with model performance to identify drivers of calibration drift across models
  Event rate shift
  Case mix shift
  Predictor-outcome association shift

6 Study Population
National cohort of admissions to VA facilities
Total sample size: 1,893,284 admissions
One-year development period, multi-year validation period
Outcome: 30-day mortality after hospital admission
Predictors based on prior modeling
  Demographics (3)
  Laboratory results (20)
  Diagnoses (32)
  Vitals (7)
  Care utilization (3)
  Admission type (1)

7 Methods – Model Development
Training set: admissions occurring in 2006 (n = 235,548)
Modeling methods
  Logistic
  L1 regularized logistic (lasso)
  L2 regularized logistic (ridge)
  L1-L2 regularized logistic (elastic net)
  Random forest
  Neural network
Parallel models developed with each method
  Hyperparameters selected with 5-fold cross-validation
  Internal validation with 200 bootstrap iterations
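Hyperparameter selection by 5-fold cross-validation can be sketched as follows. This is only an illustration of the procedure, not the study's code: to stay short and dependency-free it uses a one-predictor, no-intercept ridge regression with a closed-form solution as a stand-in for the penalized logistic models, and the data, candidate penalties, and function names are all hypothetical.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_ridge_1d(x, y, penalty):
    """Closed-form slope of a no-intercept ridge regression (stand-in model)."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + penalty)


def cv_error(x, y, penalty, k=5, seed=0):
    """Mean squared error on held-out folds for one candidate penalty."""
    total = 0.0
    for fold in k_fold_indices(len(x), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(x)) if i not in held_out]
        slope = fit_ridge_1d([x[i] for i in train], [y[i] for i in train], penalty)
        total += sum((y[i] - slope * x[i]) ** 2 for i in fold)
    return total / len(x)


def select_penalty(x, y, candidates, k=5, seed=0):
    """Pick the candidate penalty with the lowest cross-validated error."""
    return min(candidates, key=lambda penalty: cv_error(x, y, penalty, k, seed))


# Hypothetical simulated data: y depends linearly on x plus noise.
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [2.0 * a + rng.gauss(0.0, 1.0) for a in x]

candidates = [0.0, 0.1, 1.0, 10.0, 100.0]
best_penalty = select_penalty(x, y, candidates)
```

The same loop structure applies to any of the listed methods: only the fitting function and the held-out error measure change.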

8 Methods – Model Validation
Validation period: (n = 1,657,736)

9 Methods – Model Validation
Discrimination: AUC
Calibration (hierarchy levels: strong, moderate, weak, mean)
  Calibration plot
  Metrics (moderate): flexible calibration curves, estimated calibration index (ECI) [1]
  Metrics (weak): Cox recalibration intercept and slope
  Metric (mean): observed to expected outcome ratio (O:E)
[1] Van Hoorde et al. "A spline-based tool to assess and visualize the calibration of multiclass risk predictions." JBI
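As a rough sketch of the weak-calibration metrics, the Cox recalibration intercept and slope can be obtained by regressing the observed outcome on the logit of the predicted probability with a logistic model; an intercept of 0 and a slope of 1 are ideal. The code below is an assumption-laden illustration, not the study's implementation: it estimates both parameters jointly with a hand-written Newton-Raphson loop, and the grouped toy data and function names are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def cox_recalibration(y_true, y_prob, iterations=25):
    """Fit outcome ~ intercept + slope * logit(predicted) by Newton-Raphson.
    Returns (intercept, slope); (0, 1) indicates no systematic miscalibration."""
    x = [logit(p) for p in y_prob]
    a, b = 0.0, 1.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y_true):
            mu = expit(a + b * xi)
            w = mu * (1.0 - mu)
            g0 += yi - mu            # gradient w.r.t. intercept
            g1 += (yi - mu) * xi     # gradient w.r.t. slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * step = gradient.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Hypothetical toy data: three risk groups of 10 patients whose event
# rates (20%, 50%, 80%) match the first set of predictions exactly.
outcomes = [1] * 2 + [0] * 8 + [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2
well_calibrated = [0.2] * 10 + [0.5] * 10 + [0.8] * 10
# Overfit-style predictions: same ranking, but too extreme on the logit scale.
too_extreme = [expit(2.0 * logit(p)) for p in well_calibrated]

intercept_ok, slope_ok = cox_recalibration(outcomes, well_calibrated)
intercept_ext, slope_ext = cox_recalibration(outcomes, too_extreme)
```

A slope below 1, as for `too_extreme`, is the usual signature of predictions that are too extreme (overfitting); a nonzero intercept points to systematic over- or underprediction.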

10 Methods – Extending Flexible Calibration Curves
Determining regions of calibration
Tracking regions over time
Rescaling regions by volume of data
  Data concentrated at low predicted probabilities
  Rescaled regions of calibration by proportion of observations in each region
  Assessed magnitude of miscalibration using within-region ECIs
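The rescaling idea can be sketched under one loudly stated assumption: that a "region of calibration" is the set of predicted probabilities where the confidence band of the flexible calibration curve contains the ideal line, with overprediction where the band lies below it and underprediction where it lies above. The slide does not spell out this rule, and the band values, labels, and function names below are hypothetical.

```python
from collections import Counter


def label_region(predicted, band_lower, band_upper):
    """Assumed rule: compare the curve's confidence band with the ideal
    line (observed probability equal to predicted probability)."""
    if band_lower <= predicted <= band_upper:
        return "calibrated"
    if band_upper < predicted:
        return "overprediction"
    return "underprediction"


def region_shares(predicted, band_lower, band_upper):
    """Proportion of observations falling in each region."""
    labels = [
        label_region(p, lo, hi)
        for p, lo, hi in zip(predicted, band_lower, band_upper)
    ]
    counts = Counter(labels)
    names = ("calibrated", "overprediction", "underprediction")
    return {name: counts.get(name, 0) / len(labels) for name in names}


# Hypothetical observations, concentrated at low predicted probabilities,
# with a made-up confidence band for the estimated observed probability.
predicted = [0.01, 0.02, 0.03, 0.05, 0.08, 0.10, 0.20, 0.40, 0.60, 0.80]
band_lower = [0.005, 0.01, 0.02, 0.03, 0.05, 0.06, 0.10, 0.20, 0.30, 0.40]
band_upper = [0.02, 0.03, 0.04, 0.06, 0.09, 0.09, 0.15, 0.30, 0.45, 0.60]

shares = region_shares(predicted, band_lower, band_upper)
```

In this toy example the calibrated region covers only a narrow slice of the probability axis yet holds half of the observations, which is the motivation for rescaling by data volume rather than by axis width.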

11 Methods – Population Data Shifts
Differences that arise over time between the population on which the model was developed and the population on which the model is applied
Event rate shift
  Definition: changes in the prevalence of the outcome
  Assessment: distribution of mortality rate over time
Case mix shift
  Definition: changes in the distribution of risk factors in the population
  Assessment: distributions of predictors over time; discrimination and model structure of membership models [1]
Predictor-outcome association shifts
  Definition: changes in form or magnitude of relationships between risk factors and the outcome
  Assessment: changes in the structure of models refit in each 3-month period
[1] Debray et al. "A new framework to enhance the interpretation of external validation studies of clinical prediction models." JCE
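A membership model, as used for assessing case mix shift, predicts whether an observation comes from the development or the validation population using the predictors alone; discrimination near 0.5 suggests similar case mix, and higher values suggest the populations have diverged. The sketch below is a minimal illustration with hypothetical pieces throughout: two simulated standardized predictors, a plain gradient-ascent logistic fit, and an apparent (in-sample, therefore somewhat optimistic) AUC.

```python
import math
import random


def fit_logistic(X, y, learning_rate=0.5, epochs=300):
    """Gradient-ascent logistic regression; returns [intercept, *weights]."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            residual = yi - 1.0 / (1.0 + math.exp(-z))
            grad[0] += residual
            for j, xj in enumerate(xi):
                grad[j + 1] += residual * xj
        w = [wj + learning_rate * g / n for wj, g in zip(w, grad)]
    return w


def auc(labels, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def membership_auc(development, validation):
    """Label development rows 0 and validation rows 1, fit, and score."""
    X = development + validation
    y = [0] * len(development) + [1] * len(validation)
    w = fit_logistic(X, y)
    scores = [w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)) for xi in X]
    return auc(y, scores)


rng = random.Random(0)


def sample(n, first_predictor_mean):
    """Hypothetical cohort with two standardized predictors."""
    return [[rng.gauss(first_predictor_mean, 1.0), rng.gauss(0.0, 1.0)] for _ in range(n)]


development = sample(300, 0.0)
same_case_mix = sample(300, 0.0)      # drawn from the same distribution
shifted_case_mix = sample(300, 1.0)   # first predictor shifted upward

auc_same = membership_auc(development, same_case_mix)
auc_shifted = membership_auc(development, shifted_case_mix)
```

Repeating this for each validation window against the development cohort would give a membership-model discrimination series over time; inspecting the fitted weights indicates which predictors drive the shift.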

12 Discrimination Over Time

13 Calibration Over Time – Observed to Expected Outcome Ratio
Ideal value: 1
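Tracking the observed to expected outcome ratio over time amounts to dividing the number of observed events by the sum of predicted probabilities within each validation window. A minimal sketch, with hypothetical period labels, toy data, and function name:

```python
from collections import defaultdict


def observed_to_expected_by_period(periods, y_true, y_prob):
    """O:E per period: observed events divided by summed predicted
    probabilities. Values below 1 indicate overprediction on average."""
    observed = defaultdict(float)
    expected = defaultdict(float)
    for period, y, p in zip(periods, y_true, y_prob):
        observed[period] += y
        expected[period] += p
    return {period: observed[period] / expected[period] for period in sorted(observed)}


# Hypothetical data: same outcomes in both periods, but the model's
# predicted probabilities are twice as high in period 2.
periods = [1, 1, 1, 1, 2, 2, 2, 2]
outcomes = [0, 0, 0, 1, 0, 0, 0, 1]
predicted = [0.1, 0.2, 0.3, 0.4, 0.2, 0.4, 0.6, 0.8]

oe_by_period = observed_to_expected_by_period(periods, outcomes, predicted)
```

Because O:E only compares totals, it can sit at its ideal value even when over- and underprediction cancel across risk levels, which is why it is the least stringent level of the hierarchy.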

14 Calibration Over Time – Estimated Calibration Index
Ideal value: 0
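The estimated calibration index summarizes a flexible calibration curve as the average squared difference between each predicted probability and the curve's estimate of the observed probability. The sketch below makes two assumptions that should be read as such: it replaces the flexible (spline or loess) curve with simple equal-size bins, and it multiplies by 100, a scaling that may differ from the one used in the study. The toy data and the name `binned_eci` are hypothetical.

```python
import math


def binned_eci(y_true, y_prob, n_bins=10, scale=100.0):
    """Average squared gap between predicted probabilities and a binned
    estimate of observed probabilities (stand-in for a flexible curve)."""
    n = len(y_prob)
    order = sorted(range(n), key=lambda i: y_prob[i])
    size = math.ceil(n / n_bins)
    estimated = [0.0] * n
    for start in range(0, n, size):
        members = order[start:start + size]
        rate = sum(y_true[i] for i in members) / len(members)
        for i in members:
            estimated[i] = rate  # observed event rate among similar predictions
    return scale * sum((p - o) ** 2 for p, o in zip(y_prob, estimated)) / n


# Hypothetical toy data: three risk groups of 10 patients with event
# rates of 20%, 50%, and 80%.
outcomes = [1] * 2 + [0] * 8 + [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2
well_calibrated = [0.2] * 10 + [0.5] * 10 + [0.8] * 10
pulled_to_middle = [0.3] * 10 + [0.5] * 10 + [0.7] * 10

eci_ok = binned_eci(outcomes, well_calibrated, n_bins=3)
eci_off = binned_eci(outcomes, pulled_to_middle, n_bins=3)
```

Note that `pulled_to_middle` has the same mean prediction as the observed event rate, so its O:E is 1 while its ECI is above 0: the more stringent metric detects miscalibration that the mean-level metric misses.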

15 Calibration Over Time – Rescaled Regions of Calibration

16 Linking Population Shifts and Performance
Event rate shift dominated by seasonal variation in mortality rate
  Mortality rate declined from 5.0% to 4.8% over study period
[Table: susceptibility (high, moderate, or low) of each model to outcome rate shift, case mix shift, and association shift. Rows: logistic regression, L1 logistic, L2 logistic, L1-L2 logistic, random forest, neural network. Cell values were color-coded markers not preserved in the transcript.]

17 Linking Population Shifts and Performance
Outcome rate shift dominated by seasonal variation in mortality rate
  Mortality rate declined from 5.0% to 4.8% over study period
Limited evidence of predictor-outcome association shifts
[Table: susceptibility (high, moderate, or low) of each model to outcome rate shift, case mix shift, and association shift. Rows: logistic regression, L1 logistic, L2 logistic, L1-L2 logistic, random forest, neural network. Cell values were color-coded markers not preserved in the transcript.]

18 Linking Population Shifts and Performance
Outcome rate shift dominated by seasonal variation in mortality rate
  Mortality rate declined from 5.0% to 4.8% over study period
Limited evidence of predictor-outcome association shifts
Case mix shift occurred throughout the validation period
[Figure: membership model discrimination]
[Table: susceptibility (high, moderate, or low) of each model to outcome rate shift, case mix shift, and association shift. Rows: logistic regression, L1 logistic, L2 logistic, L1-L2 logistic, random forest, neural network. Cell values were color-coded markers not preserved in the transcript.]

19 Integration With Additional Research
Corresponding analysis in separate cohort experiencing different population data shifts
Acute kidney injury models experienced
  Event rate shift
  Case mix shift
  Predictor-outcome association shift
[Figure: age inclusion in L1 penalized logistic regression model for AKI]
[Figure: estimated calibration index for AKI models]
Davis SE, Lasko TA, Chen G, Siew ED, Matheny ME. "Calibration Drift in Regression and Machine Learning Models for Acute Kidney Injury." JAMIA. 24(6):

20 Integration With Additional Research
Corresponding analysis in separate cohort experiencing different population data shifts
Acute kidney injury models experienced
  Event rate shift
  Case mix shift
  Predictor-outcome association shift
[Table: susceptibility (high, moderate, or low) of each AKI model to outcome rate shift, case mix shift, and association shift. Rows: logistic regression, L1 logistic, L2 logistic, L1-L2 logistic, random forest, neural network. Cell values were color-coded markers not preserved in the transcript.]
Davis SE, Lasko TA, Chen G, Siew ED, Matheny ME. "Calibration Drift in Regression and Machine Learning Models for Acute Kidney Injury." JAMIA. 24(6):

21 Limitations
Statistical versus clinically meaningful miscalibration
Additional modeling methods remain to be considered
Limited population data shift scenarios explored
  Variable magnitudes of population data shifts
  Variable combinations of event rate, case mix, and association shifts

22 Conclusions
Calibration drift varies by modeling method and form of underlying population shifts
Selection of calibration metrics impacts understanding of drift across methods
Model updating strategies will need to be tailored to the unique vulnerabilities of modeling methods

23 Funding Agencies and Research Team
Funding agencies
  VA HSR&D IIR
  VA HSR&D IIR
  NLM 5T15LM007450
  NLM 1R21LM
Research team
  Michael Matheny, MD, MS, MPH
  Guanhua Chen, PhD
  Thomas Lasko, MD, PhD
Department of Biomedical Informatics

24 Thank you!
Email: sharon.e.davis@vanderbilt.edu
Department of Biomedical Informatics

