Outcome prediction for heart failure patients based on routinely collected claims data was not significantly improved by using machine learning and information from electronic medical records.
That’s the finding of a study published in the journal JAMA Network Open that compared machine learning with traditional models for predicting outcomes in heart failure patients.
Specifically, investigators leveraged Medicare claims data linked to EMRs from two large academic healthcare provider networks in Boston, including Brigham and Women’s Hospital, to evaluate the added value of augmenting claims-only predictive models with EMR-derived information.
Data not recorded in claims were extracted from the EMR, including laboratory test results and free-text information from patient medical records.
“Machine learning methods offered only limited improvement over traditional logistic regression in predicting key HF outcomes,” conclude the study’s authors. “Inclusion of additional predictors from EMRs to claims-based models appeared to improve prediction for some, but not all, outcomes.”
Researchers compared several machine learning approaches with traditional logistic regression for development of predictive models for all-cause mortality, HF hospitalization, high cost and loss in home time in patients with HF.
“We observed that machine learning methods, including tree-based ensemble approaches and penalized regression, offered only limited improvement over the widely used logistic regression,” according to the study’s authors. “Although augmenting claims data with detailed EMR-derived predictors resulted in notable improvement in model performance for certain outcomes, including mortality and home days loss, such improvement was not seen for prediction of high future costs.”
At the same time, investigators added that “when the predictor set was expanded to include EMR-based information, which included numerous laboratory test results as continuous variables, we noted that machine learning approaches generally fared better than logistic regression.”
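To make the idea of "limited improvement" concrete, here is a minimal sketch of how two models' predictions can be compared on the same patients. It assumes discrimination is measured with the area under the ROC curve (also called the c-statistic), which this article does not specify as the study's metric. The patient outcomes and predicted risks below are invented for illustration and are not drawn from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    Returns the fraction of (event, non-event) pairs in which the patient
    who had the event received the higher predicted risk. Ties count half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = event occurred) for eight patients.
died = [1, 0, 1, 0, 0, 1, 0, 0]

# Hypothetical predicted risks from two models scored on the same patients.
risk_logistic = [0.70, 0.40, 0.35, 0.30, 0.20, 0.80, 0.50, 0.10]
risk_boosted = [0.75, 0.30, 0.38, 0.35, 0.15, 0.85, 0.40, 0.10]

auc_logistic = auc(died, risk_logistic)
auc_boosted = auc(died, risk_boosted)

print(f"logistic regression AUC: {auc_logistic:.3f}")
print(f"tree-based ensemble AUC: {auc_boosted:.3f}")
print(f"difference:              {auc_boosted - auc_logistic:.3f}")
```

In practice, a comparison like this would be run on held-out data and reported with confidence intervals, since a small gap in AUC on a small sample may not reflect a real difference between models.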
Researchers acknowledge that they focused on administrative claims–based prediction and augmented claims data with select EMR-based variables, but did not evaluate model performance based on EMR data alone—a limitation of their study “because such a model could be useful for clinicians as they weigh various care options for patients during medical visits.”