Explainability analysis in predictive models based on machine learning techniques on the risk of hospital readmissions

Bedoya, Juan Camilo Lopera; Castro, Jose Lisandro Aguilar

doi:10.1007/s12553-023-00794-8

Original Paper
Published: 28 November 2023

Health and Technology, Volume 14, pages 93–108 (2024)

Juan Camilo Lopera Bedoya¹ &
Jose Lisandro Aguilar Castro¹,²,³ (ORCID: orcid.org/0000-0003-4194-6882)

Abstract

Purpose

Analyzing the risk of re-hospitalization of patients with chronic diseases allows healthcare institutions to deliver accurate preventive care that reduces hospital admissions, and to plan medical spaces and resources. Thus, the research question is: Is it possible to use artificial intelligence to study the risk of re-hospitalization of patients?

Methods

This article presents several models to predict when a patient may be hospitalized again after discharge. In addition, an explainability analysis is carried out on the predictive models to determine the degree of importance of the predictors/descriptors. In particular, this article compares different explainability techniques in the study context.

Results

The best model is a classifier based on decision trees, with an F1-Score of 83%, followed by LGBM with an F1-Score of 67%. For these models, Shapley values were calculated as the explainability method. The quality of the explanations was assessed with the stability metric. According to this metric, the explanations of the decision-tree model show more variability: only 4 attributes are very stable (21%) and 1 attribute is unstable. For the LGBM-based model, there are 12 stable attributes (63%) and no unstable attributes. Thus, in terms of explainability, the LGBM-based model is better.
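
As a rough illustration of the two ideas named above (Shapley values and a stability check on the resulting explanations), the following is a minimal sketch and not the authors' implementation. The risk function `toy_risk`, the three feature names, the patient record, and the baseline records are all hypothetical. The stability measure used here (coefficient of variation of each attribution across baselines) is an assumption, since the abstract does not define the metric.

```python
from itertools import combinations
from math import factorial
from statistics import mean, pstdev


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x.

    Features outside a coalition take their value from `baseline`.
    Enumerates every coalition, so it is only practical for a few features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(z):
    """Hypothetical readmission risk score (illustration only)."""
    age, prior_admissions, length_of_stay = z
    return (0.02 * age + 0.3 * prior_admissions + 0.05 * length_of_stay
            + 0.01 * prior_admissions * length_of_stay)


def stability(attributions):
    """Coefficient of variation per feature across runs (lower = more stable).

    This definition is an assumption made for the sketch.
    """
    scores = []
    for values in zip(*attributions):
        m = mean(values)
        scores.append(pstdev(values) / abs(m) if m != 0 else float("inf"))
    return scores


# Hypothetical patient and reference (baseline) records
patient = [70, 3, 10]
baselines = [[50, 0, 2], [55, 1, 3], [45, 0, 4]]

runs = [shapley_values(toy_risk, patient, b) for b in baselines]
scores = stability(runs)

for name, score in zip(["age", "prior_admissions", "length_of_stay"], scores):
    print(f"{name}: coefficient of variation = {score:.3f}")
```

For each baseline, the attributions sum to the difference between the score of the patient and the score of that baseline, which is the efficiency property of Shapley values. A feature whose attribution changes little across baselines gets a low coefficient of variation and would be read as stable under this assumed definition.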

Conclusions

According to the explanations generated by the best predictive models, the LGBM-based predictive model presents more stable variables. Thus, it generates greater confidence in the explanations it provides.

Data availability

The datasets used in the current study are available from the corresponding author on reasonable request.

Funding

Jose Aguilar was partially supported by grant 22-STIC-06 (HAMADI 4.0 project) funded by the STIC-AmSud regional program.

Author information

Authors and Affiliations

GIDITIC, Universidad EAFIT, Medellín, Colombia
Juan Camilo Lopera Bedoya & Jose Lisandro Aguilar Castro
CEMISID, Universidad de Los Andes, Mérida, Venezuela
Jose Lisandro Aguilar Castro
IMDEA Networks Institute, Madrid, Spain
Jose Lisandro Aguilar Castro

Contributions

Concept and design: All authors; Acquisition, analysis, or interpretation of data: Lopera; Drafting of the manuscript: All authors; Results analysis: All authors; Obtained funding: Aguilar.

Corresponding author

Correspondence to Jose Lisandro Aguilar Castro.

Ethics declarations

Ethics statement

The study was conducted in accordance with relevant guidelines and regulations, and approved by the EAFIT University ethics committee.

Consent to participate

The Sura health center signed an anonymized data use agreement with EAFIT University.

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bedoya, J.C.L., Castro, J.L.A. Explainability analysis in predictive models based on machine learning techniques on the risk of hospital readmissions. Health Technol. 14, 93–108 (2024). https://doi.org/10.1007/s12553-023-00794-8

Received: 05 July 2023
Accepted: 30 October 2023
Published: 28 November 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s12553-023-00794-8
