Towards more Accessible Precision Medicine: Building a more Transferable Machine Learning Model to Support Prognostic Decisions for Micro- and Macrovascular Complications of Type 2 Diabetes Mellitus
Although machine learning (ML) models are increasingly being developed for clinical decision support for patients with type 2 diabetes, the adoption of these models into clinical practice remains limited. Currently, ML models are constructed on local healthcare systems and validated internally, with no expectation that they would validate externally; thus, they are rarely transferable to a different healthcare system. In this work, we aim to (1) demonstrate that even a complex ML model built on a national cohort can be transferred to two local healthcare systems; (2) show that a model constructed on a local healthcare system’s cohort is difficult to transfer; (3) examine the impact of training cohort size on transferability; and (4) discuss criteria for external validity. We built a model using our previously published Multi-Task Learning-based methodology on a national cohort extracted from the OptumLabs® Data Warehouse and transferred the model to two local healthcare systems (i.e., University of Minnesota Medical Center and Mayo Clinic) for external evaluation. The model remained valid when applied to the local patient populations and performed as well as locally constructed models (concordance: .73–.92), demonstrating transferability. The performance of the locally constructed models decreased substantially when applied to each other’s healthcare system (concordance: .62–.90). We believe that our modeling approach, in which a model is learned from a national cohort and is externally validated, produces a transferable model, allowing patients at smaller healthcare systems to benefit from precision medicine.
Keywords
Machine learning, Large national data, External validation, Transferable model, Complications of type 2 diabetes, Precision medicine
This work was supported by NIH award R01 LM011972 and NSF award IIS 1602198. The views expressed in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.
Compliance with ethical standards
Conflict of interest
The access to the claims and EHR data from the OLDW was made possible through use of an OptumLabs research credit. Author Era Kim owns stock in UnitedHealth Group.
This article does not contain any studies with human participants or animals performed by any of the authors.