Abstract
Estimating patients’ health care costs gives medical providers and patients opportunities for better management and treatment. Existing clinical approaches focus only on a patient’s demographics and historical diagnoses and ignore the ample information in clinical records. In this paper, we formulate the problem of patient cost profile estimation and use Electronic Medical Records (EMRs) to model patient visits in order to better estimate future health care costs. Traditional learning-based methods suffer from the sparseness and high dimensionality of EMR data. To address these challenges, we propose the Patient Visit Probabilistic Generative Model (PVPGM), which describes a patient’s historical visits in the EMR. With PVPGM, we can not only learn a latent patient condition in a low-dimensional space from sparse and missing data but also organize the high-dimensional EMR features hierarchically. The model then estimates the patient’s health care cost by combining the effects learned from the latent patient condition and from the generative process of medical procedures. We evaluate the proposed model on a large real-world EMR dataset with 836,033 medical visits from over 50,000 patients. Experimental results demonstrate the effectiveness of our model.
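The idea of combining a latent patient condition with a generative process over medical procedures can be illustrated with a toy example. The sketch below is not PVPGM itself: it assumes a single discrete latent condition, a categorical distribution over procedures for each condition, and a fixed price per procedure. All names and numbers (`PRIOR`, `PROC_GIVEN_COND`, `PRICE`, `BASE_COST`, the procedure codes) are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy parameters (not taken from the paper): two latent
# conditions, three procedure codes, and a made-up price per procedure.
PRIOR = {"mild": 0.7, "severe": 0.3}                      # p(z)
PROC_GIVEN_COND = {                                       # p(procedure | z)
    "mild":   {"blood_test": 0.6, "xray": 0.3, "mri": 0.1},
    "severe": {"blood_test": 0.2, "xray": 0.3, "mri": 0.5},
}
PRICE = {"blood_test": 20.0, "xray": 80.0, "mri": 400.0}
BASE_COST = {"mild": 50.0, "severe": 300.0}               # condition-level effect


def posterior_condition(observed_procedures):
    """Return p(z | observed procedures) using Bayes' rule in log space."""
    log_joint = {}
    for z, prior in PRIOR.items():
        log_prob = math.log(prior)
        for proc in observed_procedures:
            log_prob += math.log(PROC_GIVEN_COND[z][proc])
        log_joint[z] = log_prob
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_joint.values())
    unnorm = {z: math.exp(v - top) for z, v in log_joint.items()}
    total = sum(unnorm.values())
    return {z: v / total for z, v in unnorm.items()}


def expected_visit_cost(z, n_procedures):
    """Expected cost of one visit under condition z with n_procedures procedures."""
    per_procedure = sum(PROC_GIVEN_COND[z][p] * PRICE[p] for p in PRICE)
    return BASE_COST[z] + n_procedures * per_procedure


def estimate_next_cost(history, n_procedures=2):
    """Average the per-condition expected cost over the posterior on z.

    `history` is a list of visits, each visit a list of procedure codes.
    """
    observed = [proc for visit in history for proc in visit]
    posterior = posterior_condition(observed)
    return sum(posterior[z] * expected_visit_cost(z, n_procedures)
               for z in posterior)


if __name__ == "__main__":
    print(estimate_next_cost([["blood_test"]]))
    print(estimate_next_cost([["mri", "mri"]]))
```

In this toy setting, a history dominated by expensive procedures shifts the posterior toward the costlier condition, which raises the estimate through both the condition-level effect and the expected procedure mix. A learned model would fit these distributions from EMR data rather than fixing them by hand.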
Acknowledgment
This work was supported by NSFC (91646202), the National High-tech R&D Program of China (SS2015AA020102), NSSFC (15CTQ028), Research/Project 2017YB142 supported by the Ministry of Education of the People’s Republic of China, the 1000-Talent program, and the Tsinghua Fudaoyuan Research Fund.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Zhao, K. et al. (2018). Modeling Patient Visit Using Electronic Medical Records for Cost Profile Estimation. In: Pei, J., Manolopoulos, Y., Sadiq, S., Li, J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science(), vol 10828. Springer, Cham. https://doi.org/10.1007/978-3-319-91458-9_2
DOI: https://doi.org/10.1007/978-3-319-91458-9_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91457-2
Online ISBN: 978-3-319-91458-9
eBook Packages: Computer Science (R0)