Modeling Patient Visit Using Electronic Medical Records for Cost Profile Estimation

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10828)

Abstract

Estimating patients’ health care costs offers medical providers and patients promising opportunities for better management and treatment. Existing clinical approaches focus only on a patient’s demographics and historical diagnoses and ignore the ample information in clinical records. In this paper, we formulate the problem of patient cost profile estimation and use Electronic Medical Records (EMRs) to model patient visits in order to better estimate future health care costs. Traditional learning-based methods suffer from the sparseness and high dimensionality of EMR datasets. To address these challenges, we propose the Patient Visit Probabilistic Generative Model (PVPGM) to describe a patient’s historical visits in EMRs. With PVPGM, we can not only learn a latent patient condition in a low-dimensional space from sparse and missing data but also hierarchically organize the high-dimensional EMR features. The model then estimates the patient’s health care cost by combining the effects learned from the latent patient condition and from the generative process of medical procedures. We evaluate the proposed model on a large real-world EMR dataset with 836,033 medical visits from over 50,000 patients. Experimental results demonstrate the effectiveness of our model.
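To make the problem setting concrete, the following is a minimal sketch of the general idea of mapping sparse, high-dimensional visit features to a low-dimensional latent representation and then estimating cost from it. It is not PVPGM: the abstract does not specify the model, so a truncated SVD followed by ordinary least squares is swapped in purely for illustration. The function names, the dimensions, and the synthetic data are all assumptions and do not come from the paper.

```python
import numpy as np


def fit_latent_cost_model(X, cost, k):
    """Fit a k-dimensional latent projection and a linear cost regressor.

    Illustrative stand-in only (truncated SVD + least squares), not PVPGM.
    """
    # Truncated SVD: the top-k right singular vectors form the latent basis.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    basis = vt[:k].T                       # shape (n_features, k)
    Z = X @ basis                          # latent representation per patient
    # Ordinary least squares with an intercept column appended.
    design = np.hstack([Z, np.ones((Z.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, cost, rcond=None)
    return basis, weights


def predict_cost(X, basis, weights):
    """Project features onto the latent basis and apply the linear model."""
    Z = X @ basis
    design = np.hstack([Z, np.ones((Z.shape[0], 1))])
    return design @ weights


# Synthetic data (hypothetical): 200 patients, 500 binary features, ~2% non-zero.
rng = np.random.default_rng(0)
n_patients, n_features, k = 200, 500, 5
X = (rng.random((n_patients, n_features)) < 0.02).astype(float)
cost = rng.gamma(shape=2.0, scale=100.0, size=n_patients)  # skewed, positive costs

basis, weights = fit_latent_cost_model(X, cost, k)
pred = predict_cost(X, basis, weights)
```

A linear projection such as this cannot handle missing entries or organize features hierarchically, which are the capabilities the abstract attributes to the generative model.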

Acknowledgment

This work was supported by NSFC (91646202), the National High-tech R&D Program of China (SS2015AA020102), NSSFC (15CTQ028), Research/Project 2017YB142 supported by Ministry of Education of The People’s Republic of China, the 1000-Talent program and Tsinghua Fudaoyuan Research Fund.

Author information

Corresponding author

Correspondence to Kangzhi Zhao.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

Cite this paper

Zhao, K. et al. (2018). Modeling Patient Visit Using Electronic Medical Records for Cost Profile Estimation. In: Pei, J., Manolopoulos, Y., Sadiq, S., Li, J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science, vol 10828. Springer, Cham. https://doi.org/10.1007/978-3-319-91458-9_2

  • DOI: https://doi.org/10.1007/978-3-319-91458-9_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-91457-2

  • Online ISBN: 978-3-319-91458-9

  • eBook Packages: Computer Science (R0)
