Building an Open Health Data Analytics Platform: a Case Study Examining Relationships and Trends in Seniority and Performance in Healthcare Providers

  • Research Article
Journal of Healthcare Informatics Research

Abstract

With the movement toward an open, data-driven approach, governments worldwide are releasing public data in an effort to increase transparency. Although these data are available, many factors make it difficult to extract knowledge from them. The relationship between seniority and performance is controversial in many fields; using an open and reproducible framework, we investigate this relationship in open healthcare data. Using data from the Centers for Medicare & Medicaid Services covering 895,000 practitioners and 3000 hospitals, we found weak but statistically significant correlations between graduation year, a proxy for seniority, and the hospital value-based performance score in 29 of 74 specialties (Spearman rank correlation values < 0.164, p value < 0.05). This result represents 7% of US healthcare practitioners and over 75% of medical practitioners in several specialties. With 5 years of data (2009–2014) from the New York Statewide Planning and Research Cooperative System (SPARCS), we found weak but statistically significant correlations between graduation year and cardiac surgery outcome measures (Spearman rank correlation value − 0.096, p value < 0.0005). We also applied an unsupervised, K-means-based clustering algorithm for finding outliers to these datasets. It captured a distinctive trend in the number of nurse practitioners, which has been increasing rapidly since 2010. It also revealed consistencies in practitioner placement across hospitals. Our findings suggest that the training of healthcare professionals appears to be robust and positions them for long-lasting and consistent careers across the majority of specialties.
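The per-specialty analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library, the record layout `(specialty, graduation_year, score)` and all function names are hypothetical, and the p value comes from a permutation test, which is swapped in here as one simple way to assess significance (the abstract does not say how the reported p values were computed).

```python
import random
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided permutation p value for the Spearman correlation.

    Assumption: a permutation test stands in for whatever significance
    test the original study used.
    """
    rng = random.Random(seed)
    rx, ry = average_ranks(x), average_ranks(y)
    observed = abs(pearson(rx, ry))
    shuffled = list(ry)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)


def correlations_by_specialty(records):
    """Map each specialty to (rho, p) for graduation year vs. score.

    `records` is an iterable of (specialty, graduation_year, score) tuples;
    this layout is hypothetical, not the schema of the CMS files.
    """
    years = defaultdict(list)
    scores = defaultdict(list)
    for specialty, grad_year, score in records:
        years[specialty].append(grad_year)
        scores[specialty].append(score)
    return {
        s: (spearman(years[s], scores[s]), permutation_p_value(years[s], scores[s]))
        for s in years
    }
```

Calling `correlations_by_specialty` on a list of such tuples returns one `(rho, p)` pair per specialty, which can then be filtered at a chosen threshold such as p < 0.05. With many specialties tested at once, a multiple-comparison correction may be worth considering; the abstract does not say whether one was applied.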

Acknowledgments

We are grateful to Victor Samarkone for helpful comments. We greatly appreciate the comments of the anonymous reviewers, which helped improve this manuscript.

Author information

Corresponding author

Correspondence to A. Ravishankar Rao.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflicts of interest.

Electronic supplementary material

ESM 1

(DOCX 315 kb)

About this article

Cite this article

Ravishankar Rao, A., Clarke, D. & Vargas, M. Building an Open Health Data Analytics Platform: a Case Study Examining Relationships and Trends in Seniority and Performance in Healthcare Providers. J Healthc Inform Res 2, 44–70 (2018). https://doi.org/10.1007/s41666-018-0014-0

  • DOI: https://doi.org/10.1007/s41666-018-0014-0
