Big Data Cohort Extraction for Personalized Statin Treatment and Machine Learning

  • Protocol
Bioinformatics and Drug Discovery

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1939))

Abstract

Creating big clinical data cohorts for machine learning and data analysis requires a number of steps from start to successful completion. As with data set preprocessing in other fields, data quality must be evaluated first; with large heterogeneous clinical data sets, however, it is also important to standardize the data in order to facilitate dimensionality reduction. This is particularly important for clinical data sets that include medications as a core data component, because coded medication data are complex. Data integration at the individual subject level is essential for medication-related machine learning applications, since it can be difficult to accurately identify drug exposures, therapeutic effects, and adverse drug events without high-quality integration of insurance, medication, and medical data. Successful data integration and standardization efforts can substantially improve the ability to identify and replicate personalized treatment pathways to optimize drug therapy.
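As a rough illustration of the two ideas in the abstract, the following minimal Python sketch standardizes product-level medication codes to an ingredient level (reducing the number of distinct medication features) and then integrates insurance enrollment, pharmacy, and medical records for each subject. It is not the chapter's method: the field names, the `PRODUCT_TO_INGREDIENT` lookup, the `PROD-…` and `DX-…` codes, and the sample rows are all hypothetical and chosen only for illustration.

```python
from datetime import date

# Hypothetical product-to-ingredient lookup. The codes are invented; a real
# project would derive this mapping from a maintained drug vocabulary.
PRODUCT_TO_INGREDIENT = {
    "PROD-001": "atorvastatin",
    "PROD-002": "atorvastatin",
    "PROD-003": "simvastatin",
}


def standardize(fills):
    """Attach an ingredient to each fill; drop fills with unmapped codes."""
    out = []
    for fill in fills:
        ingredient = PRODUCT_TO_INGREDIENT.get(fill["product_code"])
        if ingredient is not None:
            out.append({**fill, "ingredient": ingredient})
    return out


def integrate(enrollment, fills, diagnoses):
    """Build one record per enrolled subject from the three sources.

    Pharmacy and medical rows are kept only when the subject is enrolled
    and the event date falls inside the enrollment window.
    """
    subjects = {
        e["subject_id"]: {
            "start": e["start"],
            "end": e["end"],
            "ingredients": set(),
            "diagnoses": set(),
        }
        for e in enrollment
    }
    for fill in standardize(fills):
        s = subjects.get(fill["subject_id"])
        if s is not None and s["start"] <= fill["fill_date"] <= s["end"]:
            s["ingredients"].add(fill["ingredient"])
    for dx in diagnoses:
        s = subjects.get(dx["subject_id"])
        if s is not None and s["start"] <= dx["service_date"] <= s["end"]:
            s["diagnoses"].add(dx["code"])
    return subjects


# Invented sample rows for a single enrolled subject "A".
enrollment = [
    {"subject_id": "A", "start": date(2016, 1, 1), "end": date(2016, 12, 31)},
]
fills = [
    {"subject_id": "A", "product_code": "PROD-001", "fill_date": date(2016, 2, 1)},
    {"subject_id": "A", "product_code": "PROD-002", "fill_date": date(2016, 3, 1)},
    {"subject_id": "A", "product_code": "PROD-003", "fill_date": date(2017, 1, 15)},
    {"subject_id": "A", "product_code": "PROD-999", "fill_date": date(2016, 4, 1)},
    {"subject_id": "B", "product_code": "PROD-003", "fill_date": date(2016, 5, 1)},
]
diagnoses = [
    {"subject_id": "A", "code": "DX-1", "service_date": date(2016, 2, 10)},
]

cohort = integrate(enrollment, fills, diagnoses)
```

In this toy data, two distinct product codes for subject A collapse to one ingredient, while the fill outside the enrollment window, the unmapped code, and the fill for the unenrolled subject B are excluded. Whether to drop or flag such records is a design choice that would depend on the study.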

Author information

Corresponding author

Correspondence to Terrence J. Adam.

Copyright information

© 2019 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Adam, T.J., Chi, CL. (2019). Big Data Cohort Extraction for Personalized Statin Treatment and Machine Learning. In: Larson, R., Oprea, T. (eds) Bioinformatics and Drug Discovery. Methods in Molecular Biology, vol 1939. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-9089-4_14

  • DOI: https://doi.org/10.1007/978-1-4939-9089-4_14

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-9088-7

  • Online ISBN: 978-1-4939-9089-4

  • eBook Packages: Springer Protocols
