Abstract
Creating large clinical data cohorts for machine learning and data analysis requires a number of steps, from start to successful completion. As with data set preprocessing in other fields, data quality must first be evaluated; with large heterogeneous clinical data sets, it is also important to standardize the data in order to facilitate dimensionality reduction. This is particularly important for clinical data sets in which medications are a core data component, because coded medication data are complex. Data integration at the individual subject level is essential for medication-related machine learning applications, since it is difficult to accurately identify drug exposures, therapeutic effects, and adverse drug events without high-quality integration of insurance, medication, and medical data. Successful data integration and standardization efforts can substantially improve the ability to identify and replicate personalized treatment pathways to optimize drug therapy.
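The two ideas in the abstract, standardizing coded medication data to reduce dimensionality and integrating insurance, medication, and medical data at the subject level, can be illustrated with a minimal Python sketch. It is not the chapter's method. The product codes, diagnosis codes, class labels, and the `standardize` and `integrate` helpers are all hypothetical placeholders chosen for illustration; a real pipeline would likely rely on a maintained terminology rather than a hand-written lookup.

```python
from collections import defaultdict

# Hypothetical product-code -> drug-class lookup (placeholder codes, not real NDCs).
# Many product-level codes collapse into a few class-level features.
CODE_TO_CLASS = {
    "PROD-001": "statin",
    "PROD-002": "statin",
    "PROD-003": "beta_blocker",
}


def standardize(pharmacy_claims):
    """Collapse product-level codes into class-level features per subject."""
    classes = defaultdict(set)
    for subject_id, product_code in pharmacy_claims:
        drug_class = CODE_TO_CLASS.get(product_code)
        # Unmapped codes are skipped in this sketch; in practice they should be reviewed.
        if drug_class is not None:
            classes[subject_id].add(drug_class)
    return classes


def integrate(enrollment, pharmacy_claims, medical_claims):
    """Join the three sources on subject ID, keeping only enrolled subjects."""
    drug_classes = standardize(pharmacy_claims)
    diagnoses = defaultdict(set)
    for subject_id, dx_code in medical_claims:
        diagnoses[subject_id].add(dx_code)
    return {
        sid: {
            "enrolled_months": months,
            "drug_classes": sorted(drug_classes.get(sid, ())),
            "diagnoses": sorted(diagnoses.get(sid, ())),
        }
        for sid, months in enrollment.items()
    }


# Toy inputs: insurance enrollment, pharmacy claims, and medical claims.
enrollment = {"S1": 12, "S2": 6}
pharmacy_claims = [
    ("S1", "PROD-001"),
    ("S1", "PROD-002"),
    ("S2", "PROD-003"),
    ("S3", "PROD-001"),  # no enrollment record, so excluded from the cohort
]
medical_claims = [("S1", "DX-A"), ("S2", "DX-B"), ("S2", "DX-A")]

cohort = integrate(enrollment, pharmacy_claims, medical_claims)
```

In this toy example, subject S1's two distinct product codes reduce to a single `statin` feature, and subject S3 is dropped because no enrollment record exists to confirm coverage during the pharmacy claim.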
Copyright information
© 2019 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Adam, T.J., Chi, CL. (2019). Big Data Cohort Extraction for Personalized Statin Treatment and Machine Learning. In: Larson, R., Oprea, T. (eds) Bioinformatics and Drug Discovery. Methods in Molecular Biology, vol 1939. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-9089-4_14
DOI: https://doi.org/10.1007/978-1-4939-9089-4_14
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-9088-7
Online ISBN: 978-1-4939-9089-4
eBook Packages: Springer Protocols