Making Study Populations Visible Through Knowledge Graphs
Treatment recommendations in Clinical Practice Guidelines (CPGs) are largely based on findings from clinical trials and case studies, referred to here as research studies, which often draw on highly selective clinical populations, referred to here as study cohorts. When medical practitioners apply CPG recommendations, they need to understand how well their patient population matches the characteristics of the study cohort, and they are thus confronted with the challenges of locating the study cohort information and making an analytic comparison. To address these challenges, we develop an ontology-enabled prototype system that exposes the population descriptions in research studies in a declarative manner, with the ultimate goal of allowing medical practitioners to better understand the applicability and generalizability of treatment recommendations. We build a Study Cohort Ontology (SCO) to encode the vocabulary of study population descriptions, which are often reported in the first table of the published work and are therefore commonly referred to as Table 1. We leverage the well-used Semanticscience Integrated Ontology (SIO) for defining property associations between classes. Further, we model the key components of Table 1s, i.e., collections of study subjects, subject characteristics, and statistical measures, in RDF knowledge graphs. We design scenarios for medical practitioners to perform population analysis, and we generate cohort similarity visualizations to determine the applicability of a study population to the clinical population of interest. By making study populations visible through standardized representations of Table 1s, our semantic approach allows users to quickly derive clinically relevant inferences about study populations.
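To make the Table 1 modeling idea concrete, the following is a minimal sketch of how a study arm, one subject characteristic, and its statistical measures might be held as subject-predicate-object triples and then queried for a patient comparison. It is illustrative only: every identifier under the `ex:` prefix, the predicate names, and the numbers are hypothetical placeholders, not actual SCO or SIO terms or data from any study. The "within k standard deviations" check is likewise an assumed stand-in for a population-fit test, not the similarity method used in this work, and plain Python tuples stand in for an RDF store.

```python
from typing import List, NamedTuple, Optional, Union

Value = Union[str, float]


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: Value


# Hypothetical Table 1 fragment: one study arm with an age characteristic
# reported as a mean and a standard deviation. All names and numbers here
# are made-up placeholders, not real SCO/SIO terms or study data.
GRAPH: List[Triple] = [
    Triple("ex:StudyArmA", "rdf:type", "ex:StudyArm"),
    Triple("ex:StudyArmA", "ex:hasAttribute", "ex:ArmA_Age"),
    Triple("ex:ArmA_Age", "rdf:type", "ex:Age"),
    Triple("ex:ArmA_Age", "ex:hasUnit", "ex:Year"),
    Triple("ex:ArmA_Age", "ex:hasAttribute", "ex:ArmA_Age_Mean"),
    Triple("ex:ArmA_Age_Mean", "rdf:type", "ex:Mean"),
    Triple("ex:ArmA_Age_Mean", "ex:hasValue", 60.0),
    Triple("ex:ArmA_Age", "ex:hasAttribute", "ex:ArmA_Age_SD"),
    Triple("ex:ArmA_Age_SD", "rdf:type", "ex:StandardDeviation"),
    Triple("ex:ArmA_Age_SD", "ex:hasValue", 10.0),
]


def objects(graph: List[Triple], subject: str, predicate: str) -> List[Value]:
    """Return every object of triples matching the given subject and predicate."""
    return [t.obj for t in graph if t.subject == subject and t.predicate == predicate]


def measure(graph: List[Triple], arm: str, characteristic_type: str,
            measure_type: str) -> Optional[float]:
    """Look up one statistical measure of one characteristic for a study arm."""
    for characteristic in objects(graph, arm, "ex:hasAttribute"):
        if characteristic_type not in objects(graph, characteristic, "rdf:type"):
            continue
        for stat in objects(graph, characteristic, "ex:hasAttribute"):
            if measure_type in objects(graph, stat, "rdf:type"):
                values = objects(graph, stat, "ex:hasValue")
                if values:
                    return float(values[0])
    return None


def within_k_sd(graph: List[Triple], arm: str, characteristic_type: str,
                patient_value: float, k: float = 1.0) -> Optional[bool]:
    """Assumed fit test: is the patient value within k SDs of the arm mean?

    Returns None when the arm does not report both a mean and an SD.
    """
    mean = measure(graph, arm, characteristic_type, "ex:Mean")
    sd = measure(graph, arm, characteristic_type, "ex:StandardDeviation")
    if mean is None or sd is None:
        return None
    return abs(patient_value - mean) <= k * sd
```

Under these assumptions, a call such as `within_k_sd(GRAPH, "ex:StudyArmA", "ex:Age", 65.0)` asks whether a 65-year-old patient falls inside the reported age spread of the hypothetical arm. In an actual RDF setting the same lookup would typically be written as a query over the knowledge graph rather than as list comprehensions.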
Resource Website: https://tetherless-world.github.io/study-cohort-ontology/.
Keywords: Scientific Study Data Analysis, Knowledge graphs, Modeling Aggregations and Summary Statistics, Ontology Development
This work is partially supported by IBM Research AI through the AI Horizons Network. We thank our colleagues from IBM Research, Dan Gruen, Morgan Foreman and Ching-Hua Chen, and from RPI, John Erickson, Alexander New, and Rebecca Cowan, who greatly assisted the research.