Abstract
Technological advancements in many fields have led to huge increases in data production, including data volume, diversity, and the speed at which new data becomes available. At the same time, there is a lack of conformity in the ways data is interpreted. This era of “big data” provides unprecedented opportunities for data-driven research and “big picture” models. However, in-depth analyses that draw on various data types and data sources to extract knowledge have become a more daunting task. This is especially the case in the life sciences, where simplification and flattening of diverse data types often lead to incorrect predictions. Effective applications of big data approaches in the life sciences require better, knowledge-based semantic models that are suitable as a framework for big data integration while avoiding oversimplifications, such as reducing various biological data types to the gene level. A huge hurdle in developing such semantic knowledge models, or ontologies, is the knowledge acquisition bottleneck. Automated methods are still very limited, and significant human expertise is required. In this chapter, we describe a methodology to systematize this knowledge acquisition and representation challenge, termed KNowledge Acquisition and Representation Methodology (KNARM). We then describe the application of the methodology to the implementation of the Drug Target Ontology (DTO). Our aim was to create an approach, involving domain experts and knowledge engineers, for building useful, comprehensive, consistent ontologies that will enable big data approaches in the domain of drug discovery without the currently common simplifications.
Acknowledgments and Funding
This work was supported by NIH grants U54CA189205 (Illuminating the Druggable Genome Knowledge Management Center, IDG-KMC), U24TR002278 (Illuminating the Druggable Genome Resource Dissemination and Outreach Center, IDG-RDOC), U54HL127624 (BD2K LINCS Data Coordination and Integration Center, DCIC), and U01LM012630-02 (BD2K, Enhancing the efficiency and effectiveness of digital curation for biomedical “big data”). The IDG-KMC and IDG-RDOC (http://druggablegenome.net/) are components of the Illuminating the Druggable Genome (IDG) project (https://commonfund.nih.gov/idg) awarded by the National Cancer Institute (NCI) and National Center for Advancing Translational Sciences (NCATS), respectively. The BD2K LINCS DCIC is awarded by the National Heart, Lung, and Blood Institute through funds provided by the trans-NIH Library of Integrated Network-Based Cellular Signatures (LINCS) Program (http://www.lincsproject.org/) and the trans-NIH Big Data to Knowledge (BD2K) initiative (https://commonfund.nih.gov/bd2k). IDG, LINCS, and BD2K are NIH Common Fund projects.
Copyright information
© 2019 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Küçük McGinty, H., Visser, U., Schürer, S. (2019). How to Develop a Drug Target Ontology: KNowledge Acquisition and Representation Methodology (KNARM). In: Larson, R., Oprea, T. (eds) Bioinformatics and Drug Discovery. Methods in Molecular Biology, vol 1939. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-9089-4_4
DOI: https://doi.org/10.1007/978-1-4939-9089-4_4
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-9088-7
Online ISBN: 978-1-4939-9089-4
eBook Packages: Springer Protocols