Abstract
Ontologies and machine learning are two major technologies for domain-specific knowledge extraction, and both are actively used in knowledge-based systems of different kinds, including expert systems, decision support systems, and knowledge discovery systems. Although the two technologies share the same aim, the extraction of useful knowledge, little is known about how the two sources of knowledge can be successfully integrated. Today they are mainly used separately, even though the knowledge they extract is complementary and significant benefits could be obtained if they were integrated. This problem is especially important in biomedicine, where relevant data are often naturally complex, with high dimensionality and heterogeneous features, and where a large body of knowledge is available in the form of ontologies. In this paper we propose an approach for improving the performance of machine learning algorithms by integrating the knowledge provided by ontologies. The basic idea is to redefine the concept of similarity for complex heterogeneous data by incorporating available ontological knowledge, creating a bridge between the two technologies. Potential benefits and difficulties of this integration are discussed, two techniques for empirical evaluation and fine-tuning of feature ontologies are described, and an example from the field of paediatric cardiology is given.
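To make the basic idea concrete, the following is a minimal sketch of an ontology-aware similarity for heterogeneous records. It is not the authors' method: the toy taxonomy, the feature names (age, diagnosis), the equal weighting, and the helper names are all hypothetical, and the concept-level measure is a Wu-Palmer-style similarity chosen here only for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy (child -> parent); invented for illustration only.
PARENT: Dict[str, Optional[str]] = {
    "heart_defect": None,
    "septal_defect": "heart_defect",
    "atrial_septal_defect": "septal_defect",
    "ventricular_septal_defect": "septal_defect",
    "valve_defect": "heart_defect",
}


def path_to_root(concept: str) -> List[str]:
    """Return the concept followed by all of its ancestors up to the root."""
    path = []
    node: Optional[str] = concept
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(concept: str) -> int:
    """Depth in the taxonomy, counting the root as depth 1."""
    return len(path_to_root(concept))


def concept_similarity(a: str, b: str) -> float:
    """Wu-Palmer-style similarity: 2*depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # The first ancestor of b that is also an ancestor of a is the deepest
    # common subsumer, because path_to_root walks from the concept upwards.
    lcs = next(node for node in path_to_root(b) if node in ancestors_a)
    return 2.0 * depth(lcs) / (depth(a) + depth(b))


def record_similarity(x: dict, y: dict, age_range: float = 18.0) -> float:
    """Combine a numeric and an ontology-valued feature (assumed equal weights)."""
    numeric = 1.0 - abs(x["age"] - y["age"]) / age_range
    ontological = concept_similarity(x["diagnosis"], y["diagnosis"])
    return 0.5 * numeric + 0.5 * ontological


p1 = {"age": 4, "diagnosis": "atrial_septal_defect"}
p2 = {"age": 4, "diagnosis": "ventricular_septal_defect"}
p3 = {"age": 4, "diagnosis": "valve_defect"}
print(record_similarity(p1, p2), record_similarity(p1, p3))
```

With an exact-match comparison of diagnoses, p2 and p3 would be equally dissimilar to p1. Under the taxonomy assumed here, p2 ranks closer to p1 because both diagnoses are septal defects, which is the kind of distinction ontological knowledge can contribute to a similarity-based learner.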
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Tsymbal, A., Zillner, S., Huber, M. (2007). Ontology – Supported Machine Learning and Decision Support in Biomedicine. In: Cohen-Boulakia, S., Tannen, V. (eds) Data Integration in the Life Sciences. DILS 2007. Lecture Notes in Computer Science(), vol 4544. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73255-6_14
DOI: https://doi.org/10.1007/978-3-540-73255-6_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73254-9
Online ISBN: 978-3-540-73255-6
eBook Packages: Computer Science (R0)