Ontology-Supported Machine Learning and Decision Support in Biomedicine

  • Alexey Tsymbal
  • Sonja Zillner
  • Martin Huber
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4544)


Ontologies and machine learning are two major technologies for domain-specific knowledge extraction. Both are actively used in knowledge-based systems of different kinds, including expert systems, decision support systems, and knowledge discovery systems. While the two technologies share the same aim, the extraction of useful knowledge, little is known about how the two sources of knowledge can be successfully integrated. Today they are mainly used separately, even though the knowledge they extract is complementary and significant benefits could be obtained if they were integrated. This problem is especially important for biomedicine, where relevant data are often naturally complex, with high dimensionality and heterogeneous features, and where a large body of knowledge is available in the form of ontologies. In this paper we propose one approach for improving the performance of machine learning algorithms by integrating the knowledge provided by ontologies. The basic idea is to redefine the concept of similarity for complex heterogeneous data by incorporating available ontological knowledge, creating a bridge between the two technologies. Potential benefits and difficulties of this integration are discussed, two techniques for empirical evaluation and fine-tuning of feature ontologies are described, and an example from the field of paediatric cardiology is given.
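To make the idea of an ontology-aware similarity concrete, the following is a minimal sketch and not the authors' method. It assumes a tiny, hypothetical is-a hierarchy of diagnoses, uses the Wu-Palmer measure as one common choice of concept similarity, and combines it with a range-normalised difference for numeric features. All names here (`PARENT`, `concept_similarity`, `record_similarity`, the feature names and the age range) are illustrative assumptions.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy is-a hierarchy (child -> parent); not taken from a real ontology.
PARENT: Dict[str, Optional[str]] = {
    "heart defect": None,
    "septal defect": "heart defect",
    "atrial septal defect": "septal defect",
    "ventricular septal defect": "septal defect",
    "valve defect": "heart defect",
}


def path_to_root(concept: str) -> List[str]:
    """Return the chain of concepts from `concept` up to the root, inclusive."""
    path = [concept]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def concept_similarity(a: str, b: str) -> float:
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b)), root depth = 1."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    # The first concept on a's path that is also an ancestor of b is the
    # lowest common subsumer in a tree-shaped hierarchy.
    lcs = next(c for c in path_a if c in ancestors_b)
    return 2 * len(path_to_root(lcs)) / (len(path_a) + len(path_b))


def record_similarity(
    x: Dict[str, object],
    y: Dict[str, object],
    numeric_ranges: Dict[str, Tuple[float, float]],
) -> float:
    """Average per-feature similarity over heterogeneous records.

    Numeric features use 1 - |difference| / range; all other features are
    treated as ontology concepts and compared with concept_similarity.
    """
    scores = []
    for feature in x:
        if feature in numeric_ranges:
            low, high = numeric_ranges[feature]
            scores.append(1.0 - abs(x[feature] - y[feature]) / (high - low))
        else:
            scores.append(concept_similarity(x[feature], y[feature]))
    return sum(scores) / len(scores)


# Two hypothetical patient records with one ontological and one numeric feature.
patient_a = {"diagnosis": "atrial septal defect", "age_years": 4.0}
patient_b = {"diagnosis": "ventricular septal defect", "age_years": 6.0}
similarity = record_similarity(patient_a, patient_b, {"age_years": (0.0, 18.0)})
```

With plain exact matching, the two diagnoses would count as completely different. In this sketch they share the parent concept "septal defect", so they contribute a partial similarity of 2/3, which is the kind of effect a distance-based learner such as nearest-neighbour classification could exploit.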


Atrial Septal Defect; Semantic Similarity; Machine Learning Algorithm; Unify Medical Language System; Microarray Gene Expression Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Alexey Tsymbal (1)
  • Sonja Zillner (2)
  • Martin Huber (1)
  1. Corporate Technology Div., Siemens AG, Erlangen, Germany
  2. Corporate Technology Div., Siemens AG, Munich, Germany
