Instance-Based Matching of Large Life Science Ontologies

  • Toralf Kirsten
  • Andreas Thor
  • Erhard Rahm
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4544)


Ontologies are heavily used in the life sciences, so there is increasing value in matching different ontologies to determine related conceptual categories. We propose a simple yet powerful methodology for instance-based ontology matching that utilizes the associations between molecular-biological objects and ontologies. The approach can build on the many existing ontology associations for instance objects such as sequences and proteins, and thus makes heavy use of available domain knowledge. Furthermore, the approach is flexible and extensible, since each instance source with associations to the ontologies of interest can contribute to the ontology mapping. We study several approaches to determine the instance-based similarity of ontology categories. In an extensive experimental evaluation, we use protein associations for different species to match between subontologies of the Gene Ontology and OMIM. We also provide a comparison with metadata-based ontology matching.
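To make the idea of instance-based similarity concrete, the following is a minimal sketch of how two ontology categories could be compared through the instances associated with both. The abstract does not name the similarity measures studied, so the Dice coefficient used here is an assumption chosen for illustration. The function names, the threshold, and the toy category and protein identifiers are all hypothetical.

```python
from itertools import product


def dice_similarity(instances_a, instances_b):
    """Dice coefficient of two instance sets: 2*|A & B| / (|A| + |B|).

    Assumption: Dice is used only as an illustrative measure; the
    abstract does not specify which measures are evaluated.
    """
    a, b = set(instances_a), set(instances_b)
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def match_ontologies(assoc_1, assoc_2, threshold=0.5):
    """Compare every category pair across two ontologies.

    assoc_1 / assoc_2 map a category id to the set of instance ids
    (e.g. proteins) associated with it. Returns the pairs whose
    similarity is at or above the (hypothetical) threshold, sorted
    by descending similarity.
    """
    matches = []
    for (c1, inst1), (c2, inst2) in product(assoc_1.items(), assoc_2.items()):
        sim = dice_similarity(inst1, inst2)
        if sim >= threshold:
            matches.append((c1, c2, sim))
    return sorted(matches, key=lambda m: -m[2])


# Hypothetical toy associations; identifiers are placeholders, not real data.
go_assoc = {"GO:A": {"P1", "P2", "P3"}, "GO:B": {"P4"}}
omim_assoc = {"OMIM:X": {"P2", "P3"}, "OMIM:Y": {"P4", "P5", "P6"}}

result = match_ontologies(go_assoc, omim_assoc)
for cat_1, cat_2, sim in result:
    print(cat_1, cat_2, round(sim, 2))
```

In this sketch any instance source that associates its objects with both ontologies can supply the two dictionaries, which mirrors the extensibility claim above. Comparing all category pairs is quadratic in the number of categories, so a practical version for large ontologies would likely start from shared instances instead.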


Ontology matching, instance-based matching, match evaluation





Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Toralf Kirsten (1)
  • Andreas Thor (2)
  • Erhard Rahm (1, 2)

  1. Interdisciplinary Center for Bioinformatics, University of Leipzig, Germany
  2. Dept. of Computer Sciences, University of Leipzig, Germany
