Identifying Disease-Centric Subdomains in Very Large Medical Ontologies: A Case-Study on Breast Cancer Concepts in SNOMED CT. Or: Finding 2500 Out of 300.000

  • Krystyna Milian
  • Zharko Aleksovski
  • Richard Vdovjak
  • Annette ten Teije
  • Frank van Harmelen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5943)


Modern medical vocabularies can contain up to hundreds of thousands of concepts. In any particular use-case only a small fraction of these will be needed. In this paper we first define two notions of a disease-centric subdomain of a large ontology. We then explore two methods for identifying disease-centric subdomains of such large medical vocabularies. The first method is based on lexically querying the ontology with an iteratively extended set of seed queries. The second method is based on manual mapping between concepts from a medical guideline document and ontology concepts. Both methods include concept-expansion over subsumption and equality relations. We use both methods to determine a breast-cancer-centric subdomain of the SNOMED CT ontology. Our experiments show that the two methods produce a considerable overlap, but they also yield a large degree of complementarity, with interesting differences between the sets of concepts that they return. Analysis of the results reveals strengths and weaknesses of the different methods.
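The first method described above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy concept table, the identifiers, and the function names (`lexical_query`, `expand`, `subdomain`) are hypothetical, lexical matching is reduced to case-insensitive substring search, and expansion is assumed to follow subsumption downward (to more specific concepts) and equality links in both directions.

```python
from collections import deque

# Toy ontology (hypothetical data for illustration, not SNOMED CT content).
LABELS = {
    "C1": "breast cancer",
    "C2": "carcinoma of breast",
    "C3": "ductal carcinoma in situ",
    "C4": "mastectomy",
    "C5": "fracture of femur",
    "C6": "malignant tumor of breast",
}
# Subsumption: child concept -> set of parent concepts.
PARENTS = {"C3": {"C2"}, "C2": {"C6"}}
# Equality links between concepts (stored symmetrically).
EQUIVALENT = {"C1": {"C6"}, "C6": {"C1"}}


def lexical_query(seeds):
    """Return ids of concepts whose label contains any seed term."""
    terms = [s.lower() for s in seeds]
    return {
        cid
        for cid, label in LABELS.items()
        if any(term in label.lower() for term in terms)
    }


def children_of(cid):
    """Return the direct subconcepts of a concept."""
    return {child for child, parents in PARENTS.items() if cid in parents}


def expand(concepts):
    """Close a concept set under subconcepts and equality links."""
    seen = set(concepts)
    queue = deque(concepts)
    while queue:
        cid = queue.popleft()
        for nxt in children_of(cid) | EQUIVALENT.get(cid, set()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def subdomain(seed_rounds):
    """Re-run query and expansion as the seed set grows round by round."""
    seeds, result = set(), set()
    for new_seeds in seed_rounds:
        seeds |= set(new_seeds)
        result = expand(lexical_query(seeds))
    return result


if __name__ == "__main__":
    print(sorted(subdomain([["breast"]])))
    print(sorted(subdomain([["breast"], ["mastectomy"]])))
```

In this toy setting, the concept labelled "ductal carcinoma in situ" does not match the seed "breast" lexically and is reached only through expansion over subsumption, which is the role the expansion step plays in both methods.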


Keywords: identifying ontology subdomain, disease related concepts, ontology subsetting, mapping medical terminologies, seed queries, medical guidelines





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Krystyna Milian (1)
  • Zharko Aleksovski (2)
  • Richard Vdovjak (2)
  • Annette ten Teije (1)
  • Frank van Harmelen (1)
  1. Vrije Universiteit Amsterdam, Netherlands
  2. Philips Research, Netherlands
