Abstract
Large medical ontologies can be of great help in building a specialized clinical information system. The first step in using them is to identify the subset of concepts that are relevant to the specialty. In this paper we present a method to automatically identify the breast cancer concepts in the SNOMED-CT ontology, using a large text corpus as the source of knowledge. In addition to being identified, the concepts are also assigned relevance values.
In our experiments, the method produced results of high overall quality. Precision was high and recall was relatively low, but the concepts that were not found are complex and arguably ambiguous, which limits their applicability in practice. This research was application-driven, and the breast cancer concepts found have been applied in a real oncology information system.
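The abstract does not say how relevance values are computed. As a rough illustration only, the sketch below assumes a tf-idf-style score: a concept is relevant when its label occurs in many documents of a domain corpus and in few documents of a general background corpus. The concept identifiers, labels, and documents are hypothetical placeholders, and the scoring formula is an assumption rather than the authors' method.

```python
import math
import re
from collections import Counter


def count_label_hits(labels, documents):
    """Count, per concept, how many documents mention its label as a whole phrase."""
    lowered = [doc.lower() for doc in documents]
    hits = Counter()
    for concept_id, label in labels.items():
        pattern = re.compile(r"\b" + re.escape(label.lower()) + r"\b")
        hits[concept_id] = sum(1 for doc in lowered if pattern.search(doc))
    return hits


def relevance_scores(labels, domain_docs, background_docs):
    """Assumed tf-idf-style relevance: domain document rate times a smoothed
    inverse background document frequency. Concepts never seen in the domain
    corpus are left out of the result."""
    domain_hits = count_label_hits(labels, domain_docs)
    background_hits = count_label_hits(labels, background_docs)
    scores = {}
    for concept_id in labels:
        if domain_hits[concept_id] == 0:
            continue
        domain_rate = domain_hits[concept_id] / len(domain_docs)
        inverse_background = math.log(
            (len(background_docs) + 1) / (background_hits[concept_id] + 1)
        )
        scores[concept_id] = domain_rate * inverse_background
    return scores


# Hypothetical concept labels and toy corpora, for illustration only.
labels = {"C1": "mammography", "C2": "patient", "C3": "femur fracture"}
domain_docs = [
    "Screening mammography detected a mass.",
    "The patient had a mammography last year.",
]
background_docs = [
    "The patient was admitted.",
    "The patient has a femur fracture.",
    "Patient discharged.",
]

scores = relevance_scores(labels, domain_docs, background_docs)
for concept_id, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(concept_id, labels[concept_id], round(score, 3))
```

In this toy setup a domain-specific label such as "mammography" outranks a generic one such as "patient", and a label absent from the domain corpus is not reported at all. A real pipeline would also need stemming and synonym handling, which this sketch omits.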
Copyright information
© 2011 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Aleksovski, Z., Sevenster, M. (2011). Identifying Breast Cancer Concepts in SNOMED-CT Using Large Text Corpus. In: Szomszor, M., Kostkova, P. (eds) Electronic Healthcare. eHealth 2010. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 69. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23635-8_4
DOI: https://doi.org/10.1007/978-3-642-23635-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23634-1
Online ISBN: 978-3-642-23635-8
eBook Packages: Computer Science (R0)