
Sense-Based Biomedical Indexing and Retrieval

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6177)

Abstract

This paper tackles the problem of term ambiguity, particularly in biomedical literature. We propose and evaluate two methods of Word Sense Disambiguation (WSD) for biomedical terms and integrate them into a sense-based document indexing and retrieval framework. Ambiguous biomedical terms in documents and queries are disambiguated using the Medical Subject Headings (MeSH) thesaurus and semantically indexed with their correct sense. The experimental evaluation, carried out on the TREC9-FT 2000 collection, shows that our approach to WSD and sense-based indexing and retrieval outperforms the baseline.
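
The general idea of sense-based indexing can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not describe the two WSD algorithms, so the sketch substitutes a simple context-overlap heuristic in the spirit of Lesk. The sense inventory `SENSES`, its sense identifiers, and the two sample documents are invented for illustration and are not real MeSH entries.

```python
from collections import defaultdict

# Hypothetical toy sense inventory: term -> {sense_id: typical context words}.
# These identifiers and word sets are assumptions, not actual MeSH data.
SENSES = {
    "cold": {
        "cold#disease": {"virus", "infection", "cough", "fever"},
        "cold#temperature": {"temperature", "exposure", "freezing", "hypothermia"},
    },
}

# Two invented sample documents.
DOCS = {
    "d1": "Patients with a cold had cough and fever from a virus infection.",
    "d2": "Cold exposure lowered body temperature and caused hypothermia.",
}


def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation."""
    tokens = (w.strip(".,;:()").lower() for w in text.split())
    return [t for t in tokens if t]


def disambiguate(term, context_tokens):
    """Pick the sense whose context words overlap most with the surrounding text.

    Ties are broken by sense identifier so the result is deterministic.
    """
    senses = SENSES[term]
    context = set(context_tokens)
    return max(sorted(senses), key=lambda s: len(senses[s] & context))


def to_index_units(text):
    """Replace each ambiguous term by its chosen sense; keep other tokens as-is."""
    tokens = tokenize(text)
    return [disambiguate(t, tokens) if t in SENSES else t for t in tokens]


def build_index(docs):
    """Build an inverted index from index units (senses or plain terms) to doc IDs."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for unit in to_index_units(text):
            index[unit].add(doc_id)
    return index


def search(index, query):
    """Return the documents that contain every index unit of the query."""
    units = to_index_units(query)
    if not units:
        return set()
    result = set(index.get(units[0], set()))
    for unit in units[1:]:
        result &= index.get(unit, set())
    return result


if __name__ == "__main__":
    index = build_index(DOCS)
    print(search(index, "cold virus"))
    print(search(index, "cold hypothermia"))
```

Because documents and queries are mapped to the same sense identifiers, a query about the disease sense of an ambiguous term no longer matches documents that use the term in its temperature sense, which is the effect a sense-based index aims for.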

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dinh, D., Tamine, L. (2010). Sense-Based Biomedical Indexing and Retrieval. In: Hopfe, C.J., Rezgui, Y., Métais, E., Preece, A., Li, H. (eds) Natural Language Processing and Information Systems. NLDB 2010. Lecture Notes in Computer Science, vol 6177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13881-2_3

  • DOI: https://doi.org/10.1007/978-3-642-13881-2_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13880-5

  • Online ISBN: 978-3-642-13881-2

  • eBook Packages: Computer Science, Computer Science (R0)
