Word Sense Disambiguation and Information Retrieval

  • Conference paper
SIGIR ’94

Abstract

It has often been thought that word sense ambiguity is a cause of poor performance in Information Retrieval (IR) systems. The belief is that if ambiguous words can be correctly disambiguated, IR performance will increase. However, recent research into the application of a word sense disambiguator to an IR system failed to show any performance increase. From these results it has become clear that more basic research is needed to investigate the relationship between sense ambiguity, disambiguation, and IR.

Using a technique that introduces additional sense ambiguity into a collection, this paper presents research that goes beyond previous work in this field to reveal the influence that ambiguity and disambiguation have on a probabilistic IR system. We conclude that word sense ambiguity is only problematic to an IR system when it is retrieving from very short queries. In addition, we argue that if a word sense disambiguator is to be of any use to an IR system, the disambiguator must be able to resolve word senses to a high degree of accuracy.
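The abstract does not spell out how the additional ambiguity is introduced. One common way to do it, assumed here purely for illustration, is to merge groups of unrelated words into a single artificial term (a "pseudo-word"), so that every occurrence of any member becomes ambiguous. A disambiguator of a chosen accuracy can then be simulated by restoring the original word with that probability and a wrong group member otherwise. The function names, the grouping scheme, and the `/` separator below are hypothetical choices for this sketch, not details taken from the paper.

```python
import random


def make_pseudo_word_map(vocabulary, group_size, rng):
    """Randomly group words; map each word to the tuple of words in its group.

    Assumption for this sketch: groups are formed by shuffling the vocabulary
    and cutting it into consecutive chunks of `group_size`.
    """
    words = sorted(vocabulary)
    rng.shuffle(words)
    groups = {}
    for start in range(0, len(words), group_size):
        group = tuple(words[start:start + group_size])
        for word in group:
            groups[word] = group
    return groups


def add_ambiguity(tokens, groups):
    """Replace each known token by its pseudo-word (group members joined by '/')."""
    return ["/".join(groups[t]) if t in groups else t for t in tokens]


def simulate_disambiguator(tokens, groups, accuracy, rng):
    """Mimic a disambiguator that picks the correct word with probability `accuracy`.

    When it errs, it returns a different member of the same pseudo-word.
    Tokens outside the vocabulary, or in a group of one, are left unchanged.
    """
    resolved = []
    for token in tokens:
        wrong = [w for w in groups.get(token, ()) if w != token]
        if wrong and rng.random() >= accuracy:
            resolved.append(rng.choice(wrong))
        else:
            resolved.append(token)
    return resolved


if __name__ == "__main__":
    rng = random.Random(0)
    vocab = {"bank", "river", "spring", "tank"}
    groups = make_pseudo_word_map(vocab, group_size=2, rng=rng)
    document = ["the", "bank", "by", "the", "river"]
    print(add_ambiguity(document, groups))
    print(simulate_disambiguator(document, groups, accuracy=0.75, rng=rng))
```

Indexing and querying the collection three ways (original tokens, pseudo-words, and simulated disambiguator output at several accuracy levels) would give the kind of comparison the abstract describes, though the retrieval model itself is outside this sketch.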

Copyright information

© 1994 Springer-Verlag London Limited

About this paper

Cite this paper

Sanderson, M. (1994). Word Sense Disambiguation and Information Retrieval. In: Croft, B.W., van Rijsbergen, C.J. (eds) SIGIR ’94. Springer, London. https://doi.org/10.1007/978-1-4471-2099-5_15

  • DOI: https://doi.org/10.1007/978-1-4471-2099-5_15

  • Publisher Name: Springer, London

  • Print ISBN: 978-3-540-19889-5

  • Online ISBN: 978-1-4471-2099-5

  • eBook Packages: Springer Book Archive
