Word Sense Disambiguation and Information Retrieval

Sanderson, Mark

doi:10.1007/978-1-4471-2099-5_15



Abstract

It has often been thought that word sense ambiguity is a cause of poor performance in Information Retrieval (IR) systems. The belief is that if ambiguous words can be correctly disambiguated, IR performance will increase. However, recent research into the application of a word sense disambiguator to an IR system failed to show any performance increase. From these results it has become clear that more basic research is needed to investigate the relationship between sense ambiguity, disambiguation, and IR.

Using a technique that introduces additional sense ambiguity into a collection, this paper presents research that goes beyond previous work in this field to reveal the influence that ambiguity and disambiguation have on a probabilistic IR system. We conclude that word sense ambiguity is only problematic to an IR system when it is retrieving from very short queries. In addition, we argue that if a word sense disambiguator is to be of any use to an IR system, it must be able to resolve word senses to a high degree of accuracy.
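
The abstract does not spell out how the additional ambiguity is introduced. One plausible way to picture it, offered here only as a minimal sketch under stated assumptions, is to merge groups of unrelated words into a single artificial token, so that each token carries several "senses", and then to simulate a disambiguator whose accuracy can be set by hand. The function names, the grouping scheme, and the error model below are all hypothetical and are not taken from the paper.

```python
import random


def make_pseudo_words(vocabulary, group_size, seed=0):
    """Map each word to an artificial token shared by `group_size` words.

    Hypothetical helper: words left over after grouping stay unmapped.
    Use group_size >= 2 so that every token is genuinely ambiguous.
    """
    rng = random.Random(seed)
    words = sorted(vocabulary)
    rng.shuffle(words)
    mapping = {}
    for start in range(0, len(words) - group_size + 1, group_size):
        group = words[start:start + group_size]
        token = "/".join(group)  # one surface form, several "senses"
        for word in group:
            mapping[word] = token
    return mapping


def add_ambiguity(tokens, mapping):
    """Replace every mapped word with its shared ambiguous token."""
    return [mapping.get(tok, tok) for tok in tokens]


def simulate_disambiguation(tokens, mapping, accuracy, seed=0):
    """Recover each original word with probability `accuracy`; otherwise
    return a different member of the same group (a simulated error)."""
    rng = random.Random(seed)
    groups = {}
    for word, token in mapping.items():
        groups.setdefault(token, []).append(word)
    resolved = []
    for tok in tokens:
        if tok not in mapping:
            resolved.append(tok)  # unambiguous word, nothing to resolve
        elif rng.random() < accuracy:
            resolved.append(tok)  # correct resolution
        else:
            wrong = [w for w in groups[mapping[tok]] if w != tok]
            resolved.append(rng.choice(wrong))  # incorrect resolution
    return resolved


# Illustrative usage with a toy vocabulary (assumed data).
vocab = {"bank", "river", "spring", "coil"}
mapping = make_pseudo_words(vocab, group_size=2)
doc = ["the", "bank", "by", "the", "river"]
print(add_ambiguity(doc, mapping))
print(simulate_disambiguation(doc, mapping, accuracy=0.75))
```

Under this setup, retrieval effectiveness could be compared across the original text, the ambiguous text, and text "disambiguated" at different accuracy levels, which is the kind of comparison the conclusions above rest on.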



Author information

Authors and Affiliations

Department of Computing Science, University of Glasgow, Glasgow, G12 8QQ, UK
Mark Sanderson


Editor information

Editors and Affiliations

Department of Computer Science, University of Massachusetts, 01003, Amherst, MA, USA
Bruce W. Croft
Department of Computer Science, University of Glasgow, G12 8RZ, 8–17 Lilybank Gardens, Glasgow, Scotland
C. J. van Rijsbergen

About this paper

Cite this paper

Sanderson, M. (1994). Word Sense Disambiguation and Information Retrieval. In: Croft, B.W., van Rijsbergen, C.J. (eds) SIGIR ’94. Springer, London. https://doi.org/10.1007/978-1-4471-2099-5_15


DOI: https://doi.org/10.1007/978-1-4471-2099-5_15
Publisher Name: Springer, London
Print ISBN: 978-3-540-19889-5
Online ISBN: 978-1-4471-2099-5
eBook Packages: Springer Book Archive
