Abstract
It has often been thought that word sense ambiguity is a cause of poor performance in Information Retrieval (IR) systems. The belief is that if ambiguous words can be correctly disambiguated, IR performance will increase. However, recent research into the application of a word sense disambiguator to an IR system failed to show any performance increase. From these results it has become clear that more basic research is needed to investigate the relationship between sense ambiguity, disambiguation, and IR.
Using a technique that introduces additional sense ambiguity into a collection, this paper presents research that goes beyond previous work in this field to reveal the influence that ambiguity and disambiguation have on a probabilistic IR system. We conclude that word sense ambiguity is only problematic to an IR system when it is retrieving from very short queries. In addition we argue that if a word sense disambiguator is to be of any use to an IR system, the disambiguator must be able to resolve word senses to a high degree of accuracy.
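The abstract does not spell out how additional sense ambiguity is introduced into a collection. One possible way to do it, assumed here purely for illustration, is to conflate small groups of unrelated words into a single artificial token, so that each token carries several unrelated meanings. The function names, the grouping strategy, and the toy documents below are hypothetical and are not taken from the paper.

```python
import random


def build_conflation_map(vocabulary, group_size, seed=0):
    """Randomly partition the vocabulary into groups of `group_size` words
    and map every word in a group to one shared artificial token.

    Assumption: random grouping; the paper's actual procedure may differ.
    """
    words = sorted(vocabulary)          # sort first so the shuffle is reproducible
    rng = random.Random(seed)
    rng.shuffle(words)
    mapping = {}
    for start in range(0, len(words), group_size):
        group = words[start:start + group_size]
        token = "/".join(sorted(group))  # e.g. "bank/loan"
        for word in group:
            mapping[word] = token
    return mapping


def add_ambiguity(documents, mapping):
    """Replace each word occurrence with its (more ambiguous) shared token."""
    return [[mapping.get(tok, tok) for tok in doc] for doc in documents]


# Toy collection (hypothetical data).
docs = [["bank", "river", "water"], ["bank", "money", "loan"]]
vocab = {tok for doc in docs for tok in doc}

mapping = build_conflation_map(vocab, group_size=2)
ambiguous_docs = add_ambiguity(docs, mapping)
```

Because the original word behind every token is known, the fully disambiguated collection is available for comparison. Retrieval runs on `docs` and on `ambiguous_docs` can then be compared to estimate how much the added ambiguity costs, and a disambiguator of a chosen accuracy could be simulated by restoring the original word for only a fraction of the token occurrences.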
Copyright information
© 1994 Springer-Verlag London Limited
About this paper
Cite this paper
Sanderson, M. (1994). Word Sense Disambiguation and Information Retrieval. In: Croft, B.W., van Rijsbergen, C.J. (eds) SIGIR ’94. Springer, London. https://doi.org/10.1007/978-1-4471-2099-5_15
DOI: https://doi.org/10.1007/978-1-4471-2099-5_15
Publisher Name: Springer, London
Print ISBN: 978-3-540-19889-5
Online ISBN: 978-1-4471-2099-5
eBook Packages: Springer Book Archive