Abstract
We investigate the effect of paraphrase generation on document retrieval performance. Specifically, we describe experiments where three information sources are used to generate lexical paraphrases of queries posed to the Internet. These information sources are: WordNet, a Webster-based thesaurus, and a combination of Webster and WordNet. Corpus-based information and word-similarity information are then used to rank the paraphrases. We evaluated our mechanism using 404 queries whose answers reside in the LA Times subset of the TREC-9 corpus. Our experiments show that query paraphrasing improves retrieval performance, and that performance is influenced both by the number of paraphrases generated for a query and by their quality. Specifically, the best performance was obtained using WordNet, which improves document recall by 14% and increases the number of questions that can be answered by 8%.
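The generate-then-rank idea in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical toy thesaurus with invented similarity scores and ranks candidates by the product of those scores. The names `TOY_THESAURUS` and `generate_paraphrases` are illustrative only, and the scoring rule is an assumption, not the ranking method used in the paper.

```python
from itertools import product

# Hypothetical toy thesaurus: word -> list of (synonym, similarity) pairs.
# Entries and scores are invented for illustration; they are not taken
# from WordNet, Webster, or the paper.
TOY_THESAURUS = {
    "car": [("automobile", 0.9), ("vehicle", 0.6)],
    "buy": [("purchase", 0.8)],
}


def generate_paraphrases(query, thesaurus, max_paraphrases=5):
    """Return up to max_paraphrases (paraphrase, score) pairs, best first."""
    words = query.lower().split()
    original = " ".join(words)
    # Each word may stay unchanged (score 1.0) or be replaced by a synonym.
    options = [[(w, 1.0)] + thesaurus.get(w, []) for w in words]
    scored = []
    for combo in product(*options):
        candidate = " ".join(w for w, _ in combo)
        if candidate == original:
            continue  # skip the unmodified query
        # Assumed scoring rule: product of per-word similarity scores.
        score = 1.0
        for _, similarity in combo:
            score *= similarity
        scored.append((candidate, round(score, 3)))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:max_paraphrases]


if __name__ == "__main__":
    for paraphrase, score in generate_paraphrases("buy car", TOY_THESAURUS):
        print(f"{score:.2f}  {paraphrase}")
```

Capping the list with `max_paraphrases` reflects the abstract's observation that both the number of paraphrases and their quality influence retrieval performance. A real system would replace the toy dictionary with a lexical resource and a corpus-derived score.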
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zukerman, I., Raskutti, B., Wen, Y. (2002). Experiments in Query Paraphrasing for Information Retrieval. In: McKay, B., Slaney, J. (eds) AI 2002: Advances in Artificial Intelligence. AI 2002. Lecture Notes in Computer Science, vol 2557. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36187-1_3
DOI: https://doi.org/10.1007/3-540-36187-1_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00197-3
Online ISBN: 978-3-540-36187-9
eBook Packages: Springer Book Archive