Abstract
We study the impact of thesaurus-based query expansion methods at the Information Retrieval (IR) stage of a Question Answering (QA) system. We focus on expanding queries for questions about actions and events, where verbs play a central role. Two different thesauri are used: the OpenOffice thesaurus and an automatically generated verb thesaurus. The performance of the thesaurus-based methods is compared against that obtained by (i) performing no expansion and (ii) applying a simple query generalization method. Results show that thesaurus-based approaches help improve recall at retrieval while keeping satisfactory precision. However, we confirm that the positive impact on final QA performance is mostly due to the increase in recall, which can also be obtained with simpler methods. Nevertheless, because of its better relative precision, thesaurus-based expansion is effective in selectively reducing the number of irrelevant text passages retrieved, thus reducing the computational load in the answer extraction stage.
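As a rough illustration of the kind of technique discussed in the abstract, the following Python sketch expands query terms with synonyms from a thesaurus and computes precision and recall for a retrieved set. It is a minimal sketch under stated assumptions, not the method used in the paper: the toy thesaurus, the function names, the `max_synonyms` cut-off, and the Boolean query syntax are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy thesaurus (assumption): it stands in for a real resource
# such as the OpenOffice thesaurus, whose contents are not reproduced here.
TOY_THESAURUS: Dict[str, List[str]] = {
    "buy": ["purchase", "acquire"],
    "win": ["conquer", "gain"],
}


def expand_query(terms: List[str],
                 thesaurus: Dict[str, List[str]],
                 max_synonyms: int = 2) -> List[List[str]]:
    """Return one group of alternatives per query term.

    Each group holds the original term followed by at most `max_synonyms`
    thesaurus entries. Terms missing from the thesaurus are left unexpanded.
    """
    groups = []
    for term in terms:
        synonyms = thesaurus.get(term.lower(), [])[:max_synonyms]
        groups.append([term] + synonyms)
    return groups


def to_boolean_query(groups: List[List[str]]) -> str:
    """Render groups as a Boolean query: OR inside a group, AND across groups."""
    parts = []
    for group in groups:
        if len(group) == 1:
            parts.append(group[0])
        else:
            parts.append("(" + " OR ".join(group) + ")")
    return " AND ".join(parts)


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Compute precision and recall of a retrieved set against a relevant set."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    groups = expand_query(["Google", "buy", "YouTube"], TOY_THESAURUS)
    print(to_boolean_query(groups))
    print(precision_recall({"p1", "p2", "p3", "p4"}, {"p1", "p2", "p5"}))
```

Expanding only the verb widens the set of passages that can match, which tends to raise recall. Capping the number of synonyms per term is one simple way to limit the loss of precision.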
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sarmento, L., Teixeira, J., Oliveira, E. (2009). Assessing the Impact of Thesaurus-Based Expansion Techniques in QA-Centric IR. In: Peters, C., et al. Evaluating Systems for Multilingual and Multimodal Information Access. CLEF 2008. Lecture Notes in Computer Science, vol 5706. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04447-2_37
DOI: https://doi.org/10.1007/978-3-642-04447-2_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04446-5
Online ISBN: 978-3-642-04447-2
eBook Packages: Computer Science, Computer Science (R0)