Assessing the Impact of Thesaurus-Based Expansion Techniques in QA-Centric IR

Sarmento, Luís; Teixeira, Jorge; Oliveira, Eugénio

doi:10.1007/978-3-642-04447-2_37

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5706)

Abstract

We study the impact of using thesaurus-based query expansion methods at the Information Retrieval (IR) stage of a Question Answering (QA) system. We focus on expanding queries for questions about actions and events, where verbs play a central role. Two different thesauri are used: the OpenOffice thesaurus and an automatically generated verb thesaurus. The performance of the thesaurus-based methods is compared against that obtained by (i) performing no expansion and (ii) applying a simple query generalization method. Results show that thesaurus-based approaches help improve recall at retrieval while keeping satisfactory precision. However, we confirm that the positive impact on final QA performance is mostly due to the increase in recall, which can also be obtained with simpler methods. Nevertheless, because of its better relative precision, thesaurus-based expansion is effective in selectively reducing the number of irrelevant text passages retrieved, thus reducing the computational load in the answer extraction stage.
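The comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy thesaurus, the English lemmatized passages, the Boolean AND retrieval, and in particular the reading of "simple query generalization" as dropping the verb from the query. The names `expand_query`, `generalize_query`, `retrieve`, and `precision_recall` are hypothetical helpers, not part of any described system.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy verb thesaurus (verb -> near-synonyms); not from the paper.
TOY_THESAURUS: Dict[str, List[str]] = {"buy": ["acquire", "purchase"]}


def expand_query(terms: List[str], verb: str,
                 thesaurus: Dict[str, List[str]]) -> List[List[str]]:
    """Return the original query plus one variant per synonym of `verb`."""
    variants = [list(terms)]
    for synonym in thesaurus.get(verb, []):
        variants.append([synonym if t == verb else t for t in terms])
    return variants


def generalize_query(terms: List[str], verb: str) -> List[List[str]]:
    """Assumed generalization: drop the verb and keep the remaining terms."""
    return [[t for t in terms if t != verb]]


def retrieve(variants: List[List[str]], passages: List[str]) -> Set[int]:
    """Boolean AND retrieval: a passage matches if it contains every term
    of at least one query variant."""
    hits: Set[int] = set()
    for idx, passage in enumerate(passages):
        tokens = set(passage.lower().split())
        if any(all(t in tokens for t in variant) for variant in variants):
            hits.add(idx)
    return hits


def precision_recall(retrieved: Set[int],
                     relevant: Set[int]) -> Tuple[float, float]:
    """Precision and recall of a retrieved set against a relevant set."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Toy, already-lemmatized passages; indices 0-2 are treated as relevant.
PASSAGES = [
    "google buy youtube in 2006",
    "google acquire youtube for 1.65 billion",
    "google purchase youtube stock deal",
    "google sue youtube uploader",
    "youtube rival google video",
]
RELEVANT = {0, 1, 2}
QUERY = ["google", "buy", "youtube"]

if __name__ == "__main__":
    strategies = {
        "no expansion": [QUERY],
        "thesaurus expansion": expand_query(QUERY, "buy", TOY_THESAURUS),
        "generalization": generalize_query(QUERY, "buy"),
    }
    for name, variants in strategies.items():
        p, r = precision_recall(retrieve(variants, PASSAGES), RELEVANT)
        print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

On this constructed data, both thesaurus expansion and generalization retrieve all three relevant passages, but generalization also returns the two irrelevant ones, so its precision is lower. The data was chosen to make that contrast visible; it says nothing about the magnitudes reported in the paper.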

Author information

Authors and Affiliations

Laboratorio de Inteligência Artificial e Ciências de Computadores, Faculdade de Engenharia da Universidade do Porto, Rua Dr. Roberto Frias, s/n, 4200-465, Porto, Portugal
Luís Sarmento, Jorge Teixeira & Eugénio Oliveira

Editor information

Editors and Affiliations

Istituto di Scienza e Tecnologie dell’Informazione, CNR, Pisa, Italy
Carol Peters
RWTH Aachen University, Aachen, Germany
Thomas Deselaers
University of Padua, Padua, Italy
Nicola Ferro
LSI-UNED, Madrid, Spain
Julio Gonzalo & Anselmo Peñas
Dublin City University, Dublin 9, Ireland
Gareth J. F. Jones
Helsinki University of Technology, Espoo, Finland
Mikko Kurimo
University of Hildesheim, Hildesheim, Germany
Thomas Mandl
Humboldt University Berlin, Germany
Vivien Petras

About this paper

Cite this paper

Sarmento, L., Teixeira, J., Oliveira, E. (2009). Assessing the Impact of Thesaurus-Based Expansion Techniques in QA-Centric IR. In: Peters, C., et al. Evaluating Systems for Multilingual and Multimodal Information Access. CLEF 2008. Lecture Notes in Computer Science, vol 5706. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04447-2_37

DOI: https://doi.org/10.1007/978-3-642-04447-2_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04446-5
Online ISBN: 978-3-642-04447-2
eBook Packages: Computer Science, Computer Science (R0)
