Text-to-Text Generation for Question Answering

Bosma, Wauter; Marsi, Erwin; Krahmer, Emiel; Theune, Mariët

doi:10.1007/978-3-642-17525-1_6

Part of the book series: Theory and Applications of Natural Language Processing (NLP)

Abstract

When answering questions, the major challenges are (a) to carefully determine the content of the answer and (b) to phrase it properly. In IMIX, we focus on two text-to-text generation techniques to accomplish this: content selection and sentence fusion. Using content selection, we can extend answers to an arbitrary length, providing not just a direct answer but also related information so as to better address the user’s information need. In this process, we use a graph-based model to generate coherent answers. We then apply sentence fusion to combine partial answers from different sources into a single, more complete answer while avoiding redundancy. The fusion process involves syntactic parsing, tree alignment and surface string generation.
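
As a rough illustration of graph-based content selection, the sketch below treats sentences as nodes, weights edges by word overlap, and lets relevance to the question spread one step along those edges, so that a sentence related to a direct answer can be included even if it shares no words with the question. This is a minimal sketch under our own assumptions, not the model used in IMIX: the Jaccard overlap measure, the one-hop propagation, and the names `tokens`, `overlap` and `select_content` are all hypothetical choices made for the example.

```python
import re


def tokens(text):
    """Return the set of lowercased words in a sentence."""
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap(a, b):
    """Jaccard similarity between two token sets (assumed edge weight)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_content(question, sentences, max_sentences=2):
    """Select sentences relevant to the question, directly or via one graph hop."""
    bags = [tokens(s) for s in sentences]
    query = tokens(question)
    n = len(sentences)

    # Direct relevance: similarity between the question and each sentence.
    direct = [overlap(query, bag) for bag in bags]

    scores = []
    for i in range(n):
        # Indirect relevance: the neighbour's direct relevance, discounted
        # by the weight of the edge between the two sentences.
        indirect = max(
            (direct[j] * overlap(bags[i], bags[j]) for j in range(n) if j != i),
            default=0.0,
        )
        scores.append(max(direct[i], indirect))

    # Keep the highest-scoring sentences, then restore source order.
    ranked = sorted(range(n), key=lambda i: (-scores[i], i))[:max_sentences]
    return [sentences[i] for i in sorted(ranked)]


if __name__ == "__main__":
    candidates = [
        "RSI is caused by repetitive movements.",
        "Repetitive movements strain the tendons.",
        "The weather was sunny yesterday.",
    ]
    for sentence in select_content("What causes RSI?", candidates):
        print(sentence)
```

Raising `max_sentences` extends the answer with progressively less closely related sentences, which mirrors the idea of extending an answer to an arbitrary length; a fuller model would also have to handle ordering and coherence between the selected sentences.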

Author information

Authors and Affiliations

VU University Amsterdam, Amsterdam, The Netherlands
Wauter Bosma
Norwegian University of Science and Technology, Trondheim, Norway
Erwin Marsi
Tilburg University, Tilburg, The Netherlands
Emiel Krahmer
University of Twente, Enschede, The Netherlands
Mariët Theune

Corresponding author

Correspondence to Wauter Bosma.

Editor information

Editors and Affiliations

Fac. Humanities, Tilburg University, Tilburg, Netherlands
Antal van den Bosch
Information Science, University of Groningen, NL-9700 AS Groningen, Netherlands
Gosse Bouma

About this chapter

Cite this chapter

Bosma, W., Marsi, E., Krahmer, E., Theune, M. (2011). Text-to-Text Generation for Question Answering. In: van den Bosch, A., Bouma, G. (eds) Interactive Multi-modal Question-Answering. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17525-1_6

DOI: https://doi.org/10.1007/978-3-642-17525-1_6
Published: 08 April 2011
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17524-4
Online ISBN: 978-3-642-17525-1
eBook Packages: Computer Science, Computer Science (R0)
