Abstract
Two major challenges in answering questions are (a) carefully determining the content of the answer and (b) phrasing it properly. In IMIX, we focus on two text-to-text generation techniques to address these challenges: content selection and sentence fusion. Using content selection, we can extend answers to an arbitrary length, providing not just a direct answer but also related information, so as to better address the user’s information need. In this process, we use a graph-based model to generate coherent answers. We then apply sentence fusion to combine partial answers from different sources into a single, more complete answer while avoiding redundancy. The fusion process involves syntactic parsing, tree alignment, and surface string generation.
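To make the idea of graph-based content selection more concrete, the following is a minimal sketch and not the model used in IMIX. It assumes that sentences are graph nodes, that edge weights are plain word overlap (Jaccard), and that salience spreads outward from the sentence holding the direct answer. The function names (`tokenize`, `overlap`, `select_content`) and the parameters `extra`, `decay`, and `rounds` are hypothetical and chosen only for illustration.

```python
def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    return {w.strip(".,;:!?()\"'").lower() for w in text.split()} - {""}


def overlap(a, b):
    """Jaccard overlap of two token sets (an assumed stand-in for relatedness)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_content(sentences, direct_index, extra=1, decay=0.5, rounds=2):
    """Return sorted indices of the direct answer plus the `extra` most related sentences."""
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Build the graph: nodes are sentences, edge weights are token overlap.
    weight = [
        [overlap(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # Salience starts at the direct answer and spreads along weighted edges.
    salience = [0.0] * n
    salience[direct_index] = 1.0
    for _ in range(rounds):
        spread = [
            max(salience[i] * weight[i][j] * decay for i in range(n))
            for j in range(n)
        ]
        salience = [max(old, new) for old, new in zip(salience, spread)]
    # Rank the remaining sentences by salience; ties keep document order.
    ranked = sorted(
        (i for i in range(n) if i != direct_index),
        key=lambda i: (-salience[i], i),
    )
    return sorted([direct_index] + ranked[:extra])


sentences = [
    "RSI is caused by repetitive movements.",
    "Repetitive movements strain the muscles and tendons.",
    "The weather in Tilburg is mild.",
]
selected = select_content(sentences, direct_index=0, extra=1)
print([sentences[i] for i in selected])
```

Raising `extra` extends the answer with further related sentences, which mirrors the abstract's point that answers can grow to an arbitrary length. The overlap measure is the simplest possible choice here, and any other relatedness score could be substituted.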
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Bosma, W., Marsi, E., Krahmer, E., Theune, M. (2011). Text-to-Text Generation for Question Answering. In: van den Bosch, A., Bouma, G. (eds) Interactive Multi-modal Question-Answering. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17525-1_6
DOI: https://doi.org/10.1007/978-3-642-17525-1_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17524-4
Online ISBN: 978-3-642-17525-1
eBook Packages: Computer Science, Computer Science (R0)