Abstract
Multi-document summarization aims to create a single summary from the information conveyed by a collection of texts. Once the candidate sentences have been identified and ordered, the next step is to select which of them will be included in the summary. In this paper, we describe an approach that uses sentence reduction, both lexical and syntactic, to improve the compression step of the summarization process. Three different algorithms are proposed and discussed. Sentence reduction is performed by removing specific sentential constructions that convey information considered less relevant to the general message of the summary. The rationale is that sentence reduction not only removes expendable information but also makes room for further relevant content in the summary.
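The rationale stated in the abstract can be illustrated with a minimal sketch. It is not one of the three algorithms proposed in the paper, which the abstract does not detail. It assumes, purely for illustration, that parenthetical asides are the expendable construction, that the length limit is a word budget, and that sentences are taken greedily in their given order. The names `reduce_sentence`, `build_summary` and `SENTENCES` are hypothetical.

```python
import re

# Assumption: parenthetical asides stand in for the "expendable" constructions.
# The abstract does not list the constructions the paper actually targets.
PARENTHETICAL = re.compile(r"\s*\([^()]*\)")


def reduce_sentence(sentence: str) -> str:
    """Drop parenthetical asides and collapse any leftover whitespace."""
    reduced = PARENTHETICAL.sub("", sentence)
    return re.sub(r"\s+", " ", reduced).strip()


def word_count(sentence: str) -> int:
    """Count whitespace-separated tokens."""
    return len(sentence.split())


def build_summary(ordered_sentences, max_words, apply_reduction=False):
    """Take sentences in order, stopping at the first one that exceeds the budget."""
    summary, used = [], 0
    for sentence in ordered_sentences:
        candidate = reduce_sentence(sentence) if apply_reduction else sentence
        cost = word_count(candidate)
        if used + cost > max_words:
            break
        summary.append(candidate)
        used += cost
    return summary


# Hypothetical, already ordered candidate sentences.
SENTENCES = [
    "The storm (the third this month) hit the coast on Monday.",
    "Officials (speaking on condition of anonymity) confirmed the damage.",
    "Power was restored on Tuesday.",
]

plain = build_summary(SENTENCES, max_words=16)
reduced = build_summary(SENTENCES, max_words=16, apply_reduction=True)
print(len(plain), len(reduced))
```

With the same 16-word budget, the unreduced run fits only the first sentence, while the reduced run fits all three. That is the effect the abstract describes: removing expendable material frees space for further relevant sentences.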
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Silveira, S.B., Branco, A. (2014). Sentence Reduction Algorithms to Improve Multi-document Summarization. In: Filipe, J., Fred, A. (eds) Agents and Artificial Intelligence. ICAART 2013. Communications in Computer and Information Science, vol 449. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44440-5_16
DOI: https://doi.org/10.1007/978-3-662-44440-5_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44439-9
Online ISBN: 978-3-662-44440-5
eBook Packages: Computer Science (R0)