Abstract
Web queries often return a large number of documents, leaving the user overwhelmed by the information. Query-specific extractive summarization of a selected set of retrieved documents helps the user get the gist of the information. Current extractive summarization systems focus on extracting query-relevant sentences from the documents. However, the selected sentences are presented either in the order in which the documents were considered or in the order in which the sentences were selected, and neither ordering guarantees a coherent summary. In this paper, we propose an incremental integrated graph to represent the sentences in a collection of documents. Sentences from the documents are merged into a master sequence to improve coherence and flow, and the same ordering is used to sequence the sentences in the extracted summary. User evaluations indicate that the proposed technique markedly improves user satisfaction with the coherence of the summary.
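The idea of reusing a master sequence to order summary sentences can be illustrated with a small Python sketch. It is not the paper's incremental integrated graph, whose construction the abstract does not describe. The merge strategy below (anchoring on sentences shared between documents) and the names `build_master_sequence` and `order_summary` are assumptions made purely for illustration.

```python
from typing import List, Sequence


def build_master_sequence(documents: Sequence[Sequence[str]]) -> List[str]:
    """Merge documents into one ordering (hypothetical, naive strategy).

    A sentence already in the sequence is reused as an alignment point;
    a new sentence is placed right after the previously handled sentence
    of its own document.
    """
    master: List[str] = []
    for doc in documents:
        insert_at = 0  # slot just after the last handled sentence of this doc
        for sentence in doc:
            if sentence in master:
                # Shared sentence: continue inserting after the existing copy.
                insert_at = master.index(sentence) + 1
            else:
                master.insert(insert_at, sentence)
                insert_at += 1
    return master


def order_summary(selected: Sequence[str], master: Sequence[str]) -> List[str]:
    """Order the extracted sentences by their position in the master sequence."""
    position = {sentence: i for i, sentence in enumerate(master)}
    # Sentences missing from the master sequence (an edge case) go last.
    return sorted(selected, key=lambda s: position.get(s, len(master)))


# Toy data: single letters stand in for full sentences.
doc1 = ["a", "b", "c"]
doc2 = ["a", "x", "c", "y"]
master = build_master_sequence([doc1, doc2])
summary = order_summary(["y", "b", "a"], master)
print(master)
print(summary)
```

In this sketch the summary order depends only on the master sequence, not on the order in which sentences were extracted, which is the property the abstract highlights.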
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chowdary, C.R., Sreenivasa Kumar, P. (2008). Sentence Ordering for Coherent Multi-document Summary Generation. In: Gray, A., Jeffery, K., Shao, J. (eds) Sharing Data, Information and Knowledge. BNCOD 2008. Lecture Notes in Computer Science, vol 5071. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70504-8_5
DOI: https://doi.org/10.1007/978-3-540-70504-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70503-1
Online ISBN: 978-3-540-70504-8
eBook Packages: Computer Science, Computer Science (R0)