An Aspect-Driven Random Walk Model for Topic-Focused Multi-document Summarization

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7097)

Abstract

Recently, there has been increased interest in topic-focused multi-document summarization, where the task is to produce automatic summaries in response to a given topic or to specific information requested by the user. In this paper, we incorporate a deeper semantic analysis of the source documents to select important concepts, using a predefined list of important aspects as a guide for selecting the most relevant sentences for the summaries. We exploit these aspects to build a novel methodology for topic-focused multi-document summarization that operates on a Markov chain tuned to extract the most important sentences by following a random walk paradigm. Our evaluations suggest that augmenting the random walk model with important aspects can improve summary quality by up to 19.22% over the plain random walk model.
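
The abstract describes the approach only at a high level, so the following is a minimal sketch of the general idea of a random walk over sentences that is biased toward a list of aspects. It is not the authors' formulation: the function name `rank_sentences`, the bag-of-words cosine similarity, the aspect-count bias, and the `damping` value of 0.85 are all assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_sentences(sentences, aspect_terms, damping=0.85, iters=100, tol=1e-9):
    """Score sentences with an aspect-biased random walk (hypothetical sketch).

    Returns one score per sentence; the scores sum to 1.
    """
    bags = [Counter(s.lower().split()) for s in sentences]
    n = len(bags)
    if n == 0:
        return []
    aspects = {t.lower() for t in aspect_terms}

    # Assumed bias: restart probability proportional to the number of
    # distinct aspect terms a sentence mentions (uniform if none match).
    rel = [sum(1 for w in bag if w in aspects) for bag in bags]
    total = sum(rel)
    bias = [r / total for r in rel] if total else [1.0 / n] * n

    # Row-normalised similarity matrix: the Markov chain's transition matrix.
    trans = []
    for i in range(n):
        row = [cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        s = sum(row)
        trans.append([x / s for x in row] if s else [1.0 / n] * n)

    # Power iteration toward the stationary distribution of the biased walk.
    scores = [1.0 / n] * n
    for _ in range(iters):
        new = [
            (1 - damping) * bias[j]
            + damping * sum(scores[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
        converged = sum(abs(a - b) for a, b in zip(new, scores)) < tol
        scores = new
        if converged:
            break
    return scores
```

Calling `rank_sentences(sentences, aspect_terms)` and taking the highest-scoring sentences gives an extractive summary under these assumptions. A sentence is rewarded both for mentioning aspect terms and for being similar to other highly ranked sentences.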

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chali, Y., Hasan, S.A., Imam, K. (2011). An Aspect-Driven Random Walk Model for Topic-Focused Multi-document Summarization. In: Salem, M.V.M., Shaalan, K., Oroumchian, F., Shakery, A., Khelalfa, H. (eds) Information Retrieval Technology. AIRS 2011. Lecture Notes in Computer Science, vol 7097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25631-8_35

  • DOI: https://doi.org/10.1007/978-3-642-25631-8_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25630-1

  • Online ISBN: 978-3-642-25631-8

  • eBook Packages: Computer Science, Computer Science (R0)
