Multi-document Summarization Based on Unsupervised Clustering

  • Conference paper
Information Retrieval Technology (AIRS 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4182)

Abstract

In this paper, we propose a method for multi-document summarization based on unsupervised clustering. First, the main topics are determined by a clustering strategy based on minimum description length (MDL), which can infer the optimal number of clusters. Then, the problem of multi-document summarization is formalized over those clusters using an entropy-based objective function.
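
The abstract does not state the exact MDL criterion, so the following is only a minimal sketch of how an MDL-style score can pick the number of clusters. It assumes plain k-means with a deterministic farthest-point initialisation, a shared spherical Gaussian noise model, and a two-part code length (negative log-likelihood plus a parameter cost of half a log of the sample size per parameter). The names `kmeans`, `mdl_score`, and `choose_k`, and the toy 2-D points standing in for sentence vectors, are hypothetical and are not taken from the paper.

```python
import math

def dist2(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=50):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing seed.
        far = max(points, key=lambda p: min(dist2(p, c) for c in centers))
        centers.append(list(far))

    def assign():
        return [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]

    for _ in range(iters):
        labels = assign()
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign(), centers

def mdl_score(points, labels, centers):
    """Two-part code length under an assumed shared spherical Gaussian."""
    n, d, k = len(points), len(points[0]), len(centers)
    sse = sum(dist2(p, centers[lab]) for p, lab in zip(points, labels))
    var = max(sse / (n * d), 1e-9)  # floor avoids log(0)
    neg_log_lik = 0.5 * n * d * (math.log(2 * math.pi * var) + 1)
    n_params = k * d + 1  # k centers plus one shared variance
    return neg_log_lik + 0.5 * n_params * math.log(n)

def choose_k(points, max_k):
    """Return the k in 1..max_k with the smallest score, plus all scores."""
    scores = {k: mdl_score(points, *kmeans(points, k)) for k in range(1, max_k + 1)}
    return min(scores, key=scores.get), scores

# Toy data (an assumption for illustration): three well-separated groups.
offsets = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1)]
points = [(cx + dx, cy + dy)
          for cx, cy in [(0, 0), (10, 0), (0, 10)]
          for dx, dy in offsets]

best_k, scores = choose_k(points, max_k=6)
print(best_k)
```

On this toy input the penalty term outweighs the small likelihood gain from splitting a compact group, so the score is lowest at three clusters. A real system would cluster sentence or passage vectors and would likely need a richer model than a single shared variance.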

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ji, P. (2006). Multi-document Summarization Based on Unsupervised Clustering. In: Ng, H.T., Leong, MK., Kan, MY., Ji, D. (eds) Information Retrieval Technology. AIRS 2006. Lecture Notes in Computer Science, vol 4182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11880592_46

  • DOI: https://doi.org/10.1007/11880592_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45780-0

  • Online ISBN: 978-3-540-46237-8

  • eBook Packages: Computer Science, Computer Science (R0)
