Multi-document Summarization Exploiting Semantic Analysis Based on Tag Cluster

  • Conference paper
Advances in Multimedia Modeling

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7733)

Abstract

Multi-document summarization techniques aim to reduce a set of documents to a small set of words or paragraphs that conveys the main meaning of the originals. Many approaches to multi-document summarization have used probability-based methods and machine learning techniques to summarize multiple documents sharing a common topic simultaneously. However, these techniques fail to semantically analyze proper nouns and newly coined words because most of them depend on a conventional dictionary or thesaurus. To overcome these drawbacks, we propose a novel multi-document summarization technique that employs tag clusters on Flickr, a kind of folksonomy system, to detect key sentences in multiple documents. We first create a word frequency table and use the HITS algorithm to analyze the semantics and contribution of words. Then, by exploiting tag clusters, we analyze the semantic relationships between the words in the word frequency table. Experimental results on the TAC 2008 and 2009 data sets demonstrate the improvement of our proposed framework over existing summarization systems.
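The abstract does not say how HITS is applied to the word frequency table, so the following is only a minimal sketch under one possible reading: sentences act as hubs, words act as authorities, and an edge is weighted by how often a word occurs in a sentence. The function name `hits_word_scores`, the whitespace tokenization, and the toy sentences are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import defaultdict


def hits_word_scores(sentences, iterations=50):
    """Run HITS on a bipartite sentence-word graph (illustrative assumption).

    Sentences are treated as hubs and words as authorities; the edge weight
    is the number of times a word occurs in a sentence.
    """
    # Build per-sentence word counts (a simple word frequency table).
    edges = []
    for sentence in sentences:
        counts = defaultdict(int)
        for word in sentence.lower().split():
            counts[word] += 1
        edges.append(dict(counts))

    hub = [1.0] * len(sentences)
    auth = {}

    for _ in range(iterations):
        # Authority update: a word scores highly if strong hubs contain it.
        auth = defaultdict(float)
        for s_idx, counts in enumerate(edges):
            for word, weight in counts.items():
                auth[word] += weight * hub[s_idx]
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {word: v / norm for word, v in auth.items()}

        # Hub update: a sentence scores highly if it contains strong words.
        hub = [
            sum(weight * auth[word] for word, weight in counts.items())
            for counts in edges
        ]
        norm = math.sqrt(sum(v * v for v in hub)) or 1.0
        hub = [v / norm for v in hub]

    return auth, hub


# Toy input (hypothetical); the last sentence shares no words with the others.
sentences = [
    "flickr tag cluster groups related tags",
    "tag cluster helps summarization",
    "weather is nice",
]
auth, hub = hits_word_scores(sentences)

# Rank words by authority score and sentences by hub score.
top_words = sorted(auth, key=auth.get, reverse=True)[:3]
best_sentence = sentences[max(range(len(hub)), key=hub.__getitem__)]
```

In this sketch, words that appear in several mutually reinforcing sentences gain authority, and sentences containing such words gain hub score, which is one way a key-sentence ranking could be derived. The tag-cluster step described in the abstract is not modeled here.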

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Heu, JU., Jeong, JW., Qasim, I., Joo, YD., Cho, JM., Lee, DH. (2013). Multi-document Summarization Exploiting Semantic Analysis Based on Tag Cluster. In: Li, S., et al. Advances in Multimedia Modeling. Lecture Notes in Computer Science, vol 7733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35728-2_46

  • DOI: https://doi.org/10.1007/978-3-642-35728-2_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35727-5

  • Online ISBN: 978-3-642-35728-2

  • eBook Packages: Computer Science, Computer Science (R0)
