Multi-document Summarization Exploiting Semantic Analysis Based on Tag Cluster

Heu, Jee-Uk; Jeong, Jin-Woo; Qasim, Iqbal; Joo, Young-Do; Cho, Joon-Myun; Lee, Dong-Ho

doi:10.1007/978-3-642-35728-2_46

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7733)

Abstract

Multi-document summarization techniques aim to condense a set of documents into a small set of words or paragraphs that conveys the main meaning of the originals. Many approaches have used probability-based methods and machine learning techniques to summarize, at once, multiple documents that share a common topic. However, these techniques fail to semantically analyze proper nouns and newly coined words, because most of them depend on a conventional dictionary or thesaurus. To overcome these drawbacks, we propose a novel multi-document summarization technique that employs tag clusters on Flickr, a folksonomy system, to detect key sentences in multiple documents. We first create a word frequency table and use the HITS algorithm to analyze the semantics and contribution of words. Then, by exploiting tag clusters, we analyze the semantic relationships between the words in the word frequency table. Experimental results on the TAC 2008 and TAC 2009 data sets demonstrate that the proposed framework improves on existing summarization systems.
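
The abstract does not spell out how the word frequency table and HITS are combined, so the following is only a minimal sketch under stated assumptions: sentences are treated as hubs, words as authorities, and term frequencies as edge weights in a bipartite graph. The function names (`tokenize`, `hits_word_scores`) and the toy sentences are hypothetical, and the tag-cluster step is not modeled here.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    # Lowercase alphabetic tokens only; a real system would also drop stop words.
    return re.findall(r"[a-z]+", sentence.lower())


def hits_word_scores(sentences, iterations=50):
    """Run HITS on a sentence-word bipartite graph (an assumed construction).

    Sentences are hubs, words are authorities, and the edge weight between a
    sentence and a word is the word's frequency in that sentence.
    """
    tf = [Counter(tokenize(s)) for s in sentences]  # word frequency table
    vocab = sorted({w for counts in tf for w in counts})
    hub = [1.0] * len(sentences)
    auth = {w: 1.0 for w in vocab}

    for _ in range(iterations):
        # Authority update: a word is important if important sentences use it.
        auth = {w: sum(hub[i] * tf[i][w] for i in range(len(tf))) for w in vocab}
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {w: v / norm for w, v in auth.items()}

        # Hub update: a sentence is important if it contains important words.
        hub = [sum(auth[w] * n for w, n in counts.items()) for counts in tf]
        norm = math.sqrt(sum(v * v for v in hub)) or 1.0
        hub = [v / norm for v in hub]

    return auth, hub


# Toy input, invented for illustration only.
sentences = [
    "Flickr tags describe photos.",
    "Tags cluster related tags.",
    "Weather is nice.",
]
auth, hub = hits_word_scores(sentences)
top_word = max(auth, key=auth.get)
best_sentence = max(range(len(sentences)), key=lambda i: hub[i])
```

In this sketch the authority scores stand in for the "contribution of words", and the hub scores give one possible ranking of candidate key sentences. Relating words through Flickr tag clusters, as the paper proposes, would be a separate step applied to the same table.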

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Hanyang University, Ansan, Kyeonggi-do, Korea
Jee-Uk Heu, Jin-Woo Jeong, Iqbal Qasim & Dong-Ho Lee
Division of Computer Media Engineering, Kangnam University, Yongin, Kyeonggi-do, Korea
Young-Do Joo
SmartTV Research Center, ETRI (Electronics and Telecommunications Research Institute), Korea
Joon-Myun Cho

Editor information

Editors and Affiliations

Microsoft Research Asia, 5 Danling Street, 100080, Beijing, China
Shipeng Li & Tao Mei
School of Electrical Engineering and Computer Science, University of Ottawa, 800 King Edward, K1N 6N5, Ottawa, ON, Canada
Abdulmotaleb El Saddik
School of Computer and Information, Hefei University of Technology, Road Tunxi 193#, 230009, Hefei, Anhui, China
Meng Wang & Richang Hong
Department of Information Engineering and Computer Science, University of Trento, Via Sommarive 14, 38100, Trento, Italy
Nicu Sebe
Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, 117583, Singapore, Singapore
Shuicheng Yan
School of Computing, CLARITY: Centre for Sensor Web Technologies, Dublin City University, Glasnevin, 9, Dublin, Ireland
Cathal Gurrin

Cite this paper

Heu, JU., Jeong, JW., Qasim, I., Joo, YD., Cho, JM., Lee, DH. (2013). Multi-document Summarization Exploiting Semantic Analysis Based on Tag Cluster. In: Li, S., et al. Advances in Multimedia Modeling. Lecture Notes in Computer Science, vol 7733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35728-2_46

DOI: https://doi.org/10.1007/978-3-642-35728-2_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35727-5
Online ISBN: 978-3-642-35728-2
eBook Packages: Computer Science (R0)
