iDVS: An Interactive Multi-document Visual Summarization System

Zhang, Yi; Wang, Dingding; Li, Tao

doi:10.1007/978-3-642-23808-6_37

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6913)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

Multi-document summarization is a fundamental tool for understanding documents. Given a collection of documents, most existing multi-document summarization methods automatically generate a static summary for all users using unsupervised learning techniques such as sentence ranking and clustering. However, these methods largely exclude humans from the summarization process: they do not allow user interaction and do not consider users’ feedback, which carries valuable information and can guide summarization. Another limitation is that the generated summaries are displayed as text without any visual representation. To address these limitations, we develop iDVS, a visualization-enabled multi-document summarization system with user interaction, which improves summarization performance using users’ feedback and assists users in document understanding using visualization techniques. In particular, iDVS uses a new semi-supervised document summarization method to dynamically select sentences based on users’ interaction. In this regard, iDVS tightly integrates semi-supervised learning with interactive visualization for document summarization. Comprehensive experiments on multi-document summarization using benchmark datasets demonstrate the effectiveness of iDVS, and a user study is conducted to evaluate users’ satisfaction.
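
The abstract does not specify the semi-supervised method, so the following is only an illustrative sketch of how user feedback could steer sentence selection, not the authors' algorithm. It assumes a generic graph-based label propagation over a sentence-similarity graph: sentences a user marks as relevant act as labeled seeds, and their scores spread to similar sentences. The function names, the `liked` feedback format, the bag-of-words cosine similarity, and the `alpha` and `iterations` parameters are all assumptions made for this example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_with_feedback(sentences, liked, alpha=0.8, iterations=50):
    """Rank sentence indices by a score propagated from user-marked sentences.

    `liked` is a set of indices the user marked as relevant (a hypothetical
    feedback format chosen for this sketch).
    """
    bags = [Counter(s.lower().split()) for s in sentences]
    n = len(bags)
    # Affinity matrix W with a zero diagonal.
    w = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    deg = [sum(row) for row in w]
    s = [[w[i][j] / math.sqrt(deg[i] * deg[j]) if deg[i] and deg[j] else 0.0
          for j in range(n)] for i in range(n)]
    # Seed vector: 1 for sentences the user selected, 0 otherwise.
    y = [1.0 if i in liked else 0.0 for i in range(n)]
    f = y[:]
    # Iterate f <- alpha * S f + (1 - alpha) * y.
    for _ in range(iterations):
        f = [alpha * sum(s[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return sorted(range(n), key=lambda i: f[i], reverse=True)


sentences = [
    "the storm flooded the coastal town",
    "rescue teams reached the flooded town",
    "the stock market closed higher today",
    "investors cheered as the market rose",
]
ranking = rank_with_feedback(sentences, liked={0})
print(ranking)
```

In this toy run the user marks only the first sentence, so sentences that share vocabulary with it should rise in the ranking; re-running with a different `liked` set changes the order, which is the kind of dynamic re-selection the abstract describes. A real system would need proper tokenization and stop-word handling, which this sketch omits.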

Author information

Authors and Affiliations

School of Computer Science, Florida International University, Miami, FL, 33199, USA
Yi Zhang, Dingding Wang & Tao Li

Editor information

Editors and Affiliations

Department of Informatics and Telecommunications, University of Athens, Panepistimioupolis, Ilisia, 15784, Athens, Greece
Dimitrios Gunopulos
Google Switzerland GmbH, Brandschenkestrasse 110, 8002, Zurich, Switzerland
Thomas Hofmann
Department of Computer Science, University of Bari “Aldo Moro”, via Orabona 4, 70125, Bari, Italy
Donato Malerba
Department of Informatics, Athens University of Economics and Business, Patision 76, 10434, Athens, Greece
Michalis Vazirgiannis

Cite this paper

Zhang, Y., Wang, D., Li, T. (2011). iDVS: An Interactive Multi-document Visual Summarization System. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2011. Lecture Notes in Computer Science, vol 6913. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23808-6_37

DOI: https://doi.org/10.1007/978-3-642-23808-6_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23807-9
Online ISBN: 978-3-642-23808-6
eBook Packages: Computer Science, Computer Science (R0)
