Abstract
Although traditional text summarization aims to generate summaries that cover diverse information, most such applications do not explicitly define the information structure. Because the structure that should guide summarization is unclear, it is difficult to generate truly structure-aware summaries. In this paper, we present a novel framework for generating guided summaries of product reviews. A guided summary has an explicitly defined structure derived from the important aspects of a product. The proposed framework attempts to maximize expected aspect satisfaction during summary generation, and the importance of an aspect to a generated summary is modeled using Labeled Latent Dirichlet Allocation. Experimental results on consumer reviews of cars show the effectiveness of our method.
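The abstract does not give the objective in closed form, so the following is only one possible reading of "maximizing expected aspect satisfaction", sketched under stated assumptions. It assumes that each aspect has an importance weight, that each candidate sentence has a probability of satisfying each aspect, and that sentences are picked greedily. The function names, the greedy strategy, and all numbers are hypothetical; in the paper the aspect importance is modeled with Labeled Latent Dirichlet Allocation, which this sketch replaces with hand-set values.

```python
from typing import Dict, List


def expected_satisfaction(
    selected: List[str],
    aspect_weight: Dict[str, float],
    coverage: Dict[str, Dict[str, float]],
) -> float:
    """Sum over aspects of weight * P(at least one selected sentence satisfies it).

    Assumption: sentences satisfy an aspect independently of each other.
    """
    total = 0.0
    for aspect, weight in aspect_weight.items():
        miss = 1.0  # probability that no selected sentence satisfies the aspect
        for sentence_id in selected:
            miss *= 1.0 - coverage[sentence_id].get(aspect, 0.0)
        total += weight * (1.0 - miss)
    return total


def greedy_summary(
    sentences: List[str],
    aspect_weight: Dict[str, float],
    coverage: Dict[str, Dict[str, float]],
    budget: int,
) -> List[str]:
    """Greedily add the sentence that most increases expected satisfaction."""
    selected: List[str] = []
    remaining = list(sentences)
    while remaining and len(selected) < budget:
        best = max(
            remaining,
            key=lambda s: expected_satisfaction(selected + [s], aspect_weight, coverage),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical car-review example: hand-set aspect weights and coverage scores.
aspect_weight = {"comfort": 0.5, "engine": 0.3, "price": 0.2}
coverage = {
    "s1": {"comfort": 0.9},
    "s2": {"comfort": 0.8, "engine": 0.1},
    "s3": {"engine": 0.7},
    "s4": {"price": 0.6},
}
summary = greedy_summary(["s1", "s2", "s3", "s4"], aspect_weight, coverage, budget=2)
print(summary)
```

With these toy values the second pick is a sentence about a different aspect rather than a second sentence about comfort, because an already-covered aspect adds little further satisfaction. That diminishing return is what would push a summary toward an explicit aspect structure under this reading.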
Additional information
This work was partly supported by the National Natural Science Foundation of China under Grant Nos. 60973104 and 60803075, and with the aid of a grant from the International Development Research Center, Ottawa, Canada IRCI Project.
About this article
Cite this article
Jin, F., Huang, ML. & Zhu, XY. Guided Structure-Aware Review Summarization. J. Comput. Sci. Technol. 26, 676–684 (2011). https://doi.org/10.1007/s11390-011-1167-y
DOI: https://doi.org/10.1007/s11390-011-1167-y