Topic and sentiment aware microblog summarization for twitter

Ali, Syed Muhammad; Noorian, Zeinab; Bagheri, Ebrahim; Ding, Chen; Al-Obeidat, Feras

doi:10.1007/s10844-018-0521-8


Published: 08 August 2018

Journal of Intelligent Information Systems
Volume 54, pages 129–156 (2020)

Syed Muhammad Ali¹,
Zeinab Noorian¹,
Ebrahim Bagheri¹,
Chen Ding² &
Feras Al-Obeidat³


Abstract

Recent advances in microblog content summarization have primarily viewed this task in the context of traditional multi-document summarization techniques, where a microblog post or a collection of posts forms one document. While these techniques already facilitate information aggregation, categorization and visualization of microblog posts, they fall short in two respects: i) when summarizing a certain topic from microblog content, not all existing techniques take topic polarity into account. This is an important consideration because the summary of a topic should cover all aspects of the topic, and taking polarity (sentiment) into account can lead to the inclusion of the less popular polarity in the summarization process. ii) Some summarization techniques produce summaries at the topic level. However, a given topic can have more than one important aspect that needs to be represented in the summarization process. Our work in this paper addresses these two challenges by considering both topic sentiments and topic aspects in tandem. We compare our work with state-of-the-art Twitter summarization techniques and show that our method outperforms existing methods on standard metrics such as ROUGE-1.



Author information

Authors and Affiliations

Laboratory for Systems, Software and Semantics (LS3), Ryerson University, Toronto, Canada
Syed Muhammad Ali, Zeinab Noorian & Ebrahim Bagheri
Department of Computer Science, Ryerson University, Toronto, ON, Canada
Chen Ding
College of Technological Innovation, Zayed University, Dubai, United Arab Emirates
Feras Al-Obeidat

Corresponding author

Correspondence to Ebrahim Bagheri.

Appendix

Table 8 shows the topics used in our experiments and the number of tweets associated with each. Note that the internal cohesion of all the topics is 1. Table 9 shows sample summaries generated by different clustering algorithms, along with the manual summary generated by our volunteers for the topic in Table 10. The extracted aspects are reported in Table 11, and the tweets assigned to the respective aspects are reported in Table 12. Finally, we pick one representative tweet for each sentiment-aspect pair in order to generate the summary shown in Table 13.
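The final step described above, picking one representative tweet per sentiment-aspect pair, can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the `summarize` function, the whitespace tokenizer, the Jaccard-based centrality criterion, and the sample tweets are all assumptions introduced here, since this section does not state how the representative tweet is chosen.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase whitespace tokenization; a stand-in for real tweet preprocessing."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def summarize(tweets):
    """Pick one representative tweet per (aspect, sentiment) pair.

    `tweets` is a list of (text, aspect, sentiment) triples. The representative
    is the tweet with the highest total Jaccard similarity to the other tweets
    in its group. This criterion is an assumption, not taken from the paper.
    """
    groups = defaultdict(list)
    for text, aspect, sentiment in tweets:
        groups[(aspect, sentiment)].append(text)

    summary = {}
    for key, texts in groups.items():
        token_sets = [tokenize(t) for t in texts]

        def centrality(i):
            # Sum of similarities between tweet i and every other tweet in the group.
            return sum(
                jaccard(token_sets[i], token_sets[j])
                for j in range(len(texts))
                if j != i
            )

        best = max(range(len(texts)), key=centrality)
        summary[key] = texts[best]
    return summary


# Hypothetical tweets, loosely themed on a snowfall topic; not data from the paper.
tweets = [
    ("snow closed the roads again", "roads", "negative"),
    ("roads closed because of heavy snow", "roads", "negative"),
    ("stuck at home, roads are a mess", "roads", "negative"),
    ("kids love the snow day", "school", "positive"),
]

for (aspect, sentiment), text in summarize(tweets).items():
    print(f"{aspect} / {sentiment}: {text}")
```

Grouping by the pair rather than by aspect alone means a minority sentiment still contributes one tweet to the summary, which matches the motivation given in the abstract for including the less popular polarity.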

Table 8 Topics and their associated tweets in our experiments

Table 9 Sample generated summary for different clustering algorithms

Table 10 Tweet corpus for the snowfall topic with associated sentiments

Table 11 Aspects extracted from the word graph based on the tweets and their sentiments

Table 12 Selected tweets for two different aspects

Table 13 The set of summary tweets for the two aspects

About this article

Cite this article

Ali, S.M., Noorian, Z., Bagheri, E. et al. Topic and sentiment aware microblog summarization for twitter. J Intell Inf Syst 54, 129–156 (2020). https://doi.org/10.1007/s10844-018-0521-8

Received: 01 October 2017
Revised: 26 July 2018
Accepted: 27 July 2018
Published: 08 August 2018
Issue Date: February 2020
DOI: https://doi.org/10.1007/s10844-018-0521-8
