Topic and sentiment aware microblog summarization for Twitter

  • Syed Muhammad Ali
  • Zeinab Noorian
  • Ebrahim Bagheri
  • Chen Ding
  • Feras Al-Obeidat


Recent advances in microblog content summarization have primarily viewed this task in the context of traditional multi-document summarization techniques, where a microblog post or a collection of posts forms one document. While these techniques already facilitate information aggregation, categorization and visualization of microblog posts, they fall short in two respects: i) When summarizing a certain topic from microblog content, not all existing techniques take topic polarity into account. This is an important consideration because the summary of a topic should cover all aspects of that topic; taking polarity (sentiment) into account can therefore lead to the inclusion of the less popular polarity in the summary. ii) Some summarization techniques produce summaries at the topic level. However, a given topic can have more than one important aspect, each of which needs to be represented in the summary. Our work in this paper addresses these two challenges by considering topic sentiments and topic aspects in tandem. We compare our work with state-of-the-art Twitter summarization techniques and show that our method outperforms existing methods on standard metrics such as ROUGE-1.
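
For readers unfamiliar with the metric named above, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The following is a minimal sketch, not the evaluation setup used in the paper: the function name `rouge_1`, the lowercasing, and the whitespace tokenization are all assumptions made for illustration, and published results typically rely on an established ROUGE toolkit with its own preprocessing.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Illustrative ROUGE-1 (unigram overlap) between two texts.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example texts, not taken from the paper's dataset.
scores = rouge_1("the match was great", "the match was a great win")
print(scores)
```

Recall is the variant most often associated with ROUGE, since it rewards a summary for covering the reference content; precision and F1 are included here because reports vary in which of the three they quote.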


Microblogging, Twitter, Summarization, Topic Modeling



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Syed Muhammad Ali (1)
  • Zeinab Noorian (1)
  • Ebrahim Bagheri (1)
  • Chen Ding (2)
  • Feras Al-Obeidat (3)

  1. Laboratory for Systems, Software and Semantics (LS3), Ryerson University, Toronto, Canada
  2. Department of Computer Science, Ryerson University, Toronto, Canada
  3. College of Technological Innovation, Zayed University, Dubai, United Arab Emirates
