Abstract
Social media platforms such as Twitter have become extremely popular for exchanging information and opinions. News media sources can exploit the opinions expressed on Twitter to gauge user reactions to individual news articles. A comprehensive summary of the user reactions to a news article is valuable for at least two reasons: (i) it offers insight into the diverse opinions of readers about the news, and (ii) it reveals the key aspects that draw the interest of readers. However, extracting the relevant opinions from tweets is challenging because of the enormous volume of content generated and the difference in vocabulary between social media content and the published article. Existing supervised learning techniques yield poor accuracy because sufficient training data is unavailable and the features of news articles are highly heterogeneous, while unsupervised techniques fail to handle the noise and diversity of the tweets.
In this paper, we propose a network-community-based unsupervised approach that effectively handles noise and diversity in tweet feeds to capture relevant and diverse opinions about a news article. Using a combined metric that considers both relevance and diversity, we show that our proposed approach yields a 16–25% improvement over existing schemes. Results based on human annotations also validate the effectiveness of the extracted summary tweets with respect to specific news articles.
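The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea of grouping tweets in a similarity network and picking one relevant tweet per group. Everything in it is an assumption made for illustration: the crude tokenizer, bag-of-words cosine similarity, the 0.3 edge threshold, the use of connected components as a simple stand-in for community detection, and the helper names `tokens`, `cosine`, `communities`, and `summarize`. It should not be read as the method proposed in the paper.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase alphanumeric tokens; a deliberately crude tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def communities(vectors, threshold):
    """Connected components of the tweet similarity graph.

    An edge joins two tweets whose cosine similarity reaches the threshold.
    Connected components are a simplification of community detection.
    """
    parent = list(range(len(vectors)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def summarize(article, tweets, threshold=0.3):
    """Return, for each group, the tweet most similar to the article."""
    article_vec = Counter(tokens(article))
    vectors = [Counter(tokens(t)) for t in tweets]
    picks = [
        max(group, key=lambda i: cosine(vectors[i], article_vec))
        for group in communities(vectors, threshold)
    ]
    # Preserve the original tweet order in the summary.
    return [tweets[i] for i in sorted(picks)]
```

Calling `summarize(article_text, list_of_tweets)` returns one tweet per group, so relevance is handled within each group and diversity comes from selecting across groups. A real system would likely need a stronger similarity measure and a proper community detection algorithm.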
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Chakraborty, R., Bhavsar, M., Dandapat, S., Chandra, J. (2017). A Network Based Stratification Approach for Summarizing Relevant Comment Tweets of News Articles. In: Bouguettaya, A., et al. Web Information Systems Engineering – WISE 2017. WISE 2017. Lecture Notes in Computer Science, vol 10569. Springer, Cham. https://doi.org/10.1007/978-3-319-68783-4_3
DOI: https://doi.org/10.1007/978-3-319-68783-4_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68782-7
Online ISBN: 978-3-319-68783-4
eBook Packages: Computer Science, Computer Science (R0)