World Wide Web, Volume 21, Issue 3, pp 609–627

Event summarization for sports games using Twitter streams

Abstract

Given a textual data stream related to an event, social event summarization aims to generate an informative textual description that captures all the important moments; it plays a critical role in mining and analyzing social media streams. In this paper, we present a general social event summarization framework using Twitter streams. The proposed framework consists of three key components: participant detection, sub-event detection, and summary tweet extraction. To make the system applicable to real data, an online clustering approach is developed for participant detection, and an online temporal-content mixture model is proposed for sub-event detection. Experiments show that the proposed framework achieves performance comparable to that of its batch counterpart.

Keywords

Social event summarization, Participant detection, Temporal mixture model, Online update

Notes

Acknowledgements

The work was supported in part by the National Science Foundation under Grant Nos. IIS-1213026, CNS-1126619, and CNS-1461926, the Chinese National Natural Science Foundation under Grant No. 91646116, the Ministry of Education/China Mobile joint research grant under Project No. 5-10, and the Scientific and Technological Support Project (Society) of Jiangsu Province (No. BE2016776).

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Jiangsu BDSIP Key Lab, School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, People’s Republic of China
  2. School of Computing and Information Sciences, Florida International University, Miami, USA
