Event summarization for sports games using twitter streams

World Wide Web

Abstract

Given a textual data stream related to an event, social event summarization aims to generate an informative textual description that captures all the important moments; it plays a critical role in mining and analyzing social media streams. In this paper, we present a general social event summarization framework using Twitter streams. The proposed framework consists of three key components: participant detection, sub-event detection, and summary tweet extraction. To make the system applicable to real data, an online clustering approach is developed for participant detection and an online temporal-content mixture model is proposed for sub-event detection. Experiments show that the proposed framework achieves performance similar to that of its batch counterpart.

Notes

  1. http://projects.ldc.upenn.edu/TDT/

  2. We use “participant sub-events” and “global sub-events” to denote the important moments that happen at the participant level and at the level of the entire event, respectively. A “global sub-event” may consist of one or more “participant sub-events”. For example, the “steal” action in a basketball game typically involves both the defensive and the offensive player, and can be generated by merging the two participant-level sub-events.

  3. We use the algorithm described in [27] as a baseline, ad hoc spike detection algorithm.

  4. β was set to 5 minutes in our experiments.

  5. β was set to 5 minutes in our experiments.

  6. https://dev.twitter.com/docs/streaming-apis

  7. http://espn.go.com/nba/scoreboard

  8. http://espn.go.com/nba/scoreboard
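
Note 2 describes a global sub-event as the result of merging one or more participant-level sub-events. The following is a minimal sketch of one way such a merge could work, not the authors' method: it greedily groups participant sub-events that are close in time. The class names, the `merge_sub_events` function, and the `window` parameter are all hypothetical; the 5-minute default merely echoes the value of β given in the notes, and whether β plays this role in the paper is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ParticipantSubEvent:
    """A sub-event detected for one participant (hypothetical structure)."""
    participant: str
    start: float  # minutes from the start of the game
    end: float


@dataclass
class GlobalSubEvent:
    """An event-level sub-event built from participant-level ones."""
    start: float
    end: float
    participants: List[str] = field(default_factory=list)


def merge_sub_events(events, window=5.0):
    """Greedily merge participant sub-events that are close in time.

    A sub-event joins the current group when it starts no later than
    `window` minutes after that group ends; otherwise it opens a new
    group. `window` is an assumed parameter, not taken from the paper.
    """
    merged = []
    for ev in sorted(events, key=lambda e: e.start):
        if merged and ev.start - merged[-1].end <= window:
            group = merged[-1]
            group.end = max(group.end, ev.end)
            if ev.participant not in group.participants:
                group.participants.append(ev.participant)
        else:
            merged.append(GlobalSubEvent(ev.start, ev.end, [ev.participant]))
    return merged


# Example: a "steal" seen from the defender's and the ball handler's streams,
# followed much later by an unrelated sub-event for a third player.
example = [
    ParticipantSubEvent("defender", 10.0, 11.0),
    ParticipantSubEvent("ball_handler", 10.5, 11.5),
    ParticipantSubEvent("center", 30.0, 31.0),
]
global_events = merge_sub_events(example)
```

In this example the first two participant sub-events overlap in time and collapse into a single global sub-event involving both players, while the third stays separate. A content-aware merge (for instance, one that also compares the tweets attached to each sub-event) would need a similarity measure that this sketch does not attempt.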

References

  1. Allan, J.: Topic Detection and Tracking: Event-based Information Organization. Kluwer Academic Publishers Norwell, MA, USA (2002)

  2. Ahlqvist, T., Beck, A., Halonen, M., Heinonen, S.: Social Media Roadmaps: Exploring the futures triggered by social media. VTT Tiedotteita - Research Notes (2454) (2008)

  3. Atefeh, F., Khreich, W.: A survey of techniques for event detection in twitter. Comput. Intell., 132–164 (2013)

  4. Bagga, A., Baldwin, B.: Algorithms for Scoring Coreference Chains. In: The First International Conference on Language Resources and Evaluation Workshop on Linguistics Coreference (1998)

  5. Becker, H., Naaman, M., Gravano, L.: Beyond Trending Topics: Real-World Event Identification on Twitter. In: Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, pp. 438–441 (2011)

  6. Chakrabarti, D., Punera, K.: Event Summarization Using Tweets. In: Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, pp. 66–73 (2011)

  7. Chang, Y., Wang, X., Mei, Q., Liu, Y.: Towards Twitter Context Summarization with User Influence Models. In: Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, pp. 527–536 (2013)

  8. Diao, Q., Jiang, J., Zhu, F., Lim, E.: Finding Bursty Topics from Microblogs. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, pp. 536–544 (2012)

  9. Daumé III, H., Marcu, D.: Bayesian Query-Focused Summarization. In: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, pp. 305–312 (2006)

  10. Erkan, G., Radev, D.R.: Lexrank: Graph-based lexical centrality as salience in text summarization. JAIR 22(1), 457–479 (2004)

  11. Guille, A., Favre, C.: Event detection, tracking, and visualization in Twitter: a mention-anomaly-based approach. Soc. Netw. Anal. Min., 18:1–18:18 (2015)

  12. Goswami, A., Kumar, A.: A survey of event detection techniques in online social networks. Soc. Netw. Anal. Min., 107 (2016)

  13. Guha, S., Khuller, S.: Approximation algorithms for connected dominating sets. Algorithmica 20(4), 374–387 (1998)

  14. Goldstein, J., Mittal, V., Carbonell, J., Kantrowitz, M.: Multi-Document Summarization by Sentence Extraction. In: NAACL-ANLP 2000 Workshop on Automatic Summarization, Association for Computational Linguistics, pp. 40–48 (2000)

  15. Hofmann, T.: Probabilistic Latent Semantic Indexing. In: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 50–57 (1999)

  16. Hong, L., Dom, B., Gurumurthy, S., Tsioutsiouliklis, K.: A Time-dependent Topic Model for Multiple Text Streams. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 832–840 (2011)

  17. He, R., Liu, Y., Yu, G., Tang, J., Hu, Q., Dang, J.: Twitter summarization with social-temporal context. World Wide Web, 1–24 (2017)

  18. Haghighi, A., Vanderwende, L.: Exploring Content Models for Multi-Document Summarization. In: Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Association for Computational Linguistics, pp. 362–370 (2009)

  19. Imran, M., Castillo, C., Diaz, F., Vieweg, S.: Processing Social Media Messages in Mass Emergency: A Survey. ACM Comput. Surv., 67:1–67:38 (2015)

  20. Inouye, D., Kalita, J.K.: Comparing Twitter Summarization Algorithms for Multiple Post Summaries. In: Proceedings of the 2011 IEEE Third International Conference on Social Computing, pp. 290–306 (2011)

  21. Jurafsky, D., Martin, J.: Speech and language processing. Prentice Hall, New York (2008)

  22. Lee, L.: Measures of Distributional Similarity. In: Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, pp. 25–32 (1999)

  23. Lin, C.-Y.: ROUGE: a Package for Automatic Evaluation of Summaries. In: Text Summarization Branches Out: Proceedings of the ACL-04 Workshop, pp. 74–81 (2004)

  24. Li, Z., Tang, J., Wang, X., Liu, J., Lu, H.: Multimedia News Summarization in Search. ACM Trans. Intell. Syst. Technol., 33:1–33:20 (2016)

  25. Li, T., Xie, N., Zeng, C., Zhou, W., Zheng, L., Jiang, Y., Yang, Y., Ha, H.-Y., Xue, W., Huang, Y., Chen, S.-C., Navlakha, J., Iyengar, S.S.: Data-Driven Techniques in Disaster Information Management. ACM Comput. Surv. 50 (1), 1:1–1:45 (2017)

  26. Mani, I.: Automatic summarization. Comput. Linguist. 28(2)

  27. Marcus, A., Bernstein, M., Badar, O., Karger, D., Madden, S., Miller, R.: Twitinfo: Aggregating and Visualizing Microblogs for Event Exploration. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 227–236 (2011)

  28. Nichols, J., Mahmud, J., Drews, C.: Summarizing Sporting Events Using Twitter. In: Proceedings of the 2012 ACM International Conference on Intelligent User Interfaces, pp. 189–198 (2012)

  29. Nenkova, A., Vanderwende, L.: The impact of frequency on summarization. Microsoft Research, Redmond, Washington, Tech. Rep. MSR-TR-2005-101

  30. Purushotham, S., Kuo, C.-C.J.: Personalized Group Recommender Systems for Location- and Event-Based Social Networks. ACM Trans. Web, 16:1–16:29 (2016)

  31. Peng, B., Li, J., Chen, J., Han, X., Xu, R., Wong, K.F.: Trending Sentiment-Topic detection on twitter. Springer International Publishing, 66–77 (2015)

  32. Petrovic, S., Osborne, M., Lavrenko, V.: Streaming First Story Detection with Application to Twitter. In: Proceedings of the 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 181–189 (2010)

  33. Ritter, A., Clark, S., Mausam, Etzioni, O.: Named Entity Recognition in Tweets: an Experimental Study. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 1524–1534 (2011)

  34. Radev, D., Jing, H., Styś, M., Tam, D.: Centroid-based summarization of multiple documents. Inf. Process. Manag. 40(6), 919–938 (2004)

  35. Ritter, A., Mausam, Etzioni, O., Clark, S.: Open Domain Event Extraction from Twitter. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1104–1112 (2012)

  36. Shen, C., Li, T.: Multi-Document Summarization via the Minimum Dominating Set. In: Proceedings of the 23rd International Conference on Computational Linguistics, Association for Computational Linguistics, pp. 984–992 (2010)

  37. Shen, C., Liu, F., Weng, F., Li, T.: A Participant-based Approach for Event Summarization Using Twitter Streams. In: Proceedings of NAACL-HLT, pp. 1152–1162 (2013)

  38. Tang, J., Yao, L., Chen, D.: Multi-Topic Based Query-oriented Summarization. In: Proceedings of SDM, pp. 1147–1158 (2009)

  39. Takamura, H., Yokono, H., Okumura, M.: Summarizing a Document Stream. In: Proceedings of the 33rd European Conference on Advances in Information Retrieval, pp. 177–188 (2011)

  40. Unankard, S., Li, X., Sharaf, M.A.: Emerging event detection in social networks with location sensitivity. World Wide Web, 1393–1417 (2015)

  41. Wan, X.: Topic Analysis for Topic-Focused Multi-Document Summarization. In: Proceedings of the 18th ACM Conference on Information and Knowledge Management, pp. 1609–1612. ACM (2009)

  42. Weng, J., Lee, B.-S.: Event Detection in Twitter. In: Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, pp. 401–408 (2011)

  43. Wang, D., Li, T., Zhu, S., Ding, C.: Multi-Document Summarization via Sentence-Level Semantic Analysis and Symmetric Matrix Factorization. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 307–314. ACM (2008)

  44. Wan, X., Yang, J., Xiao, J.: Towards an Iterative Reinforcement Approach for Simultaneous Document Summarization and Keyword Extraction. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, pp. 543–552 (2007)

  45. Xue, W., Li, T., Rishe, N.: Aspect identification and ratings inference for hotel reviews. World Wide Web 20(1), 23–37 (2017)

  46. Yamamoto, Y., Shinozaki, T., Ikegami, Y., Tsuruta, S.: Context respectful counseling agent virtualized on the Web. World Wide Web, 1–24 (2015)

  47. Zhang, X., Li, Z., Zhu, S., Liang, W.: Detecting Spam and Promoting Campaigns in Twitter. ACM Trans. Web, 4:1–4:28 (2016)

  48. Zubiaga, A., Spina, D., Amigó, E., Gonzalo, J.: Towards Real-time Summarization of Scheduled Events from Twitter Streams. In: Proceedings of the 23rd ACM Conference on Hypertext and Social Media, pp. 319–320 (2012)

  49. Zhao, S., Zhong, L., Wickramasuriya, J., Vasudevan, V., LiKamWa, R., Rahmati, A.: SportSense: Real-Time Detection of NFL Game Events from Twitter. Technical Report TR0511-2012

  50. Zhao, S., Zhong, L., Wickramasuriya, J., Vasudevan, V.: Human as Real-Time Sensors of Social and Physical Events: A Case Study of Twitter and Sports Games. Technical Report TR0620-2011, Rice University and Motorola Labs

Acknowledgements

The work was supported in part by the National Science Foundation under Grant Nos. IIS-1213026, CNS-1126619, and CNS-1461926, Chinese National Natural Science Foundation under grant 91646116, Ministry of Education/China Mobile joint research grant under Project No.5-10, and Scientific and Technological Support Project (Society) of Jiangsu Province (No. BE2016776).

Author information

Corresponding author

Correspondence to Tao Li.

About this article

Cite this article

Huang, Y., Shen, C. & Li, T. Event summarization for sports games using twitter streams. World Wide Web 21, 609–627 (2018). https://doi.org/10.1007/s11280-017-0477-6

  • DOI: https://doi.org/10.1007/s11280-017-0477-6
