Abstract
In recent years, substantial research effort has gone into approaches for detecting events in real time from the Twitter data stream. Most of these approaches, however, suffer from a high computational cost and are not evaluated on a publicly available corpus, which makes it difficult to compare them properly. In this paper, we propose a scalable event detection system, TwitterNews+, to detect and track newsworthy events in real time. TwitterNews+ uses a novel approach to cluster event-related tweets at a significantly lower computational cost than existing state-of-the-art approaches. Finally, we evaluate the effectiveness of TwitterNews+ using a publicly available corpus and its associated ground truth data set of newsworthy events. The evaluation shows a significant improvement in recall and precision over the baselines we used.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Hasan, M., Orgun, M.A., Schwitter, R. (2016). TwitterNews+: A Framework for Real Time Event Detection from the Twitter Data Stream. In: Spiro, E., Ahn, YY. (eds) Social Informatics. SocInfo 2016. Lecture Notes in Computer Science, vol 10046. Springer, Cham. https://doi.org/10.1007/978-3-319-47880-7_14
DOI: https://doi.org/10.1007/978-3-319-47880-7_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47879-1
Online ISBN: 978-3-319-47880-7
eBook Packages: Computer Science (R0)