Abstract
Event detection is crucial to assuring public safety around real-world events. Decision makers draw on a range of terrestrial and online sources to develop policies and react appropriately to events as they unfold. One such online source is social media. Twitter is a popular micro-blogging web application serving hundreds of millions of users, and its user-generated content is a rich source of information for identifying real-world events. In this paper, we present a novel framework for detecting such events from Twitter data, with a focus on ‘disruptive’ events. The approach consists of five steps: data collection, pre-processing, classification, clustering and summarization. We use a Naïve Bayes classification model and an online clustering method, and validate our model over multiple real-world data sets. To the best of our knowledge, this study is the first effort to identify real-world events in Arabic from social media.
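The classification and clustering steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the add-one smoothing, the cosine-similarity measure, the threshold value of 0.3, and all function and class names below are assumptions chosen for illustration, and English toy text stands in for Arabic tweets.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical pre-processing step: lowercase and split on whitespace.
    # A real Arabic pipeline would also normalize and stem tokens.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for w in tokenize(doc):
                count = self.word_counts[label][w]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def online_cluster(docs, threshold=0.3):
    """Single-pass clustering: each document joins the most similar
    existing cluster, or starts a new one if none is similar enough."""
    centroids, assignments = [], []
    for doc in docs:
        vec = Counter(tokenize(doc))
        sims = [cosine(vec, c) for c in centroids]
        if sims and max(sims) >= threshold:
            idx = sims.index(max(sims))
            centroids[idx].update(vec)  # accumulate term counts into the centroid
        else:
            idx = len(centroids)
            centroids.append(Counter(vec))
        assignments.append(idx)
    return assignments
```

In such a sketch, messages the classifier labels as event-related would be passed to `online_cluster`, and each resulting cluster would then be summarized, for example by its most frequent terms.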
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Alsaedi, N., Burnap, P. (2015). Arabic Event Detection in Social Media. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2015. Lecture Notes in Computer Science, vol 9041. Springer, Cham. https://doi.org/10.1007/978-3-319-18111-0_29
DOI: https://doi.org/10.1007/978-3-319-18111-0_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18110-3
Online ISBN: 978-3-319-18111-0
eBook Packages: Computer Science, Computer Science (R0)