Abstract
The growing complexity of the Twitter microblogging service, in terms of size, number of users, and variety of relationships among bloggers, has generated big data that requires innovative approaches to analyse, extract and detect non-obvious and popular events. In this paper, we aim to use big data analytics on Twitter to enable real-time event detection. These challenges present a major opportunity for Natural Language Processing (NLP) and Information Extraction (IE) technology to enable new large-scale data-analysis applications. Taking these difficulties into account, this paper proposes a new metric to improve search results in microblogs. The metric combines content relevance, tweet relevance and author relevance. The paper also develops an NLP method for extracting temporal information about events from posts, more specifically tweets. Our approach is based on a methodology of temporal marker classes and on a contextual exploration method. To evaluate our model, we built a knowledge management system and used a collection of 10,000 tweets about current events in 2014 and 2015.
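As a rough illustration of how three relevance signals might be combined into one ranking score, the sketch below uses a normalised weighted sum. The abstract does not state the combination rule, so the linear form, the equal default weights, the assumption that each score lies in [0, 1], and the names `TweetScores`, `combined_relevance` and `rank_tweets` are all hypothetical and not taken from the chapter.

```python
from dataclasses import dataclass


@dataclass
class TweetScores:
    """Per-tweet relevance scores, each assumed to be scaled to [0, 1]."""
    tweet_id: str
    content: float  # relevance of the text to the query
    tweet: float    # relevance from tweet-level features
    author: float   # relevance of the posting author


def combined_relevance(scores, weights=(1.0, 1.0, 1.0)):
    """Weighted average of the three scores (an assumed combination rule)."""
    w_content, w_tweet, w_author = weights
    total = w_content + w_tweet + w_author
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = (
        w_content * scores.content
        + w_tweet * scores.tweet
        + w_author * scores.author
    )
    return weighted / total


def rank_tweets(tweets, weights=(1.0, 1.0, 1.0)):
    """Return tweets sorted from most to least relevant."""
    return sorted(
        tweets,
        key=lambda s: combined_relevance(s, weights),
        reverse=True,
    )


if __name__ == "__main__":
    sample = [
        TweetScores("a", content=0.9, tweet=0.2, author=0.1),
        TweetScores("b", content=0.5, tweet=0.6, author=0.7),
    ]
    for s in rank_tweets(sample):
        print(s.tweet_id, round(combined_relevance(s), 3))
```

Dividing by the sum of the weights keeps the combined score in the same [0, 1] range as its inputs, so scores computed under different weightings remain comparable.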
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Cherichi, S., Faiz, R. (2016). Big Data Analysis for Event Detection in Microblogs. In: Król, D., Madeyski, L., Nguyen, N. (eds) Recent Developments in Intelligent Information and Database Systems. Studies in Computational Intelligence, vol 642. Springer, Cham. https://doi.org/10.1007/978-3-319-31277-4_27
DOI: https://doi.org/10.1007/978-3-319-31277-4_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31276-7
Online ISBN: 978-3-319-31277-4
eBook Packages: Engineering, Engineering (R0)