Big Data Analysis for Event Detection in Microblogs

Part of the book series: Studies in Computational Intelligence (SCI, volume 642)

Abstract

The growing complexity of the Twitter micro-blogging service, in terms of size, number of users, and variety of relationships between bloggers, has generated big data that requires innovative approaches to analyse, extract and detect non-obvious and popular events. In this setting, we aim in this paper to use big data analytics within Twitter to allow real-time event detection. These challenges present a big opportunity for Natural Language Processing (NLP) and Information Extraction (IE) technology to enable new large-scale data-analysis applications. Taking all these difficulties into account, this paper proposes a new metric to improve search results in microblogs. It combines content relevance, tweet relevance and author relevance, and develops a Natural Language Processing method for extracting temporal information about events from posts, more specifically tweets. Our approach is based on a methodology of temporal marker classes and on a contextual exploration method. To evaluate our model, we built a knowledge management system. We used a collection of 10 thousand tweets about current events in 2014 and 2015.
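
The abstract names three relevance signals but does not give the formula that combines them. As a minimal sketch only, assuming each signal is already normalized to [0, 1] and that a weighted average is an acceptable stand-in for the chapter's metric, the combination could look like the following. The class `RelevanceScores`, the functions `combined_relevance` and `rank`, and the default equal weights are all hypothetical names and choices introduced for illustration; they are not taken from the chapter.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class RelevanceScores:
    """Three per-tweet scores, each assumed to be normalized to [0, 1]."""
    content: float  # match between the tweet text and the query (assumption)
    tweet: float    # tweet-level signal, e.g. how widely it was shared (assumption)
    author: float   # author-level signal, e.g. the author's influence (assumption)


def combined_relevance(
    scores: RelevanceScores,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> float:
    """Weighted average of the three scores; NOT the chapter's actual metric."""
    w_content, w_tweet, w_author = weights
    total = w_content + w_tweet + w_author
    if min(weights) < 0 or total == 0:
        raise ValueError("weights must be non-negative and not all zero")
    weighted = (
        w_content * scores.content
        + w_tweet * scores.tweet
        + w_author * scores.author
    )
    return weighted / total


def rank(
    tweets: Iterable[Tuple[str, RelevanceScores]],
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> List[str]:
    """Return tweet identifiers ordered from most to least relevant."""
    ordered = sorted(
        tweets,
        key=lambda item: combined_relevance(item[1], weights),
        reverse=True,
    )
    return [tweet_id for tweet_id, _ in ordered]
```

Dividing by the sum of the weights keeps the result in [0, 1] whenever the inputs are, so scores stay comparable when the weights are tuned. How the three input scores themselves are computed is left open here, since the abstract does not specify it.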

Corresponding author

Correspondence to Soumaya Cherichi.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Cherichi, S., Faiz, R. (2016). Big Data Analysis for Event Detection in Microblogs. In: Król, D., Madeyski, L., Nguyen, N. (eds) Recent Developments in Intelligent Information and Database Systems. Studies in Computational Intelligence, vol 642. Springer, Cham. https://doi.org/10.1007/978-3-319-31277-4_27

  • DOI: https://doi.org/10.1007/978-3-319-31277-4_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-31276-7

  • Online ISBN: 978-3-319-31277-4

  • eBook Packages: Engineering, Engineering (R0)
