Event Categorization and Key Prospect Identification from Storylines

  • Manu Shukla
  • Andrew Fong
  • Raimundo Dos Santos
  • Chang-Tien Lu
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 741)


Event analysis and prospect identification in social media are challenging because of the sheer volume of information generated daily. While current research focuses on detecting events, there is no clear guidance on how those events should be processed so that they are meaningful to a human analyst, nor is there a clear way to detect prospects from social media. In this paper, we present DISTL, an event processing and prospect identification platform. It accepts as input a set of storylines (sequences of entities and their relationships) and processes them as follows: (1) it uses different algorithms (LDA, SVM, information gain, rule sets) to identify themes from storylines; (2) it identifies the top locations and times in storylines and combines them with themes to generate events that are meaningful in a specific scenario for categorizing storylines; and (3) it extracts top prospects, as people and organizations, from the data elements contained in storylines. The output comprises sets of events in different categories, the storylines under each event, and the top prospects identified. DISTL uses in-memory distributed processing that scales to high data volumes and categorizes generated storylines in near real-time.
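The three processing steps can be illustrated with a minimal single-machine sketch. It is not DISTL's implementation: the storyline format (a list of typed entity pairs), the `THEME_RULES` rule set, and every function name are assumptions made for illustration. Only the rule-set option for theme identification is shown, and simple frequency counts stand in for whatever ranking the platform actually uses.

```python
from collections import Counter

# Hypothetical storyline format: a list of (entity_type, value) pairs.
STORYLINES = [
    [("person", "alice"), ("org", "acme"), ("location", "boston"),
     ("time", "2016-03"), ("term", "merger")],
    [("person", "bob"), ("org", "acme"), ("location", "boston"),
     ("time", "2016-03"), ("term", "acquisition")],
    [("person", "alice"), ("org", "initech"), ("location", "austin"),
     ("time", "2016-04"), ("term", "protest")],
]

# Hypothetical rule set: each theme is triggered by a set of terms.
THEME_RULES = {
    "business": {"merger", "acquisition"},
    "unrest": {"protest"},
}


def theme_of(storyline):
    """Step 1 (rule-set variant): return the first theme whose trigger terms appear."""
    terms = {value for kind, value in storyline if kind == "term"}
    for theme, triggers in THEME_RULES.items():
        if terms & triggers:
            return theme
    return None


def top_values(storylines, entity_type, k=1):
    """Return the k most frequent values of one entity type across all storylines."""
    counts = Counter(
        value for s in storylines for kind, value in s if kind == entity_type
    )
    return [value for value, _ in counts.most_common(k)]


def categorize(storylines):
    """Step 2: combine theme, top location and top time into event keys."""
    top_locations = set(top_values(storylines, "location"))
    top_times = set(top_values(storylines, "time"))
    events = {}
    for s in storylines:
        theme = theme_of(s)
        locations = {v for kind, v in s if kind == "location"} & top_locations
        times = {v for kind, v in s if kind == "time"} & top_times
        # A storyline is filed under an event only if all three parts match.
        if theme and locations and times:
            key = (theme, sorted(locations)[0], sorted(times)[0])
            events.setdefault(key, []).append(s)
    return events


def top_prospects(storylines, k=2):
    """Step 3: rank people and organizations by how often they are mentioned."""
    counts = Counter(
        value for s in storylines for kind, value in s if kind in ("person", "org")
    )
    return [value for value, _ in counts.most_common(k)]
```

Calling `categorize(STORYLINES)` on the toy data files the first two storylines under a single event keyed by theme, location and time, and `top_prospects(STORYLINES)` returns the two most frequently mentioned people or organizations. A distributed version would replace the in-memory counters with aggregations over partitioned data.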



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Manu Shukla (1, 2, 3), corresponding author
  • Andrew Fong (1, 2, 3)
  • Raimundo Dos Santos (1, 2, 3)
  • Chang-Tien Lu (1, 2, 3)

  1. Omniscience Corporation, Palo Alto, USA
  2. US Army Corps of Engineers ERDC GRL, Alexandria, USA
  3. Computer Science Department, Virginia Tech, Falls Church, USA