
Concept Drift Adaptive Physical Event Detection for Social Media Streams

  • Abhijit Suprem
  • Aibek Musaev
  • Calton Pu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11517)

Abstract

Event detection has long been the domain of physical sensors operating under a static-dataset assumption. The prevalence of social media and web access has led to the emergence of social, or human, sensors who report on events globally. This warrants the development of event detectors that can take advantage of the dense, high spatial and temporal resolution data provided by more than 3 billion social media users. The phenomenon of concept drift, which causes the terms and signals associated with a topic to change over time, renders static machine learning ineffective. To this end, we present an application for physical event detection on social sensors that improves traditional physical event detection with concept drift adaptation. Our approach continuously updates its machine learning classifiers automatically, without the need for human intervention. It integrates data from heterogeneous sources and is designed to handle weak-signal events (landslides, wildfires) with around ten posts per event, in addition to large-signal events (hurricanes, earthquakes) with hundreds of thousands of posts per event. We demonstrate a landslide detector built on our application that detects almost 350% more landslides than static approaches. Our application maintains high performance: using classifiers trained in 2014, it achieves an event detection accuracy of 0.988, compared to 0.762 for static approaches.
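
The core idea above, continuously retraining classifiers as the vocabulary around a topic drifts, can be illustrated with a minimal sketch. The Python snippet below is an assumption for illustration only, not the paper's actual implementation: the window size, accuracy threshold, scikit-learn components, and the process_batch helper are all hypothetical. It retrains an incremental text classifier on a sliding window of recent, weakly labeled posts whenever accuracy on the newest batch drops below a threshold, which is one common way to adapt to concept drift without human intervention.

    # Minimal sketch of a drift-adaptive text classifier update loop.
    # Illustrative only; window size, threshold, and features are assumptions.
    from collections import deque

    import numpy as np
    from sklearn.feature_extraction.text import HashingVectorizer
    from sklearn.linear_model import SGDClassifier

    vectorizer = HashingVectorizer(n_features=2**18, alternate_sign=False)
    clf = SGDClassifier()                      # incremental linear classifier
    window = deque(maxlen=1000)                # recent (text, label) pairs
    DRIFT_THRESHOLD = 0.85                     # assumed accuracy floor

    def process_batch(texts, weak_labels, first_batch=False):
        """Score a batch of posts, update the model on weakly labeled
        examples, and refit from the window if accuracy drops."""
        X = vectorizer.transform(texts)
        y = np.asarray(weak_labels)
        if first_batch:
            clf.partial_fit(X, y, classes=np.array([0, 1]))
            window.extend(zip(texts, weak_labels))
            return
        acc = clf.score(X, y)                  # accuracy against weak labels
        window.extend(zip(texts, weak_labels))
        if acc < DRIFT_THRESHOLD:
            # Concept drift suspected: refit on the recent window only,
            # so outdated terms stop dominating the decision boundary.
            wX = vectorizer.transform([t for t, _ in window])
            wy = np.array([l for _, l in window])
            clf.fit(wX, wy)
        else:
            clf.partial_fit(X, y)              # routine incremental update

In a setting like the one described above, the weak labels might come from the heterogeneous sources the application integrates; the threshold and window length are tuning knobs rather than fixed values.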

Keywords

Concept drift · Machine learning event detection · Disaster detection

Notes

Acknowledgement

This research has been partially funded by the National Science Foundation through CISE's SAVI/RCN (1402266, 1550379), CNS (1421561), CRISP (1541074), and SaTC (1564097) programs, an REU supplement (1545173), and gifts, grants, or contracts from Fujitsu, HP, Intel, and the Georgia Tech Foundation through the John P. Imlay, Jr. Chair endowment. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or the other funding agencies and companies mentioned above.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Georgia Institute of Technology, Atlanta, USA
  2. University of Alabama, Tuscaloosa, USA
