Concept Drift Adaptive Physical Event Detection for Social Media Streams
Event detection has long been the domain of physical sensors operating under a static dataset assumption. The prevalence of social media and web access has led to the emergence of social, or human, sensors who report on events globally. This warrants the development of event detectors that can take advantage of the dense data, with high spatial and temporal resolution, provided by more than 3 billion social users. The phenomenon of concept drift, which causes terms and signals associated with a topic to change over time, renders static machine learning ineffective. To this end, we present an application for physical event detection on social sensors that improves traditional physical event detection with concept drift adaptation. Our approach continuously updates its machine learning classifiers automatically, without the need for human intervention. It integrates data from heterogeneous sources and is designed to handle weak-signal events (landslides, wildfires) with around ten posts per event in addition to large-signal events (hurricanes, earthquakes) with hundreds of thousands of posts per event. We demonstrate a landslide detector on our application that detects almost 350% more landslides compared to static approaches. Our application also has high performance: using classifiers trained in 2014, it achieves an event detection accuracy of 0.988, compared to 0.762 for static approaches.
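To make the contrast between static and drift-adaptive classification concrete, the following is a minimal sketch and not the system described above. It assumes a hypothetical `WindowedNaiveBayes` class that keeps only the most recent labeled posts in a sliding window, and it assumes that labels arrive automatically from some trusted source; the toy posts, the window size, and the choice of naive Bayes are all illustrative assumptions.

```python
import math
from collections import Counter, deque


class WindowedNaiveBayes:
    """Hypothetical sketch: naive Bayes over a sliding window of recent posts."""

    def __init__(self, window_size=4):
        # Old posts fall out of the window, so outdated term usage is forgotten.
        self.window = deque(maxlen=window_size)

    def update(self, text, is_event):
        """Add one automatically labeled post (the label source is assumed)."""
        self.window.append((text.lower().split(), is_event))

    def predict(self, text):
        """Return True if the post looks like a physical event report."""
        word_counts = {True: Counter(), False: Counter()}
        class_counts = Counter()
        vocab = set()
        for tokens, label in self.window:
            class_counts[label] += 1
            word_counts[label].update(tokens)
            vocab.update(tokens)

        total = sum(class_counts.values())
        scores = {}
        for label in (True, False):
            # Laplace-smoothed class prior.
            score = math.log((class_counts[label] + 1) / (total + 2))
            # Extra +1 reserves probability mass for unseen words.
            denom = sum(word_counts[label].values()) + len(vocab) + 1
            for tok in text.lower().split():
                score += math.log((word_counts[label][tok] + 1) / denom)
            scores[label] = score
        return scores[True] > scores[False]


# Toy stream: early on, "landslide" signals a physical event.
EARLY = [
    ("landslide buries road", True),
    ("landslide hits village", True),
    ("nice weather today", False),
    ("football game tonight", False),
]
# Later, "landslide" drifts toward election talk and "mudslide" marks events.
LATE = [
    ("election landslide victory", False),
    ("landslide win for senator", False),
    ("mudslide buries road", True),
    ("mudslide hits village", True),
]

static_model = WindowedNaiveBayes()
adaptive_model = WindowedNaiveBayes()
for text, label in EARLY:
    static_model.update(text, label)    # trained once, never updated
    adaptive_model.update(text, label)
for text, label in LATE:
    adaptive_model.update(text, label)  # keeps following the stream

for post in ("landslide victory speech", "mudslide reported today"):
    print(post, static_model.predict(post), adaptive_model.predict(post))
```

In this toy setting the model frozen on early data keeps treating "landslide" as an event term, while the windowed model follows the changed usage. A real deployment would need a far larger window and a principled way to obtain labels.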
Keywords: Concept drift, Machine learning event detection, Disaster detection
This research has been partially funded by the National Science Foundation through CISE's SAVI/RCN (1402266, 1550379), CNS (1421561), CRISP (1541074), and SaTC (1564097) programs, an REU supplement (1545173), and gifts, grants, or contracts from Fujitsu, HP, Intel, and Georgia Tech Foundation through the John P. Imlay, Jr. Chair endowment. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or other funding agencies and companies mentioned above.