A Survey of Social Web Mining Applications for Disease Outbreak Detection

  • Gema Bello-Orgaz
  • Julio Hernandez-Castro
  • David Camacho
Conference paper
Part of the Studies in Computational Intelligence book series (SCI, volume 570)


Social web media is one of the most important sources of big data for extracting and acquiring new knowledge. Social networks have become an important environment in which users provide information about their preferences and relationships. This information can be used to measure the influence of ideas and public opinion in real time, which is very useful in several fields and research areas such as marketing campaigns, financial prediction, and public healthcare, among others. Recently, research on artificial intelligence techniques for monitoring web data sources to detect public health events has emerged as a new discipline called Epidemic Intelligence. Epidemic Intelligence systems are now widely used by public health organizations as monitoring mechanisms for the early detection of disease outbreaks, in order to reduce the impact of epidemics. This paper presents a survey of recent data mining applications and web systems that use web data for public healthcare. It pays special attention to machine learning and data mining techniques and how they have been applied to these web data to extract collective knowledge from Twitter.


Keywords: Severe Acute Respiratory Syndrome; Disease Outbreak; Public Healthcare; Name Entity Recognition



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Gema Bello-Orgaz (1, corresponding author)
  • Julio Hernandez-Castro (2)
  • David Camacho (1)

  1. Escuela Politecnica Superior, Universidad Autonoma de Madrid, Madrid, Spain
  2. School of Computing, University of Kent, Canterbury, UK
