Using Twitter Data and Sentiment Analysis to Study Diseases Dynamics

  • Vincenza Carchiolo
  • Alessandro Longheu
  • Michele Malgeri
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9267)


Twitter has recently been used to predict and monitor real-world outcomes, including health-related topics. In this work, we extract information about diseases from Twitter under spatio-temporal constraints, i.e., considering a specific geographic area during a given period. We use the SNOMED-CT terminology to detect medical terms reliably, and apply sentiment analysis to assess how people perceive each disease. We present first results for a monitoring tool that allows the dynamics of diseases to be studied.
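The pipeline outlined in the abstract (spatio-temporal filtering, medical term detection, sentiment scoring) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Tweet` record, the `DISEASE_TERMS` dictionary (a tiny hypothetical stand-in for a SNOMED-CT lookup), the toy sentiment lexicon, the bounding box, and the sample tweets are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime
import re

# Hypothetical stand-in for a SNOMED-CT lookup: surface form -> preferred term.
DISEASE_TERMS = {
    "flu": "influenza",
    "influenza": "influenza",
    "asthma": "asthma",
    "measles": "measles",
}

# Toy sentiment lexicon (an assumption, not the resource used in the paper).
POSITIVE = {"better", "recovered", "fine", "relieved"}
NEGATIVE = {"awful", "worse", "terrible", "scared", "sick"}


@dataclass
class Tweet:
    text: str
    lat: float
    lon: float
    created_at: datetime


def in_scope(tweet, bbox, start, end):
    """Spatio-temporal constraint: inside the bounding box and the time window."""
    south, west, north, east = bbox
    return (south <= tweet.lat <= north
            and west <= tweet.lon <= east
            and start <= tweet.created_at <= end)


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment(tokens):
    """Lexicon score: number of positive hits minus number of negative hits."""
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def disease_sentiment(tweets, bbox, start, end):
    """Return {disease: (mention_count, mean_sentiment)} over in-scope tweets."""
    totals = {}
    for tweet in tweets:
        if not in_scope(tweet, bbox, start, end):
            continue
        tokens = tokenize(tweet.text)
        score = sentiment(tokens)
        # Count each disease at most once per tweet.
        for disease in {DISEASE_TERMS[t] for t in tokens if t in DISEASE_TERMS}:
            count, total = totals.get(disease, (0, 0))
            totals[disease] = (count + 1, total + score)
    return {d: (c, s / c) for d, (c, s) in totals.items()}


# Illustrative data: approximate bounding box (south, west, north, east) and period.
BBOX = (36.6, 12.3, 38.4, 15.7)
START, END = datetime(2015, 1, 1), datetime(2015, 1, 31)
SAMPLE = [
    Tweet("The flu is awful, I feel terrible", 37.50, 15.09, datetime(2015, 1, 10)),
    Tweet("Finally recovered from influenza, feeling better", 38.12, 13.36, datetime(2015, 1, 12)),
    Tweet("Asthma is getting worse", 41.90, 12.50, datetime(2015, 1, 15)),    # outside the area
    Tweet("Measles outbreak, I am scared", 37.50, 15.09, datetime(2014, 6, 1)),  # outside the period
]

result = disease_sentiment(SAMPLE, BBOX, START, END)
print(result)
```

In a real system the dictionary lookup would be replaced by a query against SNOMED-CT and the lexicon by a proper sentiment analysis method; the sketch only shows how the three steps compose into a per-disease summary for one area and period.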


Health Information Systems (HIS); Twitter; Natural Language Processing (NLP); SNOMED-CT; Sentiment analysis


Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Vincenza Carchiolo (1)
  • Alessandro Longheu (1, corresponding author)
  • Michele Malgeri (1)
  1. Dip. Ingegneria Elettrica, Elettronica e Informatica, Università degli Studi di Catania, Catania, Italy
