A Public Health Surveillance Platform Exploiting Free-Text Sources via Natural Language Processing and Linked Data: Application in Adverse Drug Reaction Signal Detection Using PubMed and Twitter

  • Pantelis Natsiavas
  • Nicos Maglaveras
  • Vassilis Koutkias
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10096)


This paper presents a platform that enables the systematic exploitation of diverse free-text data sources for public health surveillance applications. The platform relies on Natural Language Processing (NLP) and a micro-services architecture, and uses Linked Data as its data representation formalism. To perform NLP in an extensible and modular fashion, the platform employs the Apache Unstructured Information Management Architecture (UIMA) and semantically annotates the results through a newly developed UIMA Semantic Common Analysis Structure Consumer (SCC). The SCC output is a graph represented in the Resource Description Framework (RDF), based on the W3C Web Annotation Data Model (WADM) and SNOMED-CT. We also demonstrate the platform through an exemplar application scenario: the detection of adverse drug reaction (ADR) signals using data retrieved from PubMed and Twitter.
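
To make the kind of output described here more concrete, the following is a minimal sketch of a single annotation that links a text span to a SNOMED CT concept, written in the JSON-LD serialization of the W3C Web Annotation Data Model. It is not the SCC implementation, and the actual graph structure the platform emits may differ. The function name `make_annotation`, the example sentence, the `urn:example:` document identifier, and the choice of the `identifying` motivation are all assumptions made for illustration.

```python
import json


def make_annotation(source_uri, text, mention, concept_uri):
    """Build a Web Annotation (JSON-LD) linking a text span to a concept URI.

    Hypothetical helper for illustration; not part of the platform described above.
    """
    start = text.find(mention)
    if start < 0:
        raise ValueError(f"mention {mention!r} not found in text")
    end = start + len(mention)  # TextPositionSelector: end is exclusive
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "identifying",
        # The body is the IRI of the concept the span was mapped to.
        "body": concept_uri,
        "target": {
            "source": source_uri,
            "selector": {
                "type": "TextPositionSelector",
                "start": start,
                "end": end,
            },
        },
    }


# Made-up sentence standing in for a tweet or abstract fragment.
example_text = "Started drug X last week and now I have a headache."

# SNOMED CT URIs follow the pattern http://snomed.info/id/<conceptId>;
# 25064002 (Headache) is used here purely as an illustrative concept.
annotation = make_annotation(
    source_uri="urn:example:document-1",  # hypothetical document identifier
    text=example_text,
    mention="headache",
    concept_uri="http://snomed.info/id/25064002",
)

print(json.dumps(annotation, indent=2))
```

Because the annotation is plain JSON-LD, it could in principle be loaded into any RDF store and queried alongside other Linked Data, which is the property the abstract relies on.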


Keywords: Public health surveillance; Micro-services; Semantic Web; Linked Data; Natural Language Processing; Adverse drug reactions



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Pantelis Natsiavas (1, 2)
  • Nicos Maglaveras (1, 2)
  • Vassilis Koutkias (1, 2)
  1. Lab of Computing and Medical Informatics, Department of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece
  2. Institute of Applied Biosciences, Centre for Research and Technology Hellas, Thermi, Thessaloniki, Greece
