Applications of Clinical Text Mining

  • Hercules Dalianis
Open Access


This chapter presents various applications of clinical text mining, all of which use electronic patient record text as input data.



Copyright information

© The Author(s) 2018

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  • Hercules Dalianis (1)
  1. DSV, Stockholm University, Kista, Sweden
