© 2018

Clinical Text Mining

Secondary Use of Electronic Patient Records

  • Provides a comprehensive overview of technical and ethical issues arising in clinical text mining

  • Presents both the general background and structure of patient records, as well as various natural language processing and machine learning methodologies

  • Written for graduate students in health informatics, computational linguistics, and information retrieval

Open Access

Table of contents

  1. Front Matter
    Pages i-xvii
  2. Hercules Dalianis
    Pages 1-4 Open Access
  3. Hercules Dalianis
    Pages 5-12 Open Access
  4. Hercules Dalianis
    Pages 21-34 Open Access
  5. Hercules Dalianis
    Pages 35-43 Open Access
  6. Hercules Dalianis
    Pages 45-53 Open Access
  7. Hercules Dalianis
    Pages 55-82 Open Access
  8. Hercules Dalianis
    Pages 83-96 Open Access
  9. Hercules Dalianis
    Pages 109-148 Open Access
  10. Hercules Dalianis
    Pages 149-152 Open Access
  11. Hercules Dalianis
    Pages 153-157 Open Access
  12. Back Matter
    Pages 159-181

About this book


This open access book describes the results of natural language processing and machine learning methods applied to clinical text from electronic patient records.

It is divided into twelve chapters. Chapters 1-4 discuss the history and background of the original paper-based patient records, their purpose, and how they are written and structured. These initial chapters do not require any technical or medical background knowledge. The remaining eight chapters are more technical in nature and describe various medical classifications and terminologies such as ICD diagnosis codes, SNOMED CT, MeSH, UMLS, and ATC. Chapters 5-10 cover basic tools for natural language processing and information retrieval, and how to apply them to clinical text. The differences between rule-based and machine learning-based methods, and between supervised and unsupervised machine learning methods, are also explained. Next, ethical concerns regarding the use of sensitive patient records for research purposes are discussed, including methods for de-identifying electronic patient records and storing them safely. The book’s closing chapters present a number of applications in clinical text mining and summarise the lessons learned from the previous chapters.

The book provides a comprehensive overview of technical issues arising in clinical text mining, and offers a valuable guide for advanced students in health informatics, computational linguistics, and information retrieval, and for researchers entering these fields.


Data Mining; Text Mining; Health Informatics; Health Care Information Systems; Medical Terminologies; Natural Language Processing; Text Analysis; Support Vector Machines; Open Access

Authors and affiliations

  1. DSV-Stockholm University, Kista, Sweden

About the authors

Hercules Dalianis holds a Master of Science in engineering (civilingenjör), with a speciality in electrical engineering, awarded in 1984 by the Royal Institute of Technology (KTH), Stockholm, Sweden, and received his PhD (Teknologie doktor) in 1996, also at KTH. He has been a professor in Computer and Systems Sciences at Stockholm University, Sweden, since 2011. Since 1988 he has carried out research in natural language processing and information retrieval, and during the last ten years he has applied these methods to text from electronic patient records, resulting in the research area of clinical text mining.
