Building medical dictionaries for patient encoding systems: A methodology

  • C. Lovis
  • R. Baud
  • P. A. Michel
  • J. R. Scherrer
  • A. M. Rassinoux
Natural Language and Terminology
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1211)


One of the most critical problems in automatic natural language processing (NLP) is the size of the medical dictionaries. Because medical language contains a very large set of compound words and allows new terms to be coined freely, an exhaustive medical dictionary is out of the question. Such dictionaries usually consist of two parts. The first generally contains the morphological, and sometimes syntactic, information needed to identify a given word in a sentence at the grapheme level; the second is often devoted to the conceptual knowledge associated with the recognised word. Only when both prerequisites are fulfilled can an attempt be made to understand the meaning of a whole expression. This paper presents the pragmatic method used to implement a powerful analyser that helps physicians and coding clerks encode medico-economic information about patients using international classifications such as ICD. It describes how to build medical dictionaries that support the morphological and conceptual analysers (encoders). The methods have proved efficient for various classifications and for multiple languages: the system presently supports French, German, English and Dutch for the full ICD-10 classification.
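The two-part dictionary structure described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the `Entry` type, the toy entries, the concept labels, and the longest-prefix segmentation with backtracking are all assumptions chosen for illustration. Each entry pairs a written form and morphological role (first part) with conceptual labels (second part), and a compound word is understood only if it can be fully segmented into known entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical two-part dictionary entry (illustrative only)."""
    surface: str     # part 1: written form, matched at the grapheme level
    category: str    # part 1: morphological role, e.g. "root" or "suffix"
    concepts: tuple  # part 2: conceptual labels attached to the form


# Toy dictionary; entries and concept labels are assumptions, not source data.
DICTIONARY = {
    e.surface: e
    for e in (
        Entry("gastr", "root", ("stomach",)),
        Entry("enter", "root", ("intestine",)),
        Entry("o", "linker", ()),
        Entry("itis", "suffix", ("inflammation",)),
    )
}


def segment(word, dictionary=DICTIONARY):
    """Split a compound word into known entries, or return None.

    Tries the longest matching prefix first and backtracks when the
    remainder of the word cannot be segmented.
    """
    word = word.lower()
    if not word:
        return []
    for end in range(len(word), 0, -1):
        entry = dictionary.get(word[:end])
        if entry is None:
            continue
        rest = segment(word[end:], dictionary)
        if rest is not None:
            return [entry] + rest
    return None


def concepts_of(word, dictionary=DICTIONARY):
    """Collect the conceptual labels of every segment, in word order."""
    parts = segment(word, dictionary)
    if parts is None:
        return None
    return [label for part in parts for label in part.concepts]
```

With this toy dictionary, `concepts_of("gastroenteritis")` yields the labels for stomach, intestine and inflammation, while a word containing an unknown segment returns `None`. This shows why a dictionary of word parts can cover many compounds that a list of whole words would miss.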


Keywords (added by machine, not by the authors): Term Source; Natural Language Processing; Compound Word; Word Segmentation; Medical Dictionary





Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • C. Lovis (1)
  • R. Baud (2)
  • P. A. Michel (2)
  • J. R. Scherrer (2)
  • A. M. Rassinoux (3)

  1. Department of Internal Medicine, University State Hospital of Geneva, Switzerland
  2. Division of Medical Informatics, University State Hospital of Geneva, Switzerland
  3. Division of Biomedical Informatics, Vanderbilt University, Nashville, USA
