Automatic Extraction of Structured Information from Drug Descriptions

  • Radu Razvan Slavescu
  • Constantin Maşca
  • Kinga Cristina Slavescu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11308)


This paper describes a Conditional Random Field (CRF) based named entity extraction model for identifying relevant information in drug prescriptions. The model extracts five entity types: dosage, measuring unit, the person to whom the treatment is directed, frequency, and total duration of treatment. A corpus of 1800 sentences was compiled from drug prescription texts and annotated by two experts. Using the feature set we identified, the CRF model achieves F1-measure values of around 95% for unit, dosage and frequency detection.
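As a rough illustration of the sequence-labeling setup summarized above, the sketch below builds per-token feature dictionaries of the kind usually fed to a CRF toolkit, and collapses BIO tags into entity spans. Everything in it is an assumption made for illustration: the label names, the features, the example sentence and the hand-written tags are hypothetical and are not the feature set or data used in the paper.

```python
# Hypothetical label names for the five entity types listed in the abstract.
LABELS = ["WHO", "DOSAGE", "UNIT", "FREQ", "DURATION"]


def token_features(tokens, i):
    """Return an illustrative feature dict for tokens[i] (not the paper's features)."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_digit": tok.isdigit(),
        "has_digit": any(c.isdigit() for c in tok),
        "suffix2": tok[-2:].lower(),
        # Neighbouring words give the CRF some local context.
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def bio_to_spans(tokens, tags):
    """Collapse a BIO tag sequence into a list of (label, text) entity spans."""
    spans = []
    label, words = None, []
    for tok, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != label
        )
        if starts_new:
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = tag[2:], [tok]
        elif tag.startswith("I-"):
            words.append(tok)  # continue the current entity
        else:  # "O" closes any open entity
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = None, []
    if label is not None:
        spans.append((label, " ".join(words)))
    return spans


# Made-up sentence with hand-written tags (not model output).
tokens = "Adults : 2 tablets 3 times daily for 7 days".split()
tags = ["B-WHO", "O", "B-DOSAGE", "B-UNIT",
        "B-FREQ", "I-FREQ", "I-FREQ", "O", "B-DURATION", "I-DURATION"]

features = [token_features(tokens, i) for i in range(len(tokens))]
entities = bio_to_spans(tokens, tags)
```

In a real pipeline the `features` list and the tag sequence for each sentence would be passed to a CRF trainer, and `bio_to_spans` would be applied to the predicted tags to recover the structured fields.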


Conditional Random Field, Drug description, Named Entity Recognition



The work for this paper has been supported in part by the Computer Science Department of the Technical University of Cluj-Napoca, Romania.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Department of Computer Science, Technical University of Cluj-Napoca, Cluj-Napoca, Romania
  2. “Iuliu Hatieganu” University of Medicine and Pharmacy, Cluj-Napoca, Romania
