Abbreviation Identification in Clinical Notes with Level-wise Feature Engineering and Supervised Learning

  • Conference paper
Knowledge Management and Acquisition for Intelligent Systems (PKAW 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9806)

Abstract

Electronic medical records are increasingly popular and significant in medical, biomedical, and healthcare research. This growth creates a rising need to share the records and let outside parties use them. However, explicit noise in the shared records can hinder users trying to understand and consume them. One kind of explicit noise that strongly affects readability is the set of abbreviations written in free text to save writing time and simplify the records. Automatically identifying abbreviations and replacing them with their correct long forms is therefore necessary to enhance the records' readability and, in turn, their sharability. In this paper, we concentrate on abbreviation identification to lay the foundations for de-noising clinical text through abbreviation resolution. Our proposed solution is general, practical, and simple yet effective, combining level-wise feature engineering with a supervised learning mechanism. Level-wise feature engineering characterizes each token, whether an abbreviation or a non-abbreviation, at the token, sentence, and note levels to form a comprehensive vector representation in a vector space. Many supervised learning methods can then be used to build an abbreviation identifier from these vectors, and the resulting identifier can be applied to automatic abbreviation identification in the clinical text of electronic medical records. Experimental results on various real clinical note types confirm the effectiveness of our solution, with high accuracy, precision, recall, and F-measure for abbreviation identification.
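The level-wise idea in the abstract can be illustrated with a minimal sketch. The abstract does not list the concrete features, so every feature below (token length, digit and uppercase flags, sentence position, in-note frequency) and every function name is a hypothetical stand-in chosen for illustration, not the authors' feature set. The sketch only shows how token-, sentence-, and note-level descriptors might be concatenated into one vector per token.

```python
from collections import Counter
from typing import List

VOWELS = set("aeiouAEIOU")


def token_features(token: str) -> List[float]:
    """Hypothetical token-level features: surface shape of the token itself."""
    return [
        float(len(token)),                       # token length
        float(any(c.isdigit() for c in token)),  # contains a digit
        float(token.isupper()),                  # all cased characters are uppercase
        float("." in token),                     # contains a period
        float(any(c in VOWELS for c in token)),  # contains a vowel
    ]


def sentence_features(sentence: List[str], index: int) -> List[float]:
    """Hypothetical sentence-level features: sentence length and relative position."""
    return [float(len(sentence)), index / max(len(sentence) - 1, 1)]


def note_features(counts: Counter, token: str, note_size: int) -> List[float]:
    """Hypothetical note-level feature: relative frequency of the token in the note."""
    return [counts[token.lower()] / note_size]


def vectorize_note(note: List[List[str]]) -> List[List[float]]:
    """Concatenate the three feature levels into one vector per token.

    `note` is a list of sentences, each a list of tokens.
    """
    counts = Counter(tok.lower() for sent in note for tok in sent)
    note_size = sum(counts.values())
    vectors = []
    for sent in note:
        for i, tok in enumerate(sent):
            vectors.append(
                token_features(tok)
                + sentence_features(sent, i)
                + note_features(counts, tok, note_size)
            )
    return vectors
```

Given labeled tokens, the resulting vectors could be passed to any off-the-shelf supervised classifier, which is the "open options" point the abstract makes; the choice of learner is left open here as well.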

Acknowledgments

This work is funded by Vietnam National University at Ho Chi Minh City under grant number B2016-42-01. We thank the John von Neumann Institute, Vietnam National University at Ho Chi Minh City, for providing a powerful server on which to carry out the experiments. This work was completed while the authors were working at the Vietnam Institute for Advanced Study in Mathematics, Vietnam. Our thanks also go to Dr. Nguyen Thi Minh Huyen and her team at University of Science, Vietnam National University, Hanoi, Vietnam, for the external resources used in the experiments, and to the administrative board at VanDon Hospital for their real clinical data and support.

Author information

Corresponding author

Correspondence to Thi Ngoc Chau Vo.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Vo, T.N.C., Cao, T.H., Ho, T.B. (2016). Abbreviation Identification in Clinical Notes with Level-wise Feature Engineering and Supervised Learning. In: Ohwada, H., Yoshida, K. (eds) Knowledge Management and Acquisition for Intelligent Systems. PKAW 2016. Lecture Notes in Computer Science, vol 9806. Springer, Cham. https://doi.org/10.1007/978-3-319-42706-5_1

  • DOI: https://doi.org/10.1007/978-3-319-42706-5_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42705-8

  • Online ISBN: 978-3-319-42706-5

  • eBook Packages: Computer Science, Computer Science (R0)