Abbreviation Identification in Clinical Notes with Level-wise Feature Engineering and Supervised Learning

Vo, Thi Ngoc Chau; Cao, Tru Hoang; Ho, Tu Bao

doi:10.1007/978-3-319-42706-5_1

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9806)

Included in the following conference series: Pacific Rim Knowledge Acquisition Workshop

Abstract

Electronic medical records are increasingly popular and important in medical, biomedical, and healthcare research. This creates a growing need to share them and use them externally. However, explicit noise in shared records can hinder users' efforts to understand and consume them. One kind of explicit noise with a strong impact on readability is the abbreviations written in free text to save writing time and simplify records. Automatically identifying abbreviations and replacing them with their correct long forms is therefore necessary to improve the readability, and in turn the shareability, of the records. This paper concentrates on abbreviation identification, laying the foundation for de-noising clinical text through abbreviation resolution. Our proposed solution is general, practical, and simple but effective, combining level-wise feature engineering with a supervised learning mechanism. Level-wise feature engineering characterizes each token, whether an abbreviation or a non-abbreviation, at the token, sentence, and note levels, yielding a comprehensive vector representation in a vector space. Any of many supervised learning methods can then be used to build an abbreviation identifier, which can be applied for automatic abbreviation identification in the clinical text of electronic medical records. Experimental results on various real clinical note types confirm the effectiveness of our solution, with high accuracy, precision, recall, and F-measure for abbreviation identification.
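
The idea of describing each token at three levels and then handing the vectors to a supervised learner can be illustrated with a minimal sketch. The abstract does not list the actual features or the learner, so everything below is an assumption made for illustration: the individual cues (token length, capitalization, neighbor lengths, in-note frequency), the toy English note, the labels, and the perceptron, which merely stands in for whichever supervised method is chosen.

```python
from typing import List

Sentence = List[str]
Note = List[Sentence]  # a note is a list of tokenized sentences

def token_features(tok: str) -> List[float]:
    """Token-level cues (hypothetical): length, all-caps, period, digit."""
    return [
        min(len(tok), 10) / 10.0,
        1.0 if tok.isupper() else 0.0,
        1.0 if "." in tok else 0.0,
        1.0 if any(c.isdigit() for c in tok) else 0.0,
    ]

def sentence_features(sentence: Sentence, i: int) -> List[float]:
    """Sentence-level cues (hypothetical): position and neighbor lengths."""
    prev_len = len(sentence[i - 1]) if i > 0 else 0
    next_len = len(sentence[i + 1]) if i + 1 < len(sentence) else 0
    return [
        i / max(len(sentence) - 1, 1),
        min(prev_len, 10) / 10.0,
        min(next_len, 10) / 10.0,
    ]

def note_features(note: Note, tok: str) -> List[float]:
    """Note-level cue (hypothetical): relative frequency within the note."""
    all_tokens = [t.lower() for s in note for t in s]
    return [all_tokens.count(tok.lower()) / len(all_tokens)]

def vectorize(note: Note, s_idx: int, t_idx: int) -> List[float]:
    """Concatenate the three levels into one vector for a single token."""
    sentence = note[s_idx]
    tok = sentence[t_idx]
    return (token_features(tok)
            + sentence_features(sentence, t_idx)
            + note_features(note, tok))

def train_perceptron(X: List[List[float]], y: List[int],
                     epochs: int = 20) -> List[float]:
    """Stand-in supervised learner: a plain perceptron with a bias term."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for x, label in zip(X, y):
            xb = x + [1.0]  # append constant input for the bias weight
            pred = 1 if sum(wi * xi for wi, xi in zip(w, xb)) > 0 else 0
            if pred != label:
                sign = 1.0 if label == 1 else -1.0
                w = [wi + sign * xi for wi, xi in zip(w, xb)]
    return w

def predict(w: List[float], x: List[float]) -> int:
    """Return 1 (abbreviation) or 0 (non-abbreviation)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x + [1.0])) > 0 else 0

# Toy note and labels, invented for illustration (1 = abbreviation).
note: Note = [["BP", "stable", "today"], ["Pt", "denies", "pain"]]
labels = [[1, 0, 0], [1, 0, 0]]

X = [vectorize(note, s, t)
     for s, sent in enumerate(note) for t in range(len(sent))]
y = [lab for sent_labels in labels for lab in sent_labels]
weights = train_perceptron(X, y)
```

Because the learner only sees the concatenated vector, it can be swapped for any other classifier without touching the feature functions, which is the separation the abstract describes.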

Acknowledgments

This work is funded by Vietnam National University at Ho Chi Minh City under grant number B2016-42-01. We thank the John von Neumann Institute, Vietnam National University at Ho Chi Minh City, for providing a powerful server machine on which to carry out the experiments. This work was completed while the authors were working at the Vietnam Institute for Advanced Study in Mathematics, Vietnam. Our thanks also go to Dr. Nguyen Thi Minh Huyen and her team at the University of Science, Vietnam National University, Hanoi, Vietnam, for the external resources used in the experiments, and to the administrative board at VanDon Hospital for their real clinical data and support.

Author information

Authors and Affiliations

University of Technology, Vietnam National University, Ho Chi Minh City, Vietnam
Thi Ngoc Chau Vo & Tru Hoang Cao
Japan Advanced Institute of Science and Technology, Nomi, Japan
Tu Bao Ho
John von Neumann Institute, Vietnam National University, Ho Chi Minh City, Vietnam
Tu Bao Ho

Corresponding author

Correspondence to Thi Ngoc Chau Vo.

Editor information

Editors and Affiliations

Tokyo University of Science, Noda, Japan
Hayato Ohwada
University of Tsukuba, Tokyo, Japan
Kenichi Yoshida

About this paper

Cite this paper

Vo, T.N.C., Cao, T.H., Ho, T.B. (2016). Abbreviation Identification in Clinical Notes with Level-wise Feature Engineering and Supervised Learning. In: Ohwada, H., Yoshida, K. (eds) Knowledge Management and Acquisition for Intelligent Systems. PKAW 2016. Lecture Notes in Computer Science, vol 9806. Springer, Cham. https://doi.org/10.1007/978-3-319-42706-5_1

DOI: https://doi.org/10.1007/978-3-319-42706-5_1
Published: 07 August 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42705-8
Online ISBN: 978-3-319-42706-5
eBook Packages: Computer Science, Computer Science (R0)