Text Mining for Medical Documents Using a Hidden Markov Model

Jang, Hyeju; Song, Sa Kwang; Myaeng, Sung Hyon

doi:10.1007/11880592_45


Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4182)

Abstract

We propose a semantic tagger that provides high-level concept information for phrases in clinical documents. It delineates such information from the statements written by doctors in patient records. The tagging, based on a Hidden Markov Model (HMM), is performed on documents that have already been tagged with Unified Medical Language System (UMLS), part-of-speech (POS), and abbreviation tags. The result can be used to extract clinical knowledge that supports decision making or quality assurance of medical treatment.
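
The abstract does not give the model's details, so the following is only a minimal sketch of how HMM-based tagging is commonly decoded, using the standard Viterbi algorithm. Everything specific here is an assumption made for illustration: the two hidden concept tags (SYMPTOM, MEDICATION), the observation symbols (finding, noun, drug) standing in for pre-assigned UMLS or POS tags, and all probabilities are hypothetical and are not taken from the paper.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence for the observations."""

    def lg(p):
        # Log space avoids underflow on long sequences; zero maps to -inf.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at step t.
    best = [{s: lg(start_p[s]) + lg(emit_p[s].get(observations[0], 0.0))
             for s in states}]
    # back[t][s]: predecessor of s on that best path.
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + lg(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + lg(emit_p[s].get(observations[t], 0.0))
            back[t][s] = prev

    # Start from the best final state and follow the back-pointers.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model (not the authors' parameters).
states = ["SYMPTOM", "MEDICATION"]
start_p = {"SYMPTOM": 0.6, "MEDICATION": 0.4}
trans_p = {
    "SYMPTOM": {"SYMPTOM": 0.7, "MEDICATION": 0.3},
    "MEDICATION": {"SYMPTOM": 0.4, "MEDICATION": 0.6},
}
emit_p = {
    "SYMPTOM": {"finding": 0.7, "noun": 0.2, "drug": 0.1},
    "MEDICATION": {"finding": 0.1, "noun": 0.2, "drug": 0.7},
}

# Each observation stands in for a tag already attached to a phrase.
observations = ["finding", "noun", "drug"]
tags = viterbi(observations, states, start_p, trans_p, emit_p)
print(list(zip(observations, tags)))
```

In a real tagger the start, transition, and emission probabilities would be estimated from annotated clinical text rather than written by hand.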



Author information

Authors and Affiliations

Department of Computer Science, Information and Communications University, Daejeon, Korea
Hyeju Jang & Sung Hyon Myaeng
Electronics and Telecommunications Research Institute, Daejeon, Korea
Sa Kwang Song


Editor information

Editors and Affiliations

Department of Computer Science, National University of Singapore, 3 Science Drive 2, 117543, Singapore
Hwee Tou Ng
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore
Mun-Kew Leong
Department of Computer Science, School of Computing, National University of Singapore, 117543, Singapore
Min-Yen Kan
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, P.O. Box, 119613, Singapore
Donghong Ji


About this paper

Cite this paper

Jang, H., Song, S.K., Myaeng, S.H. (2006). Text Mining for Medical Documents Using a Hidden Markov Model. In: Ng, H.T., Leong, MK., Kan, MY., Ji, D. (eds) Information Retrieval Technology. AIRS 2006. Lecture Notes in Computer Science, vol 4182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11880592_45


DOI: https://doi.org/10.1007/11880592_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45780-0
Online ISBN: 978-3-540-46237-8
eBook Packages: Computer Science, Computer Science (R0)
