Medical Document Categorization Using a Priori Knowledge

Itert, Lukasz; Duch, Włodzisław; Pestian, John

doi:10.1007/11550822_99

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3696)

Abstract

A significant part of medical data remains stored as unstructured text. Semantic search over such text requires the introduction of markup tags. Experts use their background knowledge to categorize new documents, and knowing the category of a document helps to disambiguate the words and acronyms it contains. A model of document similarity is introduced that includes a priori knowledge and captures the intuition of an expert. It has only a few parameters, which may be evaluated using linear programming techniques. Applied to the categorization of medical discharge summaries, this approach provided a simpler and much more accurate model than alternative text categorization approaches.
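The abstract does not give the form of the similarity model, so the following is only a toy sketch of the general idea, not the authors' method. It assumes that a priori knowledge is a set of concepts expected for each category, and that the similarity between a document and a category is a weighted count of matched and missing concepts. The concept sets, the weights `alpha` and `beta`, and the function names are all hypothetical; in the paper the few model parameters are estimated with linear programming, which is not shown here.

```python
from typing import Dict, Set

# Hypothetical a priori knowledge: concepts an expert would expect to see
# in documents of each category. These sets are invented for illustration.
PRIOR_CONCEPTS: Dict[str, Set[str]] = {
    "asthma": {"wheezing", "albuterol", "cough", "dyspnea"},
    "pneumonia": {"fever", "cough", "infiltrate", "antibiotic"},
}


def similarity(doc_concepts: Set[str], category: str,
               alpha: float = 1.0, beta: float = 0.5) -> float:
    """Score a document against a category's expected concepts.

    alpha rewards expected concepts found in the document;
    beta penalizes expected concepts that are absent.
    Both weights are assumed values, set by hand here.
    """
    expected = PRIOR_CONCEPTS[category]
    matched = len(doc_concepts & expected)
    missing = len(expected - doc_concepts)
    return alpha * matched - beta * missing


def categorize(doc_concepts: Set[str],
               alpha: float = 1.0, beta: float = 0.5) -> str:
    """Return the category with the highest similarity score."""
    return max(PRIOR_CONCEPTS,
               key=lambda c: similarity(doc_concepts, c, alpha, beta))


# Concepts extracted from a (made-up) discharge summary.
doc = {"cough", "wheezing", "albuterol"}
best = categorize(doc)
```

With only two weights, a model of this kind stays easy to inspect, which is in the spirit of the "few parameters" the abstract mentions; how closely it resembles the published model cannot be judged from the abstract alone.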

Author information

Authors and Affiliations

Department of Biomedical Informatics, Children’s Hospital Research Foundation, 3333 Burnet Avenue, Cincinnati, OH, 45229, USA
Lukasz Itert & John Pestian
Department of Informatics, Nicolaus Copernicus University, Toruń, Poland
Lukasz Itert & Włodzisław Duch
School of Computer Engineering, Nanyang Technological University, Singapore
Włodzisław Duch

Editor information

Editors and Affiliations

Department of Informatics, Nicolaus Copernicus University, Toruń, Poland
Włodzisław Duch
Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447, Warsaw, Poland
Janusz Kacprzyk
Adaptive Informatics Research Centre, Helsinki University of Technology, P.O. Box 5400, 02015 HUT, Finland
Erkki Oja
Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447, Warsaw, Poland
Sławomir Zadrożny

About this paper

Cite this paper

Itert, L., Duch, W., Pestian, J. (2005). Medical Document Categorization Using a Priori Knowledge. In: Duch, W., Kacprzyk, J., Oja, E., Zadrożny, S. (eds) Artificial Neural Networks: Biological Inspirations – ICANN 2005. ICANN 2005. Lecture Notes in Computer Science, vol 3696. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11550822_99

DOI: https://doi.org/10.1007/11550822_99
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28752-0
Online ISBN: 978-3-540-28754-4
eBook Packages: Computer Science, Computer Science (R0)