Dense Annotation of Free-Text Critical Care Discharge Summaries from an Indian Hospital and Associated Performance of a Clinical NLP Annotator

Ramanan, S. V.; Radhakrishna, Kedar; Waghmare, Abijeet; Raj, Tony; Nathan, Senthil P.; Sreerama, Sai Madhukar; Sampath, Sriram

doi:10.1007/s10916-016-0541-2

Patient Facing Systems
Published: 24 June 2016

Volume 40, article number 187, (2016)
Journal of Medical Systems

S. V. Ramanan¹,
Kedar Radhakrishna ORCID: orcid.org/0000-0002-2334-5285²,
Abijeet Waghmare²,
Tony Raj²,
Senthil P. Nathan¹,
Sai Madhukar Sreerama² &
…
Sriram Sampath³

Abstract

Electronic Health Record (EHR) use in India is generally poor, and structured clinical information is mostly lacking. This work is the first attempt to evaluate unstructured text mining for extracting relevant clinical information from Indian clinical records. We annotated a corpus of 250 discharge summaries from an Intensive Care Unit (ICU) in India, with markups for diseases, procedures, and lab parameters, their attributes, as well as key demographic information and administrative variables such as patient outcomes. In this process, we constructed guidelines for an annotation scheme useful to clinicians in the Indian context. We evaluated the performance of an NLP engine, Cocoa, on a cohort of these Indian clinical records. We have produced an annotated corpus of roughly 90 thousand words, which to our knowledge is the first tagged clinical corpus from India. Cocoa was evaluated on a test corpus of 50 documents. The overlap F-scores across the major categories, namely disease/symptoms, procedures, laboratory parameters and outcomes, are 0.856, 0.834, 0.961 and 0.872, respectively. These results are competitive with results from recent shared tasks based on US records. The annotated corpus and associated results from the Cocoa engine indicate that unstructured text mining is a viable method for cohort analysis in the Indian clinical context, where structured EHR records are largely absent.
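
The abstract reports overlap F-scores but does not spell out the matching criterion. As a minimal sketch, assuming the relaxed convention common in clinical NLP shared tasks (a predicted span counts as correct if it shares a category and at least one character with a gold span), the metric could be computed as follows. The `Span` layout, the function names, and the toy data are hypothetical and are not taken from the paper or from Cocoa.

```python
from typing import List, Tuple

# Assumed span layout: (start offset, end offset exclusive, category label)
Span = Tuple[int, int, str]


def overlaps(a: Span, b: Span) -> bool:
    """True if two spans share a category and at least one character."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def overlap_f_score(gold: List[Span], predicted: List[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under relaxed overlap matching."""
    # Predicted spans that touch at least one gold span of the same category
    matched_pred = sum(1 for p in predicted if any(overlaps(p, g) for g in gold))
    # Gold spans that are touched by at least one predicted span
    matched_gold = sum(1 for g in gold if any(overlaps(g, p) for p in predicted))
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical toy annotations, for illustration only
gold = [(0, 9, "disease"), (20, 31, "procedure"), (40, 45, "lab")]
pred = [(2, 9, "disease"), (20, 25, "procedure"), (50, 55, "lab"), (60, 70, "disease")]
p, r, f = overlap_f_score(gold, pred)
```

In this toy case two of four predictions overlap a gold span and two of three gold spans are found, so precision is 0.5 and recall is 2/3. A stricter evaluation would require exact boundary agreement instead of any overlap, which generally yields lower scores.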

Author information

Authors and Affiliations

RelAgent Technologies (P) Limited, IIT Madras Research Park, #14, 1st Floor, Taramani, Chennai, 600113, India
S. V. Ramanan & Senthil P. Nathan
Division of Medical Informatics, St. John’s Research Institute, 100 Feet Road, Koramangala, Bangalore, 560034, India
Kedar Radhakrishna, Abijeet Waghmare, Tony Raj & Sai Madhukar Sreerama
Department of Critical Care Medicine, St. John’s Medical College, Bangalore, India
Sriram Sampath

Corresponding authors

Correspondence to S. V. Ramanan or Kedar Radhakrishna.

Ethics declarations

Funding

This work was funded entirely by internal funds from St. John’s Research Institute and RelAgent Tech Pvt. Ltd.

Competing Interests

Senthil P. Nathan and S. V. Ramanan are founders of RelAgent Tech Pvt. Ltd., a biomedical text mining company. The other authors declare that they have no competing interests.

Ethics Statement

Ethical approval for the study was granted by the Institutional Ethics Committee (IEC) of St. John’s National Academy of Health Sciences. Patient consent for data collection is obtained as part of routine procedure during admission to the ICU.

Additional information

This article is part of the Topical Collection on Patient Facing Systems

Electronic supplementary material

ESM 1

Supplementary material: a document containing the corpus annotation guidelines (DOCX, 340 kB)

About this article

Cite this article

Ramanan, S.V., Radhakrishna, K., Waghmare, A. et al. Dense Annotation of Free-Text Critical Care Discharge Summaries from an Indian Hospital and Associated Performance of a Clinical NLP Annotator. J Med Syst 40, 187 (2016). https://doi.org/10.1007/s10916-016-0541-2

Received: 20 August 2015
Accepted: 08 June 2016
Published: 24 June 2016
DOI: https://doi.org/10.1007/s10916-016-0541-2
