Fully Contextualized Biomedical NER

  • Ashim Gupta
  • Pawan Goyal
  • Sudeshna Sarkar
  • Mahanandeeshwar Gattu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11438)


Recently, neural network architectures have outperformed traditional methods in biomedical named entity recognition (NER). Because these models are borrowed from innovations in general-text NER, they fail to address two problems that are important in biomedical text: polysemy and the use of acronyms. We hypothesize that a fully contextualized model, one that combines contextualized representations with context-dependent transition scores in the CRF, can alleviate these problems and further boost the tagger’s performance. In our experiments, this architecture improves the state-of-the-art F1 score on three widely used biomedical NER corpora. We also analyze the specific cases in which our contextualized model is superior to a strong baseline.
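
To make the phrase "context-dependent transition scores" concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a linear-chain CRF in which every position t has its own transition matrix; in a real tagger those matrices would presumably be predicted from the contextual encoder, whereas here they are plain lists supplied by the caller. The function names `path_score` and `viterbi` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Matrix = List[List[float]]


def path_score(emissions: Matrix,
               transitions: Sequence[Optional[Matrix]],
               tags: Sequence[int]) -> float:
    """Unnormalized score of one tag path in a linear-chain CRF.

    emissions[t][y]      : score of tag y at position t.
    transitions[t][p][y] : score of moving from tag p at t-1 to tag y at t.
    Indexing transitions by t is the assumption that makes them context
    dependent; a standard CRF shares one matrix across all positions.
    transitions[0] is never read.
    """
    score = emissions[0][tags[0]]
    for t in range(1, len(tags)):
        score += transitions[t][tags[t - 1]][tags[t]] + emissions[t][tags[t]]
    return score


def viterbi(emissions: Matrix,
            transitions: Sequence[Optional[Matrix]]) -> Tuple[List[int], float]:
    """Best-scoring tag path under per-position transition matrices."""
    n_tags = len(emissions[0])
    best = list(emissions[0])          # best score of any path ending in each tag
    back: List[List[int]] = []         # back-pointers for positions 1..T-1
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for y in range(n_tags):
            prev = max(range(n_tags),
                       key=lambda p: best[p] + transitions[t][p][y])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[t][prev][y] + emissions[t][y])
        best = new_best
        back.append(pointers)
    last = max(range(n_tags), key=lambda y: best[y])
    path = [last]
    for pointers in reversed(back):    # follow back-pointers from the end
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]
```

Passing the same matrix at every position recovers the usual context-independent CRF, so the only change relative to a standard decoder is the extra index t on the transition scores.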



This work was sponsored by the Ministry of Human Resource Development (MHRD) and Excelra Knowledge Solutions under a UAY project.

Supplementary material

Supplementary material 1: 482053_1_En_15_MOESM1_ESM.pdf (PDF, 143 KB)



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Ashim Gupta (1, corresponding author)
  • Pawan Goyal (1)
  • Sudeshna Sarkar (1)
  • Mahanandeeshwar Gattu (2)

  1. Indian Institute of Technology Kharagpur, Kharagpur, India
  2. Excelra Knowledge Solutions, Hyderabad, India
