Abstract
State-of-the-art named entity recognition (NER) systems have improved continuously over the past several years through neural architectures. However, NER, like many tasks, requires large annotated datasets to reach such performance. We focus on NER from clinical notes, one of the most fundamental and critical problems in medical text analysis, and on effectively adapting these neural architectures to low-resource settings via parameter transfer. We complement a standard hierarchical NER model with a general transfer learning framework that shares parameters between the source and target tasks, and achieve scores significantly above the baseline architecture. These sharing schemes, however, require an exponential search over tied parameter sets to find an optimal configuration. To remove this exhaustive search, we propose Dynamic Transfer Networks (DTN), a gated architecture that learns the appropriate parameter sharing scheme between source and target datasets. DTN matches the improvements of the optimized transfer learning framework in a single training run, effectively removing the need for exponential search.
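The gating idea behind such an architecture can be sketched as a learned, per-dimension blend of a shared representation and a task-specific one. The function name, shapes, and sigmoid gate below are illustrative assumptions for exposition, not the authors' exact formulation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_blend(h_shared, h_task, W, b):
    """Blend shared and task-specific hidden states with a learned gate.

    g = sigmoid(W @ [h_shared; h_task] + b)   -- one gate value per dimension
    output = g * h_shared + (1 - g) * h_task  -- convex combination
    """
    g = sigmoid(W @ np.concatenate([h_shared, h_task]) + b)
    return g * h_shared + (1.0 - g) * h_task

# Toy usage: 4-dimensional hidden states, randomly initialized gate weights.
rng = np.random.default_rng(0)
d = 4
h_shared = rng.normal(size=d)
h_task = rng.normal(size=d)
W = rng.normal(scale=0.1, size=(d, 2 * d))
b = np.zeros(d)
out = gated_blend(h_shared, h_task, W, b)
```

Because the sigmoid gate lies in (0, 1), each output dimension falls between the corresponding shared and task-specific values; during training, the gate weights let the model decide, per dimension, how much of the source-task representation to reuse.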
© 2020 Springer Nature Switzerland AG
Cite this chapter
Bhatia, P., Arumae, K., Busra Celikkaya, E. (2020). Dynamic Transfer Learning for Named Entity Recognition. In: Shaban-Nejad, A., Michalowski, M. (eds) Precision Health and Medicine. W3PHAI 2019. Studies in Computational Intelligence, vol 843. Springer, Cham. https://doi.org/10.1007/978-3-030-24409-5_7
Print ISBN: 978-3-030-24408-8
Online ISBN: 978-3-030-24409-5
eBook Packages: Intelligent Technologies and Robotics (R0)