Transfer Learning Using Progressive Neural Networks and NMT for Classification Tasks in NLP

  • Conference paper

Part of the book: Neural Information Processing (ICONIP 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11303)


Abstract

Recently, neural networks have achieved state-of-the-art results on many NLP tasks, such as sentiment classification and machine translation. However, one drawback of these techniques is that they need large amounts of training data. Even though a great deal of data is generated every day, not all tasks have large datasets available. One possible solution when data is insufficient is to use transfer learning. In this paper, we explore methods of transfer learning (sharing parameters) between different tasks so that performance on low-resource tasks improves. We first attempt to replicate prior transfer-learning results on semantically related tasks. For semantically different tasks, we try Progressive Neural Networks. We also experiment with sharing the encoder of a neural machine translation model with classification tasks.
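
The paper's models are not reproduced on this page, but as a rough illustration of the progressive-network idea mentioned in the abstract, the PyTorch sketch below shows the core mechanism: a column trained on a source task is frozen, and a new target-task column reads its hidden activations through a learned lateral connection. The class names, layer sizes, and task dimensions are illustrative assumptions, not the architecture used in the paper.

```python
import torch
import torch.nn as nn


class SourceColumn(nn.Module):
    """Column trained on a data-rich source task; frozen during transfer."""

    def __init__(self, input_dim, hidden_dim, num_classes):
        super().__init__()
        self.fc = nn.Linear(input_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):
        h = torch.relu(self.fc(x))
        return h, self.out(h)


class TargetColumn(nn.Module):
    """New column for a low-resource target task; it receives a lateral
    (adapter) connection from the frozen source column's hidden layer."""

    def __init__(self, input_dim, hidden_dim, num_classes):
        super().__init__()
        self.fc = nn.Linear(input_dim, hidden_dim)
        self.lateral = nn.Linear(hidden_dim, hidden_dim)  # adapter from source hidden state
        self.out = nn.Linear(hidden_dim, num_classes)

    def forward(self, x, source_hidden):
        # Combine the target column's own features with the source column's
        # (frozen) features before the output layer.
        h = torch.relu(self.fc(x) + self.lateral(source_hidden))
        return self.out(h)


# Hypothetical dimensions: 300-d sentence representations, a 6-way source task
# (e.g. question classification) and a 2-way target task (e.g. sentiment).
source = SourceColumn(300, 128, 6)
target = TargetColumn(300, 128, 2)

for p in source.parameters():          # freeze the source column, as in progressive nets
    p.requires_grad = False

x = torch.randn(4, 300)                # dummy batch of sentence representations
with torch.no_grad():
    source_hidden, _ = source(x)
logits = target(x, source_hidden)      # only the target column's parameters are trained
```

Freezing the source column is what distinguishes this from plain fine-tuning: the target task cannot overwrite what was learned on the source task, which is the motivation for using progressive networks when the two tasks are semantically different.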


Notes

  1. http://www.cs.cornell.edu/people/pabo/movie-review-data/
  2. http://cogcomp.cs.illinois.edu/Data/QA/QC/
  3. http://alt.qcri.org/semeval2014/task1/
  4. https://www.microsoft.com/en-us/download/details.aspx?id=52398
  5. https://data.quora.com/First-Quora-Dataset-Release-Question-Pairs
  6. https://nlp.stanford.edu/projects/nmt/


Author information

Correspondence to Ravi Shankar Devanapalli.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Devanapalli, R.S., Devi, V.S. (2018). Transfer Learning Using Progressive Neural Networks and NMT for Classification Tasks in NLP. In: Cheng, L., Leung, A., Ozawa, S. (eds.) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science, vol. 11303. Springer, Cham. https://doi.org/10.1007/978-3-030-04182-3_17

  • DOI: https://doi.org/10.1007/978-3-030-04182-3_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04181-6

  • Online ISBN: 978-3-030-04182-3

  • eBook Packages: Computer Science, Computer Science (R0)
