Abstract
Neural Machine Translation (NMT) models based on Recurrent Neural Networks (RNNs) under an encoder-decoder framework have recently shown significant improvements in translation quality. Given the encoded representations of the source sentence, an NMT system generates the translation word by word, conditioned on the hidden states of the decoder. These hidden states are updated at each decoding step and determine the next word to be generated, so the transitions between successive hidden states contribute to the decision on the next token of the translation; this aspect has drawn little attention in previous work. In this work, we propose an explicit supervised objective on the transitions of the decoder hidden states, aiming to help the model learn transitional patterns better. We first model the increment of the transition with a subtraction operation, and then require this increment to be predictive of the word being translated. The proposed approach strengthens the relationship between the decoder transitions and the translation. Empirical evaluation shows considerable improvements on Chinese-English, German-English, and English-German translation tasks, demonstrating the effectiveness of our approach.
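The abstract only sketches the mechanism, so below is a minimal PyTorch sketch of how such a transition objective could look: the increment is modeled by subtracting successive decoder hidden states, and an auxiliary softmax is required to predict the target word from that increment. The names `TransitionPredictor`, `transition_loss`, and `lambda_trans` are illustrative and introduced here for exposition; this is an assumption-laden sketch, not the authors' implementation.

```python
# Minimal sketch of an auxiliary transition objective (assumed form, not
# the paper's actual code): subtract successive decoder states and require
# the increment to predict the target word.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TransitionPredictor(nn.Module):
    """Predicts the target word from the increment of the decoder state."""

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, h_prev: torch.Tensor, h_curr: torch.Tensor) -> torch.Tensor:
        # Model the transition increment with an explicit subtraction.
        delta = h_curr - h_prev                      # (batch, hidden_size)
        return F.log_softmax(self.out(delta), dim=-1)


def transition_loss(predictor: TransitionPredictor,
                    h_prev: torch.Tensor,
                    h_curr: torch.Tensor,
                    target_ids: torch.Tensor) -> torch.Tensor:
    """Auxiliary loss: the increment must be predictive of the word to translate."""
    log_probs = predictor(h_prev, h_curr)            # (batch, vocab_size)
    return F.nll_loss(log_probs, target_ids)


# During training this would be added to the usual NMT objective, e.g.
#   loss = nmt_loss + lambda_trans * transition_loss(predictor, h_prev, h_curr, y_t)
# where lambda_trans is a weighting hyperparameter (its value is not given here).
```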
Notes
- 1. The corpora include LDC2002E18, LDC2003E07, LDC2003E14, the Hansards portion of LDC2004T07, LDC2004T08, and LDC2005T06.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zheng, Z., Huang, S., Dai, X.Y., Chen, J. (2019). Controlling the Transition of Hidden States for Neural Machine Translation. In: Chen, J., Zhang, J. (eds.) Machine Translation. CWMT 2018. Communications in Computer and Information Science, vol. 954. Springer, Singapore. https://doi.org/10.1007/978-981-13-3083-4_8
DOI: https://doi.org/10.1007/978-981-13-3083-4_8
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-3082-7
Online ISBN: 978-981-13-3083-4