Controlling the Transition of Hidden States for Neural Machine Translation

  • Conference paper
  • Machine Translation (CWMT 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 954)

Abstract

Recurrent Neural Network (RNN) based Neural Machine Translation (NMT) models under the encoder-decoder framework have recently shown significant improvements in translation quality. Given the encoded representations of the source sentence, an NMT system generates the translation word by word, conditioned on the hidden states of the decoder. These hidden states are updated at each decoding step and determine the next word to be generated. The transitions of the hidden states between successive steps therefore contribute to the choice of the next target word, yet they have drawn little attention in previous work. In this work, we propose an explicit supervised objective on the transitions of the decoder hidden states, aiming to help the model learn the transitional patterns better. We first model the increment of the transition as the subtraction of successive hidden states. We then require this increment to be predictive of the word to be translated. The proposed approach strengthens the relationship between the transitions of the decoder and the translation. Empirical evaluation shows considerable improvements on Chinese-English, German-English, and English-German translation tasks, demonstrating the effectiveness of our approach.
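
To make the objective concrete, the following is a minimal PyTorch sketch of the idea the abstract describes: the increment of the transition is modeled as the subtraction of successive decoder hidden states, and an auxiliary classifier is required to predict the target word from that increment. This is an illustrative sketch, not the authors' implementation; the module name TransitionPredictor, the single linear projection, and the helper transition_loss are all hypothetical.

```python
import torch
import torch.nn as nn

class TransitionPredictor(nn.Module):
    """Auxiliary head: predict the target word from the transition
    (increment) of the decoder hidden state, modeled as a subtraction
    between successive states (a hypothetical sketch)."""

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, vocab_size)

    def forward(self, prev_state: torch.Tensor, cur_state: torch.Tensor) -> torch.Tensor:
        # Increment of the transition: s_t - s_{t-1}
        delta = cur_state - prev_state
        # Logits over the target vocabulary: the increment is required
        # to be predictive of the word generated at step t.
        return self.proj(delta)

def transition_loss(head: TransitionPredictor,
                    decoder_states: torch.Tensor,  # (batch, tgt_len, hidden)
                    targets: torch.Tensor,         # (batch, tgt_len)
                    pad_id: int = 0) -> torch.Tensor:
    """Cross-entropy between the transition-based predictions and the
    words actually generated at each step (padding positions ignored)."""
    prev = decoder_states[:, :-1]   # s_{t-1}
    cur = decoder_states[:, 1:]     # s_t
    logits = head(prev, cur)        # (batch, tgt_len - 1, vocab)
    ce = nn.CrossEntropyLoss(ignore_index=pad_id)
    return ce(logits.reshape(-1, logits.size(-1)), targets[:, 1:].reshape(-1))
```

In training, this auxiliary loss would presumably be interpolated with the standard NMT cross-entropy, e.g. loss = nmt_loss + lam * transition_loss(head, states, targets); the weight lam is an assumption here, not a value reported by the paper.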

Notes

  1. The corpora include LDC2002E18, LDC2003E07, LDC2003E14, the Hansards portion of LDC2004T07, LDC2004T08, and LDC2005T06.

Author information

Correspondence to Shujian Huang.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zheng, Z., Huang, S., Dai, X.Y., Chen, J. (2019). Controlling the Transition of Hidden States for Neural Machine Translation. In: Chen, J., Zhang, J. (eds.) Machine Translation. CWMT 2018. Communications in Computer and Information Science, vol. 954. Springer, Singapore. https://doi.org/10.1007/978-981-13-3083-4_8

Download citation

  • DOI: https://doi.org/10.1007/978-981-13-3083-4_8

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-3082-7

  • Online ISBN: 978-981-13-3083-4

  • eBook Packages: Computer Science, Computer Science (R0)
