Abstract
In recent years, several studies on neural machine translation (NMT) have incorporated document-level context by using a multi-encoder architecture with two attention mechanisms, one reading the current sentence and one reading the previous sentences. These studies concluded that the target-side context is less useful than the source-side context. However, we hypothesize that the target-side context appears less useful because of the architecture used to model it.
Therefore, in this study, we investigate how the target-side context can improve context-aware neural machine translation. We propose a weight-sharing method in which the NMT model saves its decoder states while translating the previous sentence and computes an attention vector over the saved states when translating the current sentence. Our experiments show that the target-side context is also useful when it is fed into the model as the decoder states produced while translating the previous sentence.
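The core idea described above (attending over decoder states cached from the previous sentence) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `context_attention`, the use of dot-product scoring, and the toy dimensions are all assumptions for the sake of the example.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def context_attention(prev_decoder_states, current_state):
    """Attend over decoder states cached from the previous sentence.

    prev_decoder_states: (T_prev, d) array of saved decoder states
    current_state:       (d,) decoder state for the current time step

    Uses dot-product scoring here for simplicity; the actual scoring
    function in the paper may differ.
    """
    scores = prev_decoder_states @ current_state   # (T_prev,) alignment scores
    weights = softmax(scores)                      # attention distribution
    context = weights @ prev_decoder_states        # (d,) target-side context vector
    return context, weights

# Toy example: 4 states saved while translating the previous sentence,
# hidden size 8 (both sizes are arbitrary for illustration).
rng = np.random.default_rng(0)
H_prev = rng.standard_normal((4, 8))
s_t = rng.standard_normal(8)
ctx, w = context_attention(H_prev, s_t)
```

The resulting `ctx` vector would then be combined with the current decoder state (e.g., by concatenation or gating) before predicting the next target word; weight sharing means the same decoder parameters produce both the cached states and the current ones.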
Notes
- 1. Hereinafter, "document-level context" is simply referred to as a "context".
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
Cite this paper
Yamagishi, H., Komachi, M. (2020). Improving Context-Aware Neural Machine Translation with Target-Side Context. In: Nguyen, LM., Phan, XH., Hasida, K., Tojo, S. (eds) Computational Linguistics. PACLING 2019. Communications in Computer and Information Science, vol 1215. Springer, Singapore. https://doi.org/10.1007/978-981-15-6168-9_10
DOI: https://doi.org/10.1007/978-981-15-6168-9_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-6167-2
Online ISBN: 978-981-15-6168-9
eBook Packages: Computer Science, Computer Science (R0)