Abstract
Attention-based neural machine translation (NMT) employs an attention network to capture structural correspondences between the source and target languages at the word level. Unfortunately, alignments between source and target equivalents are often complicated, which makes word-level attention inadequate to model such relations (e.g., the alignment between a source idiom and its target translation). To handle this issue, we propose a phrase-level attention mechanism that complements the word-level attention network. The proposed phrasal attention framework is simple yet effective, retaining the strength of phrase-based statistical machine translation (SMT) on the source side. Experiments on a Chinese-to-English translation task demonstrate that the proposed method yields statistically significant improvements over word-level attention-based NMT.
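To make the idea concrete, below is a minimal, hypothetical sketch of how phrase-level attention could complement word-level attention over encoder states. The function names, the mean-pooling of encoder states into phrase representations, and the fixed interpolation gate are all illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def word_attention(query, states):
    # word-level attention: dot-product scores between the decoder
    # query and each encoder hidden state
    return softmax(states @ query)

def phrase_reps(states, spans):
    # pool encoder states over each source phrase span [start, end)
    # (mean pooling is an assumption made for this sketch)
    return np.stack([states[s:e].mean(axis=0) for s, e in spans])

def phrase_attention(query, states, spans):
    # phrase-level attention: score pooled phrase representations
    return softmax(phrase_reps(states, spans) @ query)

def combined_context(query, states, spans, gate=0.5):
    # interpolate word-level and phrase-level context vectors
    # with a fixed gate (a learned gate would be more realistic)
    ctx_word = word_attention(query, states) @ states
    ctx_phrase = phrase_attention(query, states, spans) @ phrase_reps(states, spans)
    return gate * ctx_word + (1.0 - gate) * ctx_phrase
```

Here the phrase spans would come from a source-side phrase segmentation, e.g. one induced by a phrase-based SMT system; both attention distributions sum to one over their respective units.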
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61525205, 61432013, 61403269), the Fundamental Research Funds for the Central Universities of Northwest MinZu University (Grant Nos. 31920170154, 31920170153) and the Scientific Research Project of Universities in Gansu (2016B-007).
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Li, Y., Xiong, D., Zhang, M. (2017). Neural Machine Translation with Phrasal Attention. In: Wong, D., Xiong, D. (eds) Machine Translation. CWMT 2017. Communications in Computer and Information Science, vol 787. Springer, Singapore. https://doi.org/10.1007/978-981-10-7134-8_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7133-1
Online ISBN: 978-981-10-7134-8
eBook Packages: Computer Science (R0)