Improving Low-Resource Neural Machine Translation with Weight Sharing

  • Tao Feng
  • Miao Li
  • Xiaojun Liu
  • Yichao Cao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11221)

Abstract

Neural machine translation (NMT) has achieved great success in recent years given large bilingual corpora, but it is much less effective for low-resource languages. To alleviate this problem, we present two approaches that improve the performance of a low-resource NMT system. The first employs weight sharing in the decoder to enhance the target-side language model of the low-resource NMT system. The second applies cross-lingual embeddings and a shared source-sentence representation space to strengthen the encoder of the low-resource NMT system. Our experiments demonstrate that the proposed methods obtain significant improvements over the baseline system on low-resource neural machine translation. On the IWSLT2015 Vietnamese-English translation task, our model improves translation quality by an average of 1.43 BLEU points, and it also gains 0.96 BLEU points when translating from Mongolian to Chinese.
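
The abstract only names the two techniques, so the sketch below illustrates one plausible reading of the first: tying the decoder of the low-resource model to the decoder of an auxiliary model that translates into the same target language, so that the shared decoder is trained on far more target-side data. This is a minimal PyTorch-style sketch under our own assumptions; the module names (Encoder, SharedDecoder), layer sizes, and language pairing are illustrative, not the authors' implementation.

    import torch
    import torch.nn as nn

    class Encoder(nn.Module):
        """Per-source-language encoder; each source language keeps its own weights."""
        def __init__(self, vocab_size, emb_dim=256, hid_dim=512):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.rnn = nn.LSTM(emb_dim, hid_dim, batch_first=True)

        def forward(self, src):
            return self.rnn(self.embed(src))  # (outputs, (h, c))

    class SharedDecoder(nn.Module):
        """A single decoder instance reused by every translation task."""
        def __init__(self, vocab_size, emb_dim=256, hid_dim=512):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.rnn = nn.LSTM(emb_dim, hid_dim, batch_first=True)
            self.out = nn.Linear(hid_dim, vocab_size)

        def forward(self, tgt, state):
            h, _ = self.rnn(self.embed(tgt), state)
            return self.out(h)  # per-step target-vocabulary logits

    # Hypothetical setup: one low-resource and one high-resource source
    # language, both translating into the same target language (e.g. English).
    enc_lowres = Encoder(vocab_size=8000)      # e.g. Vietnamese (low-resource)
    enc_highres = Encoder(vocab_size=32000)    # auxiliary rich-resource source
    decoder = SharedDecoder(vocab_size=32000)  # weights shared across both tasks

    # Gradients from both tasks update the same decoder parameters, so the
    # decoder's implicit target language model sees far more data than the
    # low-resource pair alone provides.
    src = torch.randint(0, 8000, (2, 7))       # dummy low-resource source batch
    tgt = torch.randint(0, 32000, (2, 5))      # dummy target prefix
    _, state = enc_lowres(src)
    logits = decoder(tgt, state)               # shape (2, 5, 32000)

The second technique, cross-lingual embeddings with a shared source representation space, would analogously constrain enc_lowres and enc_highres to map their inputs into a common space, but the abstract gives too little detail to sketch it concretely.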

Keywords

Low-resource · Neural machine translation · Weight sharing

Notes

Acknowledgements

The work is supported by the National Natural Science Foundation of China under Grants No. 61572462 and 61502445.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, China
  2. University of Science and Technology of China, Hefei, China
  3. School of Information and Computer, Anhui Agricultural University, Hefei, China
