
Instant Translation Model Adaptation by Translating Unseen Words in Continuous Vector Space

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 9624))

Abstract

In statistical machine translation (SMT), differences between the domains of the training and test data result in poor translations. Although there have been many studies on domain adaptation of language models and translation models, most require supervised in-domain language resources such as parallel corpora for training and tuning the models. This need for supervised data has made such methods difficult to apply in practical SMT systems. We thus propose a novel method that adapts translation models without in-domain parallel corpora. Our method infers translation candidates for unseen words by nearest-neighbor search, after projecting their vector-based semantic representations into the semantic space of the target language. In an experiment on out-of-domain translation from Japanese to English, our method improved the BLEU score by 0.5–1.5 points.
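The projection-and-search idea summarized in the abstract can be sketched as follows. This is a minimal illustration with random toy vectors, not the paper's implementation: a linear map W is fit on a seed dictionary by least squares (a common way to learn a cross-lingual projection), an unseen source word's vector is projected into the target-language space, and its k nearest target words by cosine similarity (k = 10 in the paper's experiments, per Note 1) become translation candidates. All names and data below are hypothetical.

```python
import numpy as np

# Toy embedding setup: rows are word vectors. In practice these would be
# trained on large monolingual corpora; here they are random for illustration.
rng = np.random.default_rng(0)
dim_src, dim_tgt, vocab_tgt = 5, 5, 8
tgt_vocab = [f"en_word_{i}" for i in range(vocab_tgt)]
tgt_vecs = rng.normal(size=(vocab_tgt, dim_tgt))

# Seed dictionary: (source vector, target vector) pairs for known translations.
# Fit the projection W by least squares so that seed_src @ W ~= seed_tgt.
seed_src = rng.normal(size=(20, dim_src))
true_W = rng.normal(size=(dim_src, dim_tgt))
seed_tgt = seed_src @ true_W  # synthetic "translations" for the sketch
W, *_ = np.linalg.lstsq(seed_src, seed_tgt, rcond=None)

def translate_unseen(src_vec, k=10):
    """Project an unseen source-word vector into the target space and return
    the k nearest target words by cosine similarity (k=10 in the paper)."""
    proj = src_vec @ W
    sims = tgt_vecs @ proj / (np.linalg.norm(tgt_vecs, axis=1) * np.linalg.norm(proj))
    order = np.argsort(-sims)[:k]  # indices of the k most similar target words
    return [(tgt_vocab[i], float(sims[i])) for i in order]

candidates = translate_unseen(rng.normal(size=dim_src), k=3)
print(candidates)
```

The returned candidates would then be added to the translation model as new phrase-table entries; how their scores are estimated is the substance of the paper itself.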


Notes

  1. k was set to 10 in the experiments.

  2. http://www.phontron.com/kftt/.

  3. http://cookpad.com/.

  4. http://www.statmt.org/moses/.

  5. http://www.speech.sri.com/projects/srilm/.

  6. https://github.com/moses-smt/giza-pp.

  7. http://dumps.wikimedia.org/ (versions of Nov. 4th, 2014 (ja) and Oct. 8th, 2014 (en)).

  8. http://compling.hss.ntu.edu.sg/omw/.

References

  1. Brants, T., Popat, A.C., Xu, P., Och, F.J., Dean, J.: Large language models in machine translation. In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp. 858–867 (2007)

    Google Scholar 

  2. Irvine, A., Morgan, J., Carpuat, M., Daumé III, H., Munteanu, D.: Measuring machine translation errors in new domains. Trans. Associ. Comput. Linguist. 1, 429–440 (2013)

    Google Scholar 

  3. Costa-Jussà, M.R.: Domain adaptation strategies in statistical machine translation: a brief overview. Knowl. Eng. Rev. 30, 514–520 (2015)

    Article  Google Scholar 

  4. Mansour, S., Ney, H.: Unsupervised adaptation for statistical machine translation. In: Proceedings of the Ninth Workshop on Statistical Machine Translation, pp. 457–465 (2014)

    Google Scholar 

  5. Ishiwatari, S., Kaji, N., Yoshinaga, N., Toyoda, M., Kitsuregawa, M.: Accurate cross-lingual projection between count-based word vectors by exploiting translatable context pairs. In: Proceedings of the 19th Conference on Computational Natural Language Learning (CoNLL), pp. 300–304 (2015)

    Google Scholar 

  6. Wu, H., Wang, H., Zong, C.: Domain adaptation for statistical machine translation with domain dictionary and monolingual corpora. In: Proceedings of the 22nd International Conference on Computational Linguistics (COLING), pp. 993–1000 (2008)

    Google Scholar 

  7. Daumé III, H., Jagarlamudi, J.: Domain adaptation for machine translation by mining unseen words. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT), pp. 407–412 (2011)

    Google Scholar 

  8. Irvine, A., Quirk, C., Daumé III, H.: Monolingual marginal matching for translation model adaptation. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1077–1088 (2013)

    Google Scholar 

  9. Razmara, M., Siahbani, M., Haffari, R., Sarkar, A.: Graph propagation for paraphrasing out-of-vocabulary words in statistical machine translation. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL), pp. 1105–1115 (2013)

    Google Scholar 

  10. Mathur, P., Keseler, F.B., Venkatapathy, S., Cancedda, N.: Fast domain adaptation of SMT models without in-domain parallel data. In: Proceedings of the 25th International Conference on Computational Linguistics (COLING), pp. 1114–1123 (2014)

    Google Scholar 

  11. Yamamoto, H., Sumita, E.: Bilingual cluster based models for statistical machine translation. In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp. 514–523 (2007)

    Google Scholar 

  12. Harris, Z.S.: Distributional structure. Word 10, 146–162 (1954)

    Article  Google Scholar 

  13. Firth, J.R.: A synopsis of linguistic theory. In: Studies in Linguistic Analysis, pp. 1–32 (1957)

    Google Scholar 

  14. Lund, K., Burgess, C.: Producing high-dimensional semantic spaces from lexical co-occurrence. Behav. Res. Methods Instr. Comput. 28, 203–208 (1996)

    Article  Google Scholar 

  15. Bengio, Y., Ducharme, R., Vincent, P., Janvin, C.: A neural probabilistic language model. J. Mach. Learn. Res. 3, 1137–1155 (2003)

    MATH  Google Scholar 

  16. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. In: Proceedings of Workshop at International Conference on Learning Representations (ICLR) (2013)

    Google Scholar 

  17. Turney, P.D., Pantel, P., et al.: From frequency to meaning: vector space models of semantics. J. Artif. Intell. Res. (JAIR) 37, 141–188 (2010)

    MathSciNet  MATH  Google Scholar 

  18. Erk, K.: Vector space models of word meaning and phrase meaning: a survey. Lang. Linguist. Compass 6, 635–653 (2012)

    Article  Google Scholar 

  19. Mikolov, T., Le, Q.V., Sutskever, I.: Exploiting similarities among languages for machine translation. arXiv preprint (2013)

    Google Scholar 

  20. Fung, P.: A statistical view on bilingual lexicon extraction: from parallel corpora to non-parallel corpora. In: Proceedings of the Third Conference of the Association for Machine Translation in the Americas (AMTA), pp. 1–17 (1998)

    Google Scholar 

  21. Tsunakawa, T., Okazaki, N., Liu, X., Tsujii, J.: A Chinese-Japanese lexical machine translation through a pivot language. ACM Trans. Asian Lang. Inf. Process. (TALIP), 8, 9:1–9:21 (2009)

    Google Scholar 

  22. Neubig, G.: The Kyoto free translation task (2011). http://www.phontron.com/kftt

  23. Koehn, P., Knight, K.: Learning a translation lexicon from monolingual corpora. In: Proceedings of ACL Workshop on Unsupervised lexical acquisition. pp. 9–16 (2002)

    Google Scholar 

  24. Stolcke, A., et al.: SRILM-an extensible language modeling toolkit. In: Proceedings of the Seventh International Conference on Spoken Language Processing (ICSLP), pp. 901–904 (2002)

    Google Scholar 

  25. Och, F.J., Ney, H.: A systematic comparison of various statistical alignment models. Comput. Linguist. 29, 19–51 (2003)

    Article  MATH  Google Scholar 

  26. Church, K.W., Hanks, P.: Word association norms, mutual information, and lexicography. Comput. Linguist. 16, 22–29 (1990)

    Google Scholar 

  27. Koehn, P.: Statistical significance tests for machine translation evaluation. In: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 388–395 (2004)

    Google Scholar 

  28. Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 311–318 (2002)

    Google Scholar 

  29. Lembersky, G., Ordan, N., Wintner, S.: Adapting translation models to translationese improves SMT. In: Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pp. 255–265 (2012)

    Google Scholar 

  30. Schwenk, H.: Continuous space translation models for phrase-based statistical machine translation. In: Proceedings of 24th International Conference on Computational Linguistics (COLING): Posters, pp. 1071–1080 (2012)

    Google Scholar 

Download references

Acknowledgments

The authors thank Nobuhiro Kaji and the anonymous reviewers for their valuable comments and suggestions. We also thank Jun Harashima for providing us the Cookpad recipe corpus. This work was partially supported by JSPS KAKENHI Grant Number 25280111.

Author information

Corresponding author

Correspondence to Shonosuke Ishiwatari.


Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Ishiwatari, S., Yoshinaga, N., Toyoda, M., Kitsuregawa, M. (2018). Instant Translation Model Adaptation by Translating Unseen Words in Continuous Vector Space. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science(), vol 9624. Springer, Cham. https://doi.org/10.1007/978-3-319-75487-1_5

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-75487-1_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-75486-4

  • Online ISBN: 978-3-319-75487-1

  • eBook Packages: Computer Science, Computer Science (R0)
