Abstract
In statistical machine translation (SMT), a mismatch between the domains of the training and test data results in poor translations. Although there have been many studies on domain adaptation of language models and translation models, most require supervised in-domain language resources such as parallel corpora for training and tuning the models. This need for supervised data has made such methods difficult to apply to practical SMT systems. We thus propose a novel method that adapts translation models without in-domain parallel corpora. Our method infers translation candidates for unseen words by nearest-neighbor search after projecting their vector-based semantic representations into the semantic space of the target language. In an experiment on out-of-domain translation from Japanese to English, our method improved the BLEU score by 0.5–1.5 points.
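The project-then-search idea in the abstract can be sketched in a few lines. The sketch below is illustrative only: the vocabularies, vectors, and dimensions are invented, and it uses a generic linear least-squares projection between the two vector spaces rather than the paper's exact formulation.

```python
import numpy as np

# Toy stand-ins for word vectors; real vectors would be learned from
# monolingual corpora in each language. Everything here is hypothetical.
rng = np.random.default_rng(0)
dim = 3
src_vocab = ["inu", "neko", "tori", "uma", "sakana"]   # source words
tgt_vocab = ["dog", "cat", "bird", "horse", "fish"]    # target words

X = rng.normal(size=(len(src_vocab), dim))             # source vectors
true_map = rng.normal(size=(dim, dim))                 # unknown relation
Y = X @ true_map + 0.01 * rng.normal(size=X.shape)     # target vectors (noisy)

# Step 1: learn a linear projection from a seed dictionary of translation
# pairs (all words but the last, which plays the role of an unseen word).
W, *_ = np.linalg.lstsq(X[:-1], Y[:-1], rcond=None)

# Step 2: project the unseen word into the target space and rank target
# words by cosine similarity (nearest-neighbor search; the paper uses
# k = 10, but this toy vocabulary only supports a smaller k).
def nearest(vec, mat, k):
    sims = (mat @ vec) / (np.linalg.norm(mat, axis=1) * np.linalg.norm(vec))
    return np.argsort(-sims)[:k]

candidates = [tgt_vocab[i] for i in nearest(X[-1] @ W, Y, k=2)]
print(candidates)  # ranked translation candidates for the unseen word
```

The ranked candidates would then be added to the translation model as new phrase-table entries, which is what lets adaptation proceed without in-domain parallel data.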
Notes
- 1. k was set to 10 in the experiments.
- 7. http://dumps.wikimedia.org/ (versions of Nov 4th, 2014 (ja) and Oct 8th, 2014 (en)).
Acknowledgments
The authors thank Nobuhiro Kaji and the anonymous reviewers for their valuable comments and suggestions. We also thank Jun Harashima for providing us with the Cookpad recipe corpus. This work was partially supported by JSPS KAKENHI Grant Number 25280111.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Ishiwatari, S., Yoshinaga, N., Toyoda, M., Kitsuregawa, M. (2018). Instant Translation Model Adaptation by Translating Unseen Words in Continuous Vector Space. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol. 9624. Springer, Cham. https://doi.org/10.1007/978-3-319-75487-1_5
Print ISBN: 978-3-319-75486-4
Online ISBN: 978-3-319-75487-1