Monte Carlo techniques for phrase-based translation

Arun, Abhishek; Haddow, Barry; Koehn, Philipp; Lopez, Adam; Dyer, Chris; Blunsom, Phil

doi:10.1007/s10590-010-9080-7

Published: 24 June 2010

Volume 24, pages 103–121, (2010)

Machine Translation

Abstract

Recent advances in statistical machine translation have used approximate beam search for NP-complete inference within probabilistic translation models. We present an alternative approach of sampling from the posterior distribution defined by a translation model. We define a novel Gibbs sampler for sampling translations given a source sentence and show that it effectively explores this posterior distribution. In doing so we overcome the limitations of heuristic beam search and obtain theoretically sound solutions to inference problems such as finding the maximum probability translation and minimum risk training and decoding.
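To make the link between sampling and these inference problems concrete, here is a minimal sketch of how a bag of sampled translations could be turned into a maximum probability estimate and a minimum risk choice. It does not reproduce the sampler or the gain function of the article: the relative-frequency posterior estimate, the unigram F1 gain, and all function names are assumptions chosen for illustration.

```python
from collections import Counter


def unigram_f1(hyp: str, ref: str) -> float:
    """Toy gain function (an assumption): unigram F1 between two strings."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(h.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def posterior_estimate(samples):
    """Estimate p(translation | source) by relative frequency in the samples."""
    counts = Counter(samples)
    total = len(samples)
    return {translation: c / total for translation, c in counts.items()}


def max_probability_translation(samples):
    """Return the most frequently sampled translation string."""
    posterior = posterior_estimate(samples)
    return max(posterior, key=posterior.get)


def minimum_risk_translation(samples, gain=unigram_f1):
    """Return the sampled translation with the highest expected gain
    (equivalently, the lowest expected loss) under the estimated posterior."""
    posterior = posterior_estimate(samples)

    def expected_gain(candidate):
        return sum(p * gain(candidate, other) for other, p in posterior.items())

    return max(posterior, key=expected_gain)
```

Because the estimate counts translation strings rather than derivations, samples that reach the same string through different segmentations are pooled automatically. The two criteria can also disagree: if an outlier string is sampled most often while several mutually similar strings share the remaining mass, the minimum risk choice will tend to favour one of the similar strings.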

References

Arun A, Dyer C, Haddow B, Blunsom P, Lopez A, Koehn P (2009) Monte Carlo inference and maximization for phrase-based translation. In: Proceedings of CoNLL, Association for Computational Linguistics, Boulder, Colorado, pp 102–110
Blunsom P, Cohn T, Osborne M (2008) A discriminative latent variable model for statistical machine translation. In: Proceedings of ACL-08: HLT, Association for Computational Linguistics, Columbus, Ohio, pp 200–208
Callison-Burch C, Koehn P, Monz C, Schroeder J (2009) Findings of the 2009 workshop on statistical machine translation. In: Proceedings of the fourth workshop on statistical machine translation, Association for Computational Linguistics, Athens, Greece, pp 1–28
Casacuberta F, Higuera CDL (2000) Computational complexity of problems on probabilistic grammars and transducers. Springer-Verlag, London, UK
DeNero J, Bouchard-Côté A, Klein D (2008) Sampling alignment structure under a Bayesian translation model. In: Proceedings of the 2008 conference on empirical methods in natural language processing, Association for Computational Linguistics, Honolulu, Hawaii, pp 314–323
Eisner J, Tromble RW (2006) Local search with very large-scale neighborhoods for optimal permutations in machine translation. In: Proceedings of the HLT-NAACL workshop on computationally hard problems and joint inference in speech and language processing, New York, pp 57–75
Finkel JR, Manning CD, Ng AY (2006) Solving the problem of cascading errors: approximate Bayesian inference for linguistic annotation pipelines. In: Proceedings of the 2006 conference on empirical methods in natural language processing, Association for Computational Linguistics, Sydney, Australia, pp 618–626
Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans Pattern Anal Mach Intell 6: 721–741
Germann U, Jahr M, Knight K, Marcu D, Yamada K (2001) Fast decoding and optimal decoding for machine translation. In: Proceedings of 39th annual meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Toulouse, France, pp 228–235
Green PJ (1995) Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82: 711–732
Johnson H, Martin J, Foster G, Kuhn R (2007a) Improving translation quality by discarding most of the phrasetable. In: Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL), Association for Computational Linguistics, Prague, Czech Republic, pp 967–975
Johnson M, Griffiths T, Goldwater S (2007b) Bayesian inference for PCFGs via Markov Chain Monte Carlo. In: Human language technologies 2007: the conference of the North American chapter of the Association for Computational Linguistics, Proceedings of the Main Conference, Association for Computational Linguistics, Rochester, New York, pp 139–146
Koehn P, Hoang H (2007) Factored translation models. In: Proceedings of EMNLP, Association for Computational Linguistics, Prague, Czech Republic, pp 868–876
Koehn P, Och F, Marcu D (2003) Statistical phrase-based translation. In: Proceedings of HLT-NAACL. Morristown, NJ, USA, pp 48–54
Kumar S, Byrne W (2004) Minimum Bayes-risk decoding for statistical machine translation. In: Dumais S, Marcu D, Roukos S (eds) HLT-NAACL 2004: main proceedings, Association for Computational Linguistics, Boston, Massachusetts, USA, pp 169–176
Langlais P, Gotti F, Patry A (2007) A greedy decoder for phrase-based statistical machine translation. In: 11th international conference on theoretical and methodological issues in machine translation (TMI 2007), Skövde, Sweden, pp 104–113
Li Z, Eisner J, Khudanpur S (2009) Variational decoding for statistical machine translation. In: Proceedings of the joint conference of the 47th annual meeting of the ACL and the 4th international joint conference on natural language processing of the AFNLP, pp 593–601
Liu DC, Nocedal J (1989) On the limited memory BFGS method for large scale optimization. Math Program 45(3): 503–528
Marcu D, Wong W (2002) A phrase-based, joint probability model for statistical machine translation. In: EMNLP ’02: Proceedings of the ACL-02 conference on Empirical methods in natural language processing, Association for Computational Linguistics, Morristown, NJ, USA, pp 133–139
Metropolis N, Ulam S (1949) The Monte Carlo method. J Am Stat Assoc 44(247): 335–341
Och FJ (2003) Minimum error rate training in statistical machine translation. In: Proceedings of the 41st annual meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Sapporo, Japan, pp 160–167
Papineni K, Roukos S, Ward T, Zhu W-J (2002) Bleu: a method for automatic evaluation of machine translation. In: Proceedings of 40th annual meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Philadelphia, Pennsylvania, USA, pp 311–318
Schraudolph NN (1999) Local gain adaptation in stochastic gradient descent. Technical Report IDSIA-09-99, IDSIA
Smith DA, Eisner J (2006) Minimum risk annealing for training log-linear models. In: Proceedings of the COLING/ACL 2006 main conference poster sessions, Sydney, Australia, pp 787–794
Zens R, Hasan S, Ney H (2007) A systematic comparison of training criteria for statistical machine translation. In: Proceedings of the 2007 joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL), pp 524–532

Author information

Authors and Affiliations

University of Edinburgh, Edinburgh, UK
Abhishek Arun, Barry Haddow, Philipp Koehn & Adam Lopez
University of Maryland, College Park, MD, USA
Chris Dyer
Oxford University Computing Laboratory, Oxford, UK
Phil Blunsom

Corresponding author

Correspondence to Abhishek Arun.

Additional information

This paper extends work presented in Arun et al. (2009).

About this article

Cite this article

Arun, A., Haddow, B., Koehn, P. et al. Monte Carlo techniques for phrase-based translation. Machine Translation 24, 103–121 (2010). https://doi.org/10.1007/s10590-010-9080-7

Received: 02 November 2009
Accepted: 25 May 2010
Published: 24 June 2010
Issue Date: June 2010
DOI: https://doi.org/10.1007/s10590-010-9080-7
