Machine Translation Quality Estimation: Applications and Future Perspectives

  • Lucia Specia
  • Kashif Shah
Chapter
Part of the Machine Translation: Technologies and Applications book series (MATRA, volume 1)

Abstract

Predicting the quality of machine translation (MT) output is a topic that has been attracting significant attention. By automatically distinguishing good-quality from bad-quality translations, quality estimation (QE) has the potential to make MT more useful in a number of applications. In this chapter we review various practical applications where QE at sentence level has shown positive results: filtering low-quality cases out of post-editing, selecting the best MT system when multiple options are available, improving MT performance by selecting additional parallel data, and sampling for quality assurance by humans. Finally, we discuss QE at other levels (word and document) and general challenges in the field, as well as perspectives for novel directions and applications.
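
As a rough illustration of two of the sentence-level applications listed above (filtering before post-editing and choosing among MT systems), the following minimal sketch assumes that some QE model has already assigned each translation a predicted score in [0, 1], where higher means better. The function names, the threshold value, and the example scores are hypothetical and are not taken from the chapter.

```python
from typing import Dict, List, Tuple

# Hypothetical cut-off on a predicted quality score in [0, 1]; higher is better.
POST_EDIT_THRESHOLD = 0.5


def filter_for_post_editing(
    scored: List[Tuple[str, float]], threshold: float = POST_EDIT_THRESHOLD
) -> List[str]:
    """Keep only translations whose predicted quality reaches the threshold."""
    return [text for text, score in scored if score >= threshold]


def select_best_system(candidates: Dict[str, Tuple[str, float]]) -> str:
    """Return the name of the system whose output has the highest predicted score."""
    return max(candidates, key=lambda name: candidates[name][1])


if __name__ == "__main__":
    # Made-up (translation, predicted score) pairs for illustration only.
    scored = [("translation A", 0.82), ("translation B", 0.31), ("translation C", 0.50)]
    print(filter_for_post_editing(scored))

    # Made-up outputs of two hypothetical MT systems for the same source sentence.
    candidates = {
        "system_1": ("output from system 1", 0.44),
        "system_2": ("output from system 2", 0.67),
    }
    print(select_best_system(candidates))
```

In practice the threshold would need to be tuned for the quality label being predicted (for example, post-editing effort), and the scores would come from a trained QE model rather than fixed values.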

Keywords

Translation quality assessment; Principles to practice; Translation errors; Translation models; Post-editing effort; Statistical machine translation; Machine translation system ranking; Machine translation system selection; Quality estimation

Notes

Acknowledgements

We thank Arle Lommel for his help with setting up and running the error annotation for the experiments in Sect. 6.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Science, University of Sheffield, Sheffield, UK
  2. eBay Research, San Jose, USA
