Post-editing neural machine translation versus translation memory segments

  • Pilar Sánchez-Gijón
  • Joss Moorkens
  • Andy Way


The use of neural machine translation (NMT) in a professional scenario implies a number of challenges, despite growing evidence that, in language combinations such as English to Spanish, NMT output quality has already surpassed that of statistical machine translation in terms of automatic metric scores. This article presents the results of an empirical test that aims to shed light on the differences between NMT post-editing and translation with the aid of a translation memory (TM). The results show that NMT segments require less editing than TM segments, but that this editing appears to take more time, with the consequence that NMT post-editing does not seem to improve productivity as might have been expected. This may be because NMT segments show greater variability, both in quality and in the time invested in post-editing, than TM segments, which are 'more similar' on average. Finally, the results show that translators who perceive that NMT boosts their productivity actually performed faster than those who perceive that NMT slows them down.


Keywords: Neural machine translation · Translation memory · Translation quality perception · MT acceptance · Translation productivity



This work has been supported by the ProjecTA-U project, Grant Number FFI2016-78612-R (MINECO/FEDER/UE), and by the ADAPT Centre for Digital Content Technology which is funded under the SFI Research Centres Programme (Grant 13/RC/2016) and is co-funded under the European Regional Development Fund.



Copyright information

© Springer Nature B.V. 2019

Authors and Affiliations

  1. Grup Tradumàtica, DTIEAO, Universitat Autònoma de Barcelona, Barcelona, Spain
  2. ADAPT Centre, School of Applied Language and Intercultural Studies, Dublin City University, Dublin, Ireland
  3. ADAPT Centre, School of Computing, Dublin City University, Dublin, Ireland
