Machine Translation, Volume 31, Issue 4, pp 163–185

Segment-based interactive-predictive machine translation

  • Miguel Domingo
  • Álvaro Peris
  • Francisco Casacuberta

Abstract

Machine translation systems require human revision to obtain high-quality translations. Interactive methods provide efficient human–computer collaboration, notably increasing productivity. Recently, new interactive protocols have been proposed that seek a more effective user interaction with the system. In this work, we present one of these new protocols, which allows the user to validate all correct word sequences in a translation hypothesis. This breaks the left-to-right barrier imposed by most existing protocols. We compare this protocol against the classical prefix-based approach and obtain a significant reduction of the user effort in a simulated environment. Additionally, we experiment with using confidence measures to select the word the user should correct at each iteration, and conclude that the order in which words are corrected does not affect the overall effort.
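
To make the idea of validating correct word sequences concrete, the following is a minimal sketch of how a simulated user might pick out the segments of a hypothesis that also occur, in the same order, in a reference translation. It is an illustration under stated assumptions, not the method used in the article: the function name `validated_segments` is hypothetical, whitespace tokenization is assumed, and the matching relies on Python's `difflib.SequenceMatcher`, which finds contiguous matching blocks rather than a true longest common subsequence.

```python
from difflib import SequenceMatcher


def validated_segments(hypothesis, reference):
    """Return the word segments of `hypothesis` that also appear, in order, in `reference`.

    Hypothetical helper for illustration only: a simulated user is assumed
    to validate every contiguous run of words shared with the reference.
    """
    hyp = hypothesis.split()  # assumption: whitespace tokenization
    ref = reference.split()
    matcher = SequenceMatcher(a=hyp, b=ref, autojunk=False)
    segments = []
    for block in matcher.get_matching_blocks():
        # The last block returned is always a zero-length sentinel; skip it.
        if block.size > 0:
            segments.append(hyp[block.a:block.a + block.size])
    return segments


if __name__ == "__main__":
    hyp = "we present a new protocol here"
    ref = "we present one new protocol today"
    # Segments need not form a prefix, so validation is not left-to-right.
    print(validated_segments(hyp, ref))
```

In a prefix-based protocol, only the first of these segments could be kept after the first correction; in a segment-based protocol all of them would be retained while the system regenerates the words in between.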

Keywords

  • Machine translation
  • Computer-assisted translation
  • Interactive-predictive machine translation

Notes

Acknowledgements

The research leading to these results has received funding from the Ministerio de Economía y Competitividad (MINECO) under Project CoMUN-HaT (Grant Agreement TIN2015-70924-C2-1-R), and from the Generalitat Valenciana under Project ALMAMATER (Grant Agreement PROMETEOII/2014/030).

Copyright information

© Springer Science+Business Media B.V., part of Springer Nature 2018

Authors and Affiliations

  1. Pattern Recognition and Human Language Technology Research Center, Universitat Politècnica de València, Valencia, Spain
