Segment-based interactive-predictive machine translation

Domingo, Miguel; Peris, Álvaro; Casacuberta, Francisco

doi:10.1007/s10590-017-9213-3

Segment-based interactive-predictive machine translation

Published: 05 January 2018

Volume 31, pages 163–185, (2017)
Cite this article

Machine Translation

1053 Accesses
10 Citations
2 Altmetric
Explore all metrics

Abstract

Machine translation systems require human revision to obtain high-quality translations. Interactive methods provide an efficient human–computer collaboration, notably increasing productivity. Recently, new interactive protocols have been proposed, seeking for a more effective user interaction with the system. In this work, we present one of these new protocols, which allows the user to validate all correct word sequences in a translation hypothesis. Thus, the left-to-right barrier from most of the existing protocols is broken. We compare this protocol against the classical prefix-based approach, obtaining a significant reduction of the user effort in a simulated environment. Additionally, we experiment with the use of confidence measures to select the word the user should correct at each iteration, reaching the conclusion that the order in which words are corrected does not affect the overall effort.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

A survey on large language model based autonomous agents

Article Open access 22 March 2024

Natural Language Processing

The Use of Artificial Intelligence in Writing Scientific Review Articles

Article Open access 16 January 2024

Notes

http://www.statmt.org/wmt14/medical-task/.
https://wit3.fbk.eu/mt.php?release=2013-01.
http://www.statmt.org/wmt15/translation-task.html.
The partition selected as development was news-test2013.
One mouse action is enough for selecting or deleting a one-word segment: the user would simply click on the word.
Tested on a machine with an Intel i5 CPU at 3.1 GHz.

References

Alabau V, Bonk R, Buck C, Carl M, Casacuberta F, García-Martínez M, González-Rubio J, Koehn P, Leiva LA, Mesa-Lao B, Ortiz-Martínez D, Saint-Amand H, Sanchis-Trilles G, Tsoukala C (2013) CASMACAT: an open source workbench for advanced computer aided translation. Prague Bull Math Linguist 100:101–112
Article Google Scholar
Alabau V, Rodríguez-Ruiz L, Sanchis A, Martínez-Gómez P, Casacuberta F (2011) On multimodal interactive machine translation using speech recognition. In: Proceedings of the International Conference on Multimodal Interaction, pp 129–136
Alabau V, Sanchis A, Casacuberta F (2014) Improving on-line handwritten recognition in interactive machine translation. Pattern Recognit 47(3):1217–1228
Article Google Scholar
Apostolico A, Guerra C (1987) The longest common subsequence problem revisited. Algorithmica 2:315–336
Article MathSciNet MATH Google Scholar
Azadi F, Khadivi S (2015) Improved search strategy for interactive machine translation in computer assisted translation. In: Proceedings of Machine Translation Summit XV, pp 319–332
Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: Proceedings of the International Conference on Learning Representations. arXiv:1409.0473
Barrachina S, Bender O, Casacuberta F, Civera J, Cubel E, Khadivi S, Lagarda A, Ney H, Tomás J, Vidal E, Vilar J-M (2009) Statistical approaches to computer-assisted translation. Comput Linguist 35:3–28
Article MathSciNet Google Scholar
Brown PF, Pietra VJD, Pietra SAD, Mercer RL (1993) The mathematics of statistical machine translation: parameter estimation. Comput Linguist 19(2):263–311
Google Scholar
Chen SF, Goodman J (1996) An empirical study of smoothing techniques for language modeling. In: Proceedings of the Annual Meeting on Association for Computational Linguistics, pp 310–318
Cheng S, Huang S, Chen H, Dai X, Chen J (2016) PRIMT: a pick-revise framework for interactive machine translation. In: Proceedings of the North American Chapter of the Association for Computational Linguistics, pp 1240–1249
Dale R (2016) How to make money in the translation business. Nat Lang Eng 22(2):321–325
Article Google Scholar
Domingo M, Peris, Á, Casacuberta F (2016) Interactive-predictive translation based on multiple word-segments. In: Proceedings of the Annual Conference of the European Association for Machine Translation, pp 282–291
Federico M, Bentivogli L, Paul M, Stüker S (2011) Overview of the IWSLT 2011 evaluation campaign. In: International Workshop on Spoken Language Translation, pp 11–27
Foster G, Isabelle P, Plamondon P (1997) Target-text mediated interactive machine translation. Mach Transl 12:175–194
Article Google Scholar
González-Rubio J, Benedí J-M, Casacuberta F (2016) Beyond prefix-based interactive translation prediction. In: Proceedings of the SIGNLL Conference on Computational Natural Language Learning, pp 198–207
González-Rubio J, Ortiz-Martínez D, Casacuberta F (2010) On the use of confidence measures within an interactive-predictive machine translation system. In: Proceedings of the Annual Conference of the European Association for Machine Translation
Knowles R, Koehn P (2016) Neural interactive translation prediction. In: Proceedings of the Association for Machine Translation in the Americas, pp 107–120
Koehn P (2005) Europarl: a parallel corpus for statistical machine translation. In: Proceedings of the Machine Translation Summit, pp 79–86
Koehn P (2010) Statistical machine translation. Cambridge University Press, Cambridge
MATH Google Scholar
Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: open source toolkit for statistical machine translation. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics, pp 177–180
Koehn P, Och FJ, Marcu D (2003) Statistical phrase-based translation. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, pp 48–54
Koehn P, Tsoukala C, Saint-Amand H (2014) Refinements to interactive translation prediction based on search graphs. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics, pp 574–578
Marie B, Max A (2015) Touch-based pre-post-editing of machine translation output. In: Proceedings of the conference on empirical methods in natural language processing, pp 1040–1045
Nepveu L, Lapalme G, Langlais P, Foster G (2004) Adaptive language and translation models for interactive machine translation. In: Proceedings of the conference on empirical method in natural language processing, pp 190–197
Nielsen J (1993) Usability engineering. Morgan Kaufmann Publishers Inc, Burlington
MATH Google Scholar
Och F J (2003) Minimum error rate training in statistical machine translation. In: Proceedings of the annual meeting of the association for computational linguistics, pp 160–167
Och FJ, Ney H (2002) Discriminative training and maximum entropy models for statistical machine translation. In: Proceedings of the annual meeting of the association for computational linguistics, pp 295–302
Och FJ, Ney H (2003) A systematic comparison of various statistical alignment models. Comput Linguist 29(1):19–51
Article MATH Google Scholar
Ortiz-Martínez D (2016) Online learning for statistical machine translation. Comput Linguist 42(1):121–161
Article MathSciNet Google Scholar
Papineni K, Roukos S, Ward T, Zhu W-J (2002) BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the annual meeting of the association for computational linguistics, pp 311–318
Peris Á, Domingo M, Casacuberta F (2017) Interactive neural machine translation. Comput Speech Lang. 45:201–220
Article Google Scholar
Sanchis-Trilles G, Ortiz-Martínez D, Civera J, Casacuberta F, Vidal E, Hoang H (2008) Improving interactive machine translation via mouse actions. In: Proceedings of the conference on empirical methods in natural language processing, pp 485–494
Snover M, Dorr B, Schwartz R, Micciulla L, Makhoul J (2006) A study of translation edit rate with targeted human annotation. In: Proceedings of the Association for Machine Translation in the Americas, pp 223–231
Stolcke A (2002) SRILM—an extensible language modeling toolkit. In: Proceedings of the international conference on spoken language processing, pp 257–286
Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. NIPS 27:3104–3112
Google Scholar
Tiedemann J (2009) News from OPUS—a collection of multilingual parallel corpora with tools and interfaces. Recent Adv Nat Lang Process 5:237–248
Article Google Scholar
Tomás J, Casacuberta F(2006) Statistical phrase-based models for interactive computer-assisted translation. In: Proceedings of the international conference on computational linguistics/Association for Computational Linguistics, pp 835–841
Torregrosa D, Forcada ML, Pérez-Ortiz JA (2014) An open-source web-based tool for resource-agnostic interactive translation prediction. Prague Bull Math Linguist 102:69–80
Google Scholar
Tseng H, Chang P, Andrew G, Jurafsky D, Manning C (2005) A conditional random field word segmenter. In: Proceedings of the special interest group of the association for computational linguistics workshop on Chinese language processing, pp 168–171
Ueffing N, Ney H (2005) Application of word-level confidence measures in interactive statistical machine translation. In: Proceedings of the European Association for Machine Translation, pp 262–270
Vogel S, Ney H, Tillmann C (1996) HMM-based word alignment in statistical translation. Proc Conf Comput Linguist 2:836–841
Google Scholar
Wuebker J, Green S, DeNero J, Hasan S, Luong M-T(2016) Models and inference for prefix-constrained machine translation. In: Proceedings of the annual meeting of the association for the computational linguistics, pp 66–75
Zens R, Och FJ, Ney H (2002) Phrase-based statistical machine translation. In: Proceedings of the annual German conference on advances in artificial intelligence 2479:18–32

Download references

Acknowledgements

The research leading to these results has received funding from the Ministerio de Economía y Competitividad (MINECO) under Project CoMUN-HaT (Grant Agreement TIN2015-70924-C2-1-R), and Generalitat Valenciana under Project ALMAMATER (Ggrant Agreement PROMETEOII/2014/030).

Author information

Authors and Affiliations

Pattern Recognition and Human Language Technology Research Center, Universitat Politècnica de València, Camino de Vera s/n, 46022, Valencia, Spain
Miguel Domingo, Álvaro Peris & Francisco Casacuberta

Authors

Miguel Domingo
View author publications
You can also search for this author in PubMed Google Scholar
Álvaro Peris
View author publications
You can also search for this author in PubMed Google Scholar
Francisco Casacuberta
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Miguel Domingo.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Domingo, M., Peris, Á. & Casacuberta, F. Segment-based interactive-predictive machine translation. Machine Translation 31, 163–185 (2017). https://doi.org/10.1007/s10590-017-9213-3

Download citation

Received: 20 March 2017
Accepted: 14 December 2017
Published: 05 January 2018
Issue Date: December 2017
DOI: https://doi.org/10.1007/s10590-017-9213-3

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Segment-based interactive-predictive machine translation

Abstract

Access this article

Similar content being viewed by others

A survey on large language model based autonomous agents

Natural Language Processing

The Use of Artificial Intelligence in Writing Scientific Review Articles

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Segment-based interactive-predictive machine translation

Abstract

Access this article

Similar content being viewed by others

A survey on large language model based autonomous agents

Natural Language Processing

The Use of Artificial Intelligence in Writing Scientific Review Articles

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation