Application 2: Machine Translation

  • Carlos Ramisch
Chapter
Part of the Theory and Applications of Natural Language Processing book series (NLP)

Abstract

Throughout the previous chapters, we have demonstrated that multiword expressions (MWEs) are a source of errors for machine translation (MT) systems and for human non-native speakers of a language. As Manning and Schütze (1999, p. 184) point out, “a nice way to test whether a combination is a collocation [MWE] is to translate it into another language. If we cannot translate the combination word by word, then there is evidence that we are dealing with a collocation”. In Sect. 2.3.2, we argue that the impossibility of translating MWEs word for word is a consequence of their limited syntactic and semantic compositionality. Adequate solutions for the variable syntactic and semantic fixedness of MWEs are not easy to find, especially in the context of statistical MT models. However, high-quality MT requires detecting MWEs, disambiguating them semantically, and treating them appropriately in order to avoid generating unnatural translations or losing information.

Keywords

Machine Translation, Translation Model, Statistical Machine Translation, English Sentence, Parallel Corpus
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Banerjee S, Lavie A (2005) METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. In: Proceedings of the ACL 2005 workshop on intrinsic and extrinsic evaluation measures for MT and/or summarization, Ann Arbor. Association for Computational Linguistics, pp 65–72. http://www.aclweb.org/anthology/W/W05/W05-0909
  2. Bolinger D (1971) The phrasal verb in English. Harvard University Press, Harvard, 187p
  3. Briscoe T, Carroll J, Watson R (2006) The second release of the RASP system. In: Curran J (ed) Proceedings of the COLING/ACL 2006 interactive presentation sessions, Association for Computational Linguistics, Sydney, pp 77–80. http://www.aclweb.org/anthology/P/P06/P06-4020
  4. Carpuat M, Diab M (2010) Task-based evaluation of multiword expressions: a pilot study in statistical machine translation. In: Proceedings of human language technology: the 2010 annual conference of the North American chapter of the Association for Computational Linguistics (NAACL 2010), Los Angeles. Association for Computational Linguistics, pp 242–245. http://www.aclweb.org/anthology/N10-1029
  5. Cettolo M, Girardi C, Federico M (2012) WIT3: web inventory of transcribed and translated talks. In: Proceedings of the 16th conference of the European association for machine translation (EAMT), Trento, pp 261–268
  6. Fraser B (1976) The verb-particle combination in English. Academic, New York
  7. Gale WA, Church K (1993) A program for aligning sentences in bilingual corpora. Comput Linguist 19(1):75–102
  8. Knight K (1999) Decoding complexity in word-replacement translation models. Comput Linguist 25(4):607–615
  9. Knight K, Koehn P (2003) What’s new in statistical machine translation. In: Proceedings of the 2003 conference of the North American chapter of the Association for Computational Linguistics on human language technology (NAACL 2003), Edmonton. Association for Computational Linguistics, p 5
  10. Koehn P (2010) Statistical machine translation. Cambridge University Press, Cambridge, 488p
  11. Koehn P, Och FJ, Marcu D (2003) Statistical phrase-based translation. In: Proceedings of the 2003 conference of the North American chapter of the Association for Computational Linguistics on human language technology (NAACL 2003), Edmonton. Association for Computational Linguistics, pp 48–54
  12. Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: open source toolkit for statistical machine translation. In: Proceedings of the 45th annual meeting of the Association for Computational Linguistics (ACL 2007), Prague. Association for Computational Linguistics, pp 177–180
  13. Lohse B, Hawkins JA, Wasow T (2004) Domain minimization in English verb-particle constructions. Language 80(2):238–261
  14. Lopez A (2008) Statistical machine translation. ACM Comput Surv 40(3):1–49
  15. Manning CD, Schütze H (1999) Foundations of statistical natural language processing. MIT, Cambridge, 620p
  16. Och FJ, Ney H (2000) Improved statistical alignment models. In: Proceedings of the 38th annual meeting of the Association for Computational Linguistics (ACL 2000), Hong Kong. Association for Computational Linguistics, pp 440–447
  17. Och FJ, Ney H (2003) A systematic comparison of various statistical alignment models. Comput Linguist 29(1):19–51
  18. Och FJ, Ney H (2004) The alignment template approach to statistical machine translation. Comput Linguist 30(4):417–449
  19. Papineni K, Roukos S, Ward T, Zhu W (2002) BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the Association for Computational Linguistics (ACL 2002), Philadelphia. Association for Computational Linguistics, pp 311–318
  20. Ramisch C, Besacier L, Kobzar O (2013) How hard is it to automatically translate phrasal verbs from English to French? In: Mitkov R, Monti J, Pastor GC, Seretan V (eds) Proceedings of the MT summit 2013 workshop on multi-word units in machine translation and translation technology (MUMTTT 2013), Nice, pp 53–61. http://www.mtsummit2013.info/workshop4.asp
  21. Shinozaki T, Ostendorf M (2008) Cross-validation and aggregated EM training for robust parameter estimation. Comput Speech Lang 22(2):185–195
  22. Sinclair J (ed) (1989) Collins COBUILD dictionary of phrasal verbs. Collins COBUILD, London, 512p
  23. Snover M, Dorr BJ, Schwartz R, Micciulla L, Makhoul J (2006) A study of translation edit rate with targeted human annotation. In: Proceedings of the 7th conference of the Association for Machine Translation in the Americas, Cambridge. Association for Machine Translation in the Americas, pp 223–231
  24. Stolcke A (2002) SRILM – an extensible language modeling toolkit. In: Hansen JHL, Pellom B (eds) Proceedings of the seventh international conference on spoken language processing, third INTERSPEECH event (ICSLP 2001 – INTERSPEECH 2002), Denver. International Speech Communication Association, pp 901–904
  25. Stymne S (2009) A comparison of merging strategies for translation of German compounds. In: Proceedings of the student research workshop at EACL 2009, Athens, pp 61–69
  26. Stymne S (2011a) Blast: a tool for error analysis of machine translation output. In: Proceedings of the ACL 2011 system demonstrations, Portland. Association for Computational Linguistics, pp 56–61. http://www.aclweb.org/anthology/P11-4010
  27. Stymne S (2011b) Pre- and postprocessing for statistical machine translation into Germanic languages. In: Proceedings of the ACL 2011 student research workshop, Portland. Association for Computational Linguistics, pp 12–17. http://www.aclweb.org/anthology/P11-3003
  28. Tillmann C, Ney H (2003) Word reordering and a dynamic programming beam search algorithm for statistical machine translation. Comput Linguist 29(1):97–133

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Carlos Ramisch (1)
  1. Aix Marseille University, Marseille, France
