Abstract
Automatic metrics for the evaluation of machine translation (MT) compute scores that globally characterize certain aspects of MT quality, such as adequacy and fluency. This paper introduces a reference-based metric focused on a particular class of function words, namely discourse connectives, which are essential to text structuring and notoriously challenging for MT. To measure the accuracy of connective translation (ACT), the metric relies on automatic word-level alignment between a source sentence and, respectively, the reference and candidate translations, together with heuristics for comparing the translations of discourse connectives. Using a dictionary of equivalents, the translations are scored automatically or, for better precision, semi-automatically. The precision of the ACT metric is assessed by human judges on sample data for English/French and English/Arabic translation: the ACT scores are on average within 2% of human scores. The ACT metric is then applied to several commercial and research MT systems, providing an assessment of their performance on discourse connectives.
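The scoring procedure described in the abstract (align the source connective to the reference and candidate translations, then compare the two via a dictionary of equivalents) can be sketched as follows. This is a minimal, hypothetical illustration: the data structures, function names, scoring rule, and the toy English-to-French dictionary entries are assumptions based on the abstract, not the authors' actual implementation.

```python
# Toy dictionary of acceptable target-language equivalents per source
# connective (illustrative English -> French entries only).
EQUIVALENTS = {
    "although": {"bien que", "quoique", "meme si"},
    "while": {"tandis que", "alors que", "pendant que"},
}

def locate_translation(connective, target_tokens, alignment, equivalents):
    """Find the target-side translation of a source connective.

    First follow the word alignment; if the aligned unit is not a known
    equivalent, fall back to scanning the sentence for any dictionary entry.
    `target_tokens` is assumed to be pre-chunked so that multi-word
    connectives ("bien que") appear as single units; `alignment` maps the
    source connective to a target index, if one was found.
    """
    idx = alignment.get(connective)
    if idx is not None and target_tokens[idx] in equivalents:
        return target_tokens[idx]
    for token in target_tokens:  # fallback: dictionary scan of the sentence
        if token in equivalents:
            return token
    return None  # connective apparently left untranslated

def act_score(instances):
    """Fraction of connectives whose candidate translation uses the same
    equivalent as the reference translation (a simplified correctness rule).

    Each instance is a tuple: (source connective, reference tokens,
    reference alignment, candidate tokens, candidate alignment).
    """
    correct = 0
    for src, ref_tokens, ref_align, cand_tokens, cand_align in instances:
        eq = EQUIVALENTS.get(src, set())
        ref_tr = locate_translation(src, ref_tokens, ref_align, eq)
        cand_tr = locate_translation(src, cand_tokens, cand_align, eq)
        if ref_tr is not None and ref_tr == cand_tr:
            correct += 1
    return correct / len(instances) if instances else 0.0
```

In this simplification, a candidate is counted correct only when it uses the reference's equivalent; the paper's semi-automatic mode, which lets a human judge borderline cases (e.g. a different but acceptable equivalent), is not modeled here.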
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Hajlaoui, N., Popescu-Belis, A. (2013). Assessing the Accuracy of Discourse Connective Translations: Validation of an Automatic Metric. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2013. Lecture Notes in Computer Science, vol 7817. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37256-8_20
DOI: https://doi.org/10.1007/978-3-642-37256-8_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37255-1
Online ISBN: 978-3-642-37256-8