Automatic and Human Evaluation on English-Croatian Legislative Test Set

Brkić, Marija; Seljan, Sanja; Vičić, Tomislav

doi:10.1007/978-3-642-37256-8_26

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7817)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

This paper presents work on the manual and automatic evaluation of Google Translate, a freely available online machine translation (MT) service, for the English-Croatian language pair in the legislative and general domains. The experimental study is conducted on a test set of 200 sentences in total. Human evaluation is performed by native speakers using the criteria of fluency and adequacy, and it is complemented by error analysis. Automatic evaluation is performed against a single reference set using the following metrics: BLEU, NIST, F-measure and WER. The influence of lowercasing, tokenization and punctuation is discussed. Pearson’s correlation between the automatic metrics is given, as well as the correlation between the two human criteria, fluency and adequacy, and the automatic metrics.
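As a rough illustration of two of the quantities named in the abstract, the sketch below computes word error rate (WER) against a single reference and Pearson's correlation between two score lists. It is a minimal sketch under stated assumptions, not the authors' evaluation setup: the function names `preprocess`, `wer` and `pearson`, the whitespace tokenization, the ASCII-only punctuation handling and the example sentence pair are all hypothetical choices made for this example.

```python
import math
import string


def preprocess(sentence, lowercase=True, strip_punct=True):
    """Whitespace-tokenize a sentence (an assumed, simplified tokenizer).

    Optionally lowercases the text and strips ASCII punctuation from the
    edges of each token, dropping tokens that become empty.
    """
    if lowercase:
        sentence = sentence.lower()
    tokens = sentence.split()
    if strip_punct:
        tokens = [t.strip(string.punctuation) for t in tokens]
        tokens = [t for t in tokens if t]
    return tokens


def wer(hypothesis, reference):
    """Word error rate: word-level edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must contain at least one token")
    # prev[j] holds the distance between the reference prefix processed so far
    # and the first j hypothesis tokens.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(prev[j] + 1,          # reference token missing from hypothesis
                            curr[j - 1] + 1,      # extra token in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (dev_x * dev_y)


# Made-up sentence pair for illustration only; not taken from the test set.
reference = "Zakon stupa na snagu osmog dana."
hypothesis = "zakon stupa na snagu osmi dan"

raw_wer = wer(preprocess(hypothesis, lowercase=False, strip_punct=False),
              preprocess(reference, lowercase=False, strip_punct=False))
normalized_wer = wer(preprocess(hypothesis), preprocess(reference))
print(raw_wer, normalized_wer)
```

Comparing `raw_wer` with `normalized_wer` shows why lowercasing and punctuation handling matter: a casing difference or a sentence-final period attached to a word counts as a full word error unless it is normalized away before scoring.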

Author information

Authors and Affiliations

Department of Informatics, University of Rijeka, Radmile Matejčić 2, 51000, Rijeka, Croatia
Marija Brkić
Department of Information Sciences, Faculty of Humanities and Social Sciences, Ivana Lučića 3, 10000, Zagreb, Croatia
Sanja Seljan
Freelance translator, Croatia
Tomislav Vičić

Editor information

Editors and Affiliations

Center for Computing Research, National Polytechnic Institute, Mexico D.F., Mexico
Alexander Gelbukh

About this paper

Cite this paper

Brkić, M., Seljan, S., Vičić, T. (2013). Automatic and Human Evaluation on English-Croatian Legislative Test Set. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2013. Lecture Notes in Computer Science, vol 7817. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37256-8_26

DOI: https://doi.org/10.1007/978-3-642-37256-8_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37255-1
Online ISBN: 978-3-642-37256-8
eBook Packages: Computer Science, Computer Science (R0)