Systems and Means of Informatics
2019, Volume 29, Issue 3, pp 92-103
MACHINE TRANSLATION ERRORS: PROBLEMS OF CLASSIFICATION
- A. A. Goncharov
- N. V. Buntman
- V. A. Nuriev
Abstract
The paper considers the problems of classifying machine translation errors. The first part reviews approaches to evaluating machine translation quality and to classifying the errors that machine translation systems tend to make. The second part describes an original, targeted taxonomy of machine translation errors, devised specifically to classify the errors central to the translation of connectives from Russian into French. To date, there have been no such studies for this language pair. The proposed classification includes two groups of errors: (i) grammatical/lexical errors in the translation of the text chunk where a given connective occurs; and (ii) errors in the translation of the connective itself. The study uses a parallel Russian-French corpus that stores Russian source texts and their reference translations into French, made by professional human translators. The corpus totals 300 thousand sentences (about 4 million words). The source texts in which connectives occur have been used to generate machine translations with two automated systems.
About this article
Cover Date
2019-10-30
DOI
10.14357/08696527190308
Print ISSN
0869-6527
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
classification; machine translation; quality of machine translation; machine translation errors
Authors
A. A. Goncharov, N. V. Buntman, and V. A. Nuriev
Author Affiliations
Institute of Informatics Problems, Federal Research Center "Computer Science and Control", Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
M.V. Lomonosov Moscow State University, GSP-1, Leninskie Gory, Moscow 119991, Russian Federation