Systems and Means of Informatics
2021, Volume 31, Issue 2, pp 139-151
INDICATOR-BASED EVALUATION OF MACHINE TRANSLATION INSTABILITY
- A. Yu. Egorova
- I. M. Zatsman
- M. G. Kruzhkov
- V. A. Nuriev
Abstract
The paper presents data collected by tracking the performance of a neural machine translation (NMT) engine, together with the results of a translation error analysis. An indicator-based evaluation of NMT instability was carried out as part of an experiment involving 250 Russian text fragments. Each month for one year, these fragments were translated into French using the Google Translate NMT engine. The translations were recorded and annotated in a supracorpora database. This procedure produced a series of 12 annotated translations for each of the 250 Russian fragments. The annotations record not only the types of translation errors identified by language experts but also the types of NMT instability, which indicate whether and how translation quality changes over time. The paper aims to provide a comparative analysis of Google Translate performance that takes temporal variation into account.
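The data layout described in the abstract (one series of 12 monthly translations per fragment, each annotated with error types) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the class `MonthlyTranslation`, the function `summarize_series`, and the summary labels are hypothetical and are not the paper's database schema or its instability types, which the abstract does not enumerate.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MonthlyTranslation:
    """One recorded translation of a fragment (hypothetical structure)."""
    month: int                                            # position in the series, 1..12
    text: str                                             # French output recorded that month
    error_types: list[str] = field(default_factory=list)  # expert error annotations


def summarize_series(series: list[MonthlyTranslation]) -> dict:
    """Summarize how one fragment's translation changed across the series.

    The labels returned are illustrative placeholders, not the
    instability types defined in the paper.
    """
    if not series:
        raise ValueError("series must contain at least one translation")
    ordered = sorted(series, key=lambda t: t.month)
    # Count the months whose output text differs from the previous month.
    text_changes = sum(
        1 for prev, cur in zip(ordered, ordered[1:]) if prev.text != cur.text
    )
    error_counts = [len(t.error_types) for t in ordered]
    first, last = error_counts[0], error_counts[-1]
    if text_changes == 0:
        label = "unchanged"
    elif last < first:
        label = "fewer errors at end"
    elif last > first:
        label = "more errors at end"
    else:
        label = "changed, same error count"
    return {
        "text_changes": text_changes,
        "error_counts": error_counts,
        "label": label,
    }


if __name__ == "__main__":
    # Toy three-month series with made-up texts and error labels.
    demo = [
        MonthlyTranslation(1, "version A", ["lexical"]),
        MonthlyTranslation(2, "version A", ["lexical"]),
        MonthlyTranslation(3, "version B", []),
    ]
    print(summarize_series(demo))
```

Comparing only the first and last months is a deliberate simplification; a fuller indicator would presumably look at every month-to-month transition in the series.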
About this article
Cover Date
2021-05-20
DOI
10.14357/08696527210213
Print ISSN
0869-6527
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
neural machine translation (NMT); instability of machine translation; supracorpora database; indicator-based evaluation; linguistic annotation; NMT instability types
Author Affiliations
Institute of Informatics Problems, Federal Research Center "Computer Science and Control", Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation