Systems and Means of Informatics
2019, Volume 29, Issue 3, pp 77-91
- A. Yu. Egorova
- I. M. Zatsman
- O. S. Mamonova
The paper considers the task of supporting linguistic studies with supracorpora databases. Such databases contain aligned parallel texts (each comprising an original text and its translation) together with bilingual annotations of the linguistic items under study and their translation equivalents, formed on the basis of the parallel texts. Each annotation, created by a linguist, records a translation model of a linguistic item. Experience from several linguistic projects at the Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences has shown that not all translation models that linguists extract from parallel texts during annotation are described in bilingual dictionaries and handbooks. Supracorpora databases therefore allow researchers to create new knowledge about the translation equivalents of the linguistic items under study. Linguists extract this knowledge by comparing and annotating the sentences of the original text and their translations. The main aim of the paper is to describe the functions of supracorpora databases that provide linguists with new knowledge in the process of annotation.
Institute of Informatics Problems, Russian Academy of Sciences
supracorpora database; linguistic annotation; linguistic unit; corpus linguistics; translation models
A. Yu. Egorova, I. M. Zatsman, and O. S. Mamonova
Institute of Informatics Problems, Federal Research Center "Computer Science and Control", Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
Faculty of Foreign Languages and Area Studies, M. V. Lomonosov Moscow State University, 1 Leninskie Gory, Bldg. 13-14, Moscow 119991, Russian Federation