Systems and Means of Informatics
2017, Volume 27, Issue 4, pp 164-176
- I. M. Zatsman
- M. G. Kruzhkov
- E. Ju. Loshchilova
This paper examines methods for frequency analysis of Russian connectives, including analysis of their translation models in Russian-French parallel texts.
The parallel texts are integrated into a supracorpora database (SCDB), which also includes bilingual annotations of translation correspondences. Each annotation records properties of the examined linguistic item (a Russian connective) together with properties of the corresponding linguistic item found in the translation. These properties are organized in the SCDB as a faceted classification that describes the translation models from various perspectives. A characteristic feature of the frequency analysis methods implemented in the SCDB is the reversibility of the calculated statistical data: each calculated frequency value acts as a hyperlink to the list of annotations it is based on, and these annotations represent occurrences of the corresponding connectives in the parallel texts of the SCDB. The use of faceted classifications allows for multidimensional statistical analysis of the annotated connectives and translation models. The calculated statistical data are therefore verifiable, since any given value can be traced directly to the annotations it is based on. The main goal of this paper is to describe methods for frequency analysis of connective translation models, including those that support the reversibility of the calculated statistical data at different levels of generalization.
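The idea of reversible frequency data can be illustrated with a minimal Python sketch. It is not the SCDB implementation: the `Annotation` fields, the `frequency_table` helper, and the four toy annotations are all hypothetical and invented for illustration. The sketch only shows the principle that each frequency value is stored together with the identifiers of the annotations it was computed from, so that a count at any level of generalization can be traced back to its source annotations.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    # Hypothetical fields; the actual SCDB schema is not described here.
    ann_id: int
    connective: str    # Russian connective
    translation: str   # corresponding item in the French translation
    facets: tuple      # (facet name, value) pairs from a faceted classification


def frequency_table(annotations, key):
    """Group annotation ids by key(annotation).

    The frequency of a group is len(ids); keeping the ids themselves
    is what makes the count traceable back to its annotations.
    """
    groups = defaultdict(list)
    for ann in annotations:
        groups[key(ann)].append(ann.ann_id)
    return dict(groups)


# Toy data invented for this sketch, not taken from the SCDB.
ANNOTATIONS = [
    Annotation(1, "но", "mais", (("position", "initial"),)),
    Annotation(2, "но", "mais", (("position", "medial"),)),
    Annotation(3, "но", "cependant", (("position", "initial"),)),
    Annotation(4, "однако", "cependant", (("position", "initial"),)),
]

# Lower generalization level: one group per (connective, translation) pair.
by_model = frequency_table(ANNOTATIONS, key=lambda a: (a.connective, a.translation))

# Higher generalization level: one group per connective.
by_connective = frequency_table(ANNOTATIONS, key=lambda a: a.connective)

for group, ids in by_model.items():
    print(group, "frequency:", len(ids), "annotations:", ids)
```

Grouping by a different key function, for example a facet value, gives another dimension of the same statistics without losing the link from each count to its annotations.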
Key words
supracorpora database; translation models; annotation of translation models; faceted classifications; corpus linguistics; generalization; reversibility of generalization process
I. M. Zatsman, M. G. Kruzhkov, and E. Ju. Loshchilova
Institute of Informatics Problems, Federal Research Center "Computer Science and Control", Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation