In recent decades, problems of language specificity in Russian have attracted considerable attention from researchers, although until recently they had not been thoroughly examined using corpus-based methods. This paper presents a new method for investigating the language specificity of Russian connectives, based on statistical analysis of annotated parallel texts. Russian-French and French-Russian parallel texts are processed using the Supracorpora Database (SCDB) of Connectives, which is designed specifically for annotating translation correspondences (TCs) found in parallel texts. Each TC includes annotations of a Russian connective and its translation equivalent (TE), which makes it possible to obtain statistical data on various translation models (TMs) according to several proposed parameters of the language specificity of connectives. As an example, the language specificity of two Russian connectives, или and а то, is examined. Using the proposed statistical parameters, the paper demonstrates that или has a very low degree of language specificity in the Russian-French language pair, whereas а то is a highly language-specific connective. The results are applicable to informatics (machine translation and statistical analysis of textual data) and to the comparative study of languages, including lexical typology, lexicography, and the theory and practice of translation.
Keywords
supracorpora databases; statistical analysis; contrastive corpus analysis; language specificity; parallel corpora; linguistic information resources; connectives; discourse relations; semantics
 Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation