Informatics and Applications

2016, Volume 10, Issue 1, pp 106-118

REPRESENTATION OF CROSS-LINGUAL KNOWLEDGE ABOUT CONNECTORS IN SUPRACORPORA DATABASES

  • I. M. Zatsman
  • O. Yu. Inkova
  • M. G. Kruzhkov
  • N. A. Popkova

Abstract

The article considers "supracorpora databases," which are used in contrastive linguistic studies. Such databases result from processing parallel texts drawn from the bilingual parallel subcorpora of the Russian National Corpus. Each of these parallel texts contains either one original Russian text with one or more translations into a foreign language, or one original text in a foreign language with one translation into Russian. Every source text is aligned with its translation(s) at the sentence level. Supracorpora databases are a new type of linguistic resource designed for goal-oriented discovery of new knowledge about various linguistic units. This knowledge is needed to improve the quality of machine translation, to update monolingual and bilingual grammars, and to modernize a wide range of academic courses in fields such as linguistics and translation studies. The article describes the conceptual foundations of the database and gives an example of how it can be implemented to represent knowledge about Russian connectors and their French translation correspondences.
