Systems and Means of Informatics
2018, Volume 28, Issue 4, pp 156-167
- I. M. Zatsman
- M. G. Kruzhkov
This article examines the design-oriented evolution of the term system for supracorpora databases (SCDBs), which represent a new category of information resources in linguistics. An SCDB is based on parallel texts, i.e., texts placed alongside their translations and aligned with them at the sentence level. Although SCDBs are designed for annotation of a wide variety of linguistic items and their correspondences, this article specifically considers annotation of connectives. The annotation-centered design of SCDBs has led to the emergence of new entities and notions in computational linguistics, and at the beginning of 2017, a custom term system was proposed for them. On the one hand, linguists use the proposed terms to describe new knowledge generated by annotating and investigating linguistic units. On the other hand, these terms serve as a basis for the design of the SCDB architecture and the associated dataware, lingware, and software. Since the terminology was first described, the range of tasks accomplished with SCDBs has expanded significantly; hence, there is a need to develop the initial design-oriented term system further.
Institute of Informatics Problems, Russian Academy of Sciences
Key words
supracorpora databases; term systems; annotation of linguistic units; parallel texts; corpus linguistics; connectives
Author Affiliations
Institute of Informatics Problems, Federal Research Center "Computer Science and Control", Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation