Systems and Means of Informatics

2018, Volume 28, Issue 4, pp 168-181

METHOD FOR DESCRIPTION OF MULTIWORD CONNECTIVES IN SUPRACORPORA DATABASES

  • O. Yu. Inkova
  • M. G. Kruzhkov

Abstract

This article presents a new method for describing the structure of multiword connectives, implemented in the Supracorpora database (SCDB) of connectives. The structure of connectives remains underinvestigated, and there are no established criteria for determining the boundaries of connectives and their components. The proposed method is based on a cognitive-semantic approach that treats multiword connectives as more or less free word combinations generated in the process of speech. A two-tier faceted classification is proposed that allows annotating, on the one hand, specific tokens of connectives in texts (context annotation) and, on the other hand, the inner structure of connectives (structural annotation). The structural annotation covers two aspects: the structural type of a connective and its structural components. Building on this annotation method, a system of cross-clusters is implemented that extends the search and statistical capabilities of the SCDB. In addition, the method allows researchers to eliminate subjectivity during the annotation process and to fill some gaps in linguistic knowledge, for example by gathering new data on the combinatorial capabilities of Russian connectives.
