Systems and Means of Informatics

2024, Volume 34, Issue 4, pp 73-84

DEVELOPING THE STRUCTURE OF SUPRACORPORA DATABASES

  • A. A. Goncharov

Abstract

The paper presents methods for developing the structure of supracorpora databases in order to represent the results of parallel text analysis in more detail. The initial data structure for annotation is described. These methods make it possible (i) to mark up text blocks in the original and the translation in more detail; (ii) to classify the features of a text block using multiple facets; (iii) to store data about the lexical markers of text block features; and (iv) to store data about the irrelevance of text fragment pairs to a search query. Together, these capabilities improve the quality of the final data in terms of completeness and consistency, and the corresponding changes can make the data structure more flexible. The proposed changes to the data structure are independent of the goals and objectives of any specific study that may be conducted using supracorpora databases.
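The four capabilities listed in the abstract can be illustrated with a minimal sketch. It is not the schema proposed in the paper: the class names, field names, facet labels, query identifiers, and the example sentence pair are all hypothetical and chosen only to show how such data might be organized.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TextBlock:
    """A marked-up span inside one side of a fragment pair (hypothetical model)."""
    text: str
    start: int  # character offset of the block in its fragment
    end: int    # exclusive end offset
    facets: dict[str, str] = field(default_factory=dict)  # (ii) multi-facet classification
    markers: list[str] = field(default_factory=list)      # (iii) lexical markers of features


@dataclass
class FragmentPair:
    """An original fragment aligned with its translation (hypothetical model)."""
    original: str
    translation: str
    original_blocks: list[TextBlock] = field(default_factory=list)     # (i) detailed markup
    translation_blocks: list[TextBlock] = field(default_factory=list)  # (i) detailed markup
    irrelevant_to: set[str] = field(default_factory=set)               # (iv) irrelevance to queries

    def is_relevant(self, query_id: str) -> bool:
        """A pair is relevant unless it was explicitly marked irrelevant to the query."""
        return query_id not in self.irrelevant_to


def make_block(source: str, phrase: str, facets: dict[str, str], markers: list[str]) -> TextBlock:
    """Locate `phrase` in `source` and wrap it as an annotated block."""
    start = source.index(phrase)  # raises ValueError if the phrase is absent
    return TextBlock(phrase, start, start + len(phrase), dict(facets), list(markers))


# Invented example pair; the facet names and values are assumptions.
pair = FragmentPair(
    original="He left because it was late.",
    translation="Il est parti parce qu'il était tard.",
)
pair.original_blocks.append(
    make_block(pair.original, "because it was late",
               {"relation": "cause", "position": "postposed"}, ["because"])
)
pair.translation_blocks.append(
    make_block(pair.translation, "parce qu'il était tard",
               {"relation": "cause", "position": "postposed"}, ["parce qu'"])
)
pair.irrelevant_to.add("query-concession")  # hypothetical query identifier
```

Keeping facets in a key-value mapping rather than in fixed fields is one way to keep such a structure independent of the goals of a particular study, since new facets can be added without changing the model.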
