Informatics and Applications

2022, Volume 16, Issue 2, pp 52-59

PRINCIPLES OF DESCRIBING MARKERS OF LOGICAL-SEMANTIC RELATIONS AND THEIR HIERARCHY

  • A. A. Durnovo
  • O. Yu. Inkova
  • N. A. Popkova

Abstract

The article examines annotation strategies in corpora with discourse markup. It is shown that corpora based on Rhetorical Structure Theory (RST) annotate only coherence relations, also called rhetorical relations (RRs). In contrast, the Penn Discourse Treebank (PDTB) of the University of Pennsylvania annotates relation markers, as does the Supracorpora Database of Connectives. The RST Signaling Corpus (RST-SC), also based on RST, is shown to annotate RR markers, but it cannot combine the markup of RRs and their markers in a single annotation. This problem is solved by the GUM corpus and by the Supracorpora Database of Hierarchy of Logical-Semantic Relations. The latter has several advantages: it supports searching, gathering statistics, and forming bilingual annotations. This makes it possible to identify both universal and language-specific phenomena in the discourse organization of text.
