Systems and Means of Informatics
2022, Volume 32, Issue 1, pp 114-125
- A. A. Durnovo
- O. Yu. Inkova
- N. A. Popkova
The paper presents the architecture of a new linguistic resource: the Supracorpora database, which reflects hierarchies of the logical-semantic relations that ensure text coherence. Annotations in the database take the form of trees or, more precisely, arborescences, in which vertices contain data and edges represent subordination between vertices. Each tree vertex corresponds to either a context or a connector; the connectors are marked in the text. The authors describe the relationships between the database tables and the trees, as well as the properties of the trees.
The paper also shows how this new resource differs from existing ones, in particular from the graphs of rhetorical relations created within the framework of Rhetorical Structure Theory: it can store data, modify the annotated contexts, work with empty contexts, and reflect previous states of all the trees.
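The tree-shaped annotations described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' schema: the class name `Vertex`, the field names, the kind labels `"context"` and `"connector"`, and the sample sentence are all assumptions made for illustration. The sketch only shows the stated properties: each vertex holds data, each vertex is either a context or a connector, edges express subordination, and every vertex has at most one parent (which is what makes the tree an arborescence).

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple

ALLOWED_KINDS = ("context", "connector")  # hypothetical labels


@dataclass(eq=False)
class Vertex:
    """One vertex of an annotation tree (hypothetical structure)."""
    kind: str                      # "context" or "connector"
    text: str                      # data stored in the vertex; "" models an empty context
    children: List["Vertex"] = field(default_factory=list)
    parent: Optional["Vertex"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        if self.kind not in ALLOWED_KINDS:
            raise ValueError(f"unknown vertex kind: {self.kind!r}")

    def add_child(self, child: "Vertex") -> "Vertex":
        """Add a directed edge self -> child (child is subordinate to self)."""
        if child.parent is not None:
            raise ValueError("a vertex in an arborescence has at most one parent")
        # Walk up from self to the root: attaching an ancestor would create a cycle.
        node: Optional["Vertex"] = self
        while node is not None:
            if node is child:
                raise ValueError("edge would create a cycle")
            node = node.parent
        child.parent = self
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Vertex"]]:
        """Yield (depth, vertex) pairs in pre-order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# Illustrative tree: a connector vertex with two subordinate contexts,
# one of which is empty.
root = Vertex("connector", "because")
left = root.add_child(Vertex("context", "The road was closed"))
right = root.add_child(Vertex("context", ""))

for depth, vertex in root.walk():
    print("  " * depth + f"{vertex.kind}: {vertex.text!r}")
```

Placing a connector vertex above its contexts is an illustrative choice only; the abstract does not specify which vertex type subordinates which, nor how earlier states of a tree are stored.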
About this article
Institute of Informatics Problems, Russian Academy of Sciences
Key words
supracorpora database; corpus of texts' annotation; graph; discourse relations; connector
A. A. Durnovo, O. Yu. Inkova, and N. A. Popkova
Author Affiliations
Federal Research Center "Computer Science and Control", Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
University of Geneva, 22 Bd des Philosophes, CH-1205 Geneva 4, Switzerland