Informatics and Applications
2024, Volume 18, Issue 3, pp 106-114
IMPLICIT LOGICAL-SEMANTIC RELATIONS IN PARALLEL TEXTS: ANNOTATION PRINCIPLES
- A. A. Goncharov
- P. V. Iaroshenko
Abstract
The problem of annotating implicit logical-semantic relations (LSRs) is considered. The state of the art
in the annotation of implicit LSRs is analyzed. Approaches are presented that focus on (i) analysis of the global
discourse structure; (ii) analysis of the local discourse structure; and (iii) unification of data annotated within
different frameworks and development of a unified annotation standard. Principles for annotating implicit
LSRs in parallel texts are proposed, where the target of annotation is a translation correspondence (a pair of
fragments from the original and translated texts). Translation correspondences illustrating an implicit–explicit
mismatch have been studied, i.e., those where LSR markers are absent in the Russian text but present in the text
in another language. Taking into account the specificity of implicit LSRs, the following annotation principles
were formulated: (i) the boundaries of LSR arguments must be determined (to ensure clarity and
convenience of analysis); (ii) features of text blocks should form a hierarchical structure (to make a large
number of features convenient to use); and (iii) if a feature of a text block has a lexical marker, this marker
should be indicated (to better justify the annotator’s decisions).
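The three principles can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions, not the authors' annotation scheme or database format: all class, field, and feature names (`Feature`, `Fragment`, `TranslationCorrespondence`, `"semantic"`, `"weather-event"`, the relation label `"cause"`) are hypothetical, and the sentence pair is invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # character offsets [start, end) into a fragment


@dataclass
class Feature:
    """A feature of a text block; children make the feature set hierarchical (principle ii)."""
    name: str
    marker: Optional[str] = None  # lexical marker of the feature, if any (principle iii)
    children: List["Feature"] = field(default_factory=list)

    def paths(self, prefix: str = "") -> List[str]:
        """Return slash-separated paths for this feature and all its descendants."""
        path = f"{prefix}/{self.name}" if prefix else self.name
        result = [path]
        for child in self.children:
            result.extend(child.paths(path))
        return result


@dataclass
class Fragment:
    """One side of a translation correspondence with explicit argument boundaries (principle i)."""
    text: str
    arg1: Span
    arg2: Span
    lsr_marker: Optional[str] = None  # None means the LSR is implicit in this fragment

    def argument_texts(self) -> Tuple[str, str]:
        return (self.text[self.arg1[0]:self.arg1[1]],
                self.text[self.arg2[0]:self.arg2[1]])


@dataclass
class TranslationCorrespondence:
    """A pair of fragments: the Russian text and its counterpart in another language."""
    russian: Fragment
    other: Fragment
    relation: str  # hypothetical LSR label
    features: List[Feature] = field(default_factory=list)

    def is_implicit_explicit_mismatch(self) -> bool:
        # No LSR marker on the Russian side, but a marker on the other side.
        return self.russian.lsr_marker is None and self.other.lsr_marker is not None


def span_of(text: str, sub: str) -> Span:
    """Helper: locate the first occurrence of `sub` and return its span."""
    start = text.index(sub)
    return (start, start + len(sub))


# Invented example pair (not taken from the paper's corpus).
ru = "Шёл дождь. Мы остались дома."
en = "It was raining, so we stayed home."
example = TranslationCorrespondence(
    russian=Fragment(ru, span_of(ru, "Шёл дождь"), span_of(ru, "Мы остались дома")),
    other=Fragment(en, span_of(en, "It was raining"), span_of(en, "we stayed home"),
                   lsr_marker="so"),
    relation="cause",
    features=[Feature("semantic", children=[Feature("weather-event", marker="дождь")])],
)
```

In this sketch, `example.is_implicit_explicit_mismatch()` flags the pair because only the non-Russian fragment carries a marker, and `Feature.paths()` flattens the hierarchy when a flat list of feature names is more convenient.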
About this article
Cover Date
2024-09-20
DOI
10.14357/19922264240313
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
linguistic annotation; discourse relations; logical-semantic relations; implicitness; parallel texts
Authors
A. A. Goncharov and P. V. Iaroshenko
Author Affiliations
Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation