Systems and Means of Informatics
2017, Volume 27, Issue 2, pp 125-142
REVERSIBILITY AND ALTERNATIVENESS OF GENERALIZATION OF CONNECTIVES TRANSLATIONS MODELS IN PARALLEL TEXTS
- I. M. Zatsman
- O. S. Mamonova
- A. Yu. Shchurova
Abstract
The paper considers the task of annotating Russian connectives and their translations using a supracorpora database (SCDB). The first distinctive feature of the SCDB is that it supports the creation of bilingual annotations that include both the rubrics of the linguistic items under investigation (here, connectives) and the rubrics of their translations. The second feature is that the rubrics assigned by linguists are elements of faceted classifications. Implementing these rubrics in the SCDB enables alternative generalizations of annotations, which are concrete informational entities in the SCDB. As these entities are created, abstract translation models at different levels of generalization are produced. These models preserve certain common characteristics (aspects) of the annotations being generalized. Support for faceted classifications in the SCDB makes it possible to conduct multifaceted statistical analysis of annotations and connectives translation models. Furthermore, the statistics are verifiable, since each generated quantitative value links directly to the list of corresponding annotations. The main objective of the paper is to describe the reversibility and alternativeness of the generalization processes in the SCDB, which provide a basis for multifaceted and verifiable statistical analysis of annotations and connectives translation models in parallel texts.
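The idea of generalizing faceted annotations into translation models can be illustrated with a minimal sketch. It is not the SCDB implementation: the Annotation class, the generalize function, and the facet names and values ("structure", "position", "kind") are all hypothetical and chosen only for illustration. The sketch assumes that a model is the set of annotations sharing the same values on a chosen subset of facets. Choosing a different subset gives an alternative generalization, and keeping the annotation identifiers inside each model makes every count traceable back to its annotations.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Annotation:
    """A bilingual annotation (hypothetical structure, not the SCDB schema)."""
    ann_id: int
    source: dict  # facet -> rubric for the Russian connective
    target: dict  # facet -> rubric for its translation


def generalize(annotations, keep_source, keep_target):
    """Group annotations by the facets that are kept.

    Each key is one abstract translation model; its value is the list of
    annotation ids it generalizes, so the step can be reversed.
    """
    models = defaultdict(list)
    for ann in annotations:
        key = (
            tuple(sorted((f, v) for f, v in ann.source.items() if f in keep_source)),
            tuple(sorted((f, v) for f, v in ann.target.items() if f in keep_target)),
        )
        models[key].append(ann.ann_id)
    return dict(models)


# Invented sample data; facet names and values are assumptions.
ANNOTATIONS = [
    Annotation(1, {"structure": "single", "position": "initial"}, {"kind": "connective"}),
    Annotation(2, {"structure": "single", "position": "medial"}, {"kind": "connective"}),
    Annotation(3, {"structure": "multiword", "position": "initial"}, {"kind": "omitted"}),
]

# Two alternative generalizations of the same annotations.
by_structure = generalize(ANNOTATIONS, {"structure"}, {"kind"})
by_position = generalize(ANNOTATIONS, {"position"}, {"kind"})

# Verifiable statistics: each count comes with the ids that produced it.
for model, ids in by_structure.items():
    print(model, "count =", len(ids), "annotations =", ids)
```

In this sketch, grouping by "structure" merges annotations 1 and 2 into one model, while grouping by "position" keeps them apart. Both groupings are derived from the same underlying annotations, and neither discards them.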
About this article
Cover Date
2017-05-30
DOI
10.14357/08696527170211
Print ISSN
0869-6527
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
supracorpora database; annotation of connectives; faceted classifications; corpus linguistics; generalization of annotations
Authors
I. M. Zatsman, O. S. Mamonova, and A. Yu. Shchurova
Author Affiliations
Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
Faculty of Foreign Languages and Area Studies, M. V. Lomonosov Moscow State University, 31-a Lomonosov Str., Moscow 119192, Russian Federation