Informatics and Applications
2016, Volume 10, Issue 1, pp 106-118
REPRESENTATION OF CROSS-LINGUAL KNOWLEDGE ABOUT CONNECTORS IN SUPRACORPORA DATABASES
- I. M. Zatsman
- O. Yu. Inkova
- M. G. Kruzhkov
- N. A. Popkova
Abstract
The article considers "supracorpora databases," which are used in contrastive linguistic studies. Such databases result from processing parallel texts drawn from the bilingual parallel subcorpora of the Russian National Corpus. Each parallel text contains either one original Russian text with one or more translations into a foreign language, or one original foreign-language text with one translation into Russian. Every source text is aligned with its translation(s) at the sentence level. Supracorpora databases are a new type of linguistic resource designed for goal-oriented discovery of new knowledge about various linguistic units. This knowledge is needed to improve the quality of machine translation, to update monolingual and bilingual grammars, and to modernize a wide range of academic courses in fields such as linguistics and translation studies. The article describes the conceptual foundations of the database and gives an example of how it can be implemented to represent knowledge about Russian connectors and their French translation correspondences.
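To make the kind of record described in the abstract more concrete, the following is a minimal sketch, not the authors' actual schema. It assumes a sentence-aligned pair and an annotation that links a Russian connector to its French correspondence. The names AlignedPair, ConnectorAnnotation, and correspondence_counts are hypothetical, and the two sentence pairs are invented toy data, not taken from the Russian National Corpus.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedPair:
    """One source sentence aligned with one translation sentence (hypothetical)."""
    source_lang: str
    target_lang: str
    source_text: str
    target_text: str


@dataclass(frozen=True)
class ConnectorAnnotation:
    """Hypothetical record linking a connector to its translation correspondence."""
    pair: AlignedPair
    connector: str       # connector as it is identified in the source sentence
    correspondence: str  # counterpart chosen by the translator


def correspondence_counts(annotations, connector):
    """Count how often each translation correspondence occurs for one connector."""
    return Counter(a.correspondence for a in annotations if a.connector == connector)


# Invented toy data for illustration only.
pair1 = AlignedPair("ru", "fr",
                    "Он не только пел, но и танцевал.",
                    "Il a non seulement chanté, mais aussi dansé.")
pair2 = AlignedPair("ru", "fr",
                    "Она не только читает, но и пишет.",
                    "Elle lit et elle écrit.")

annotations = [
    ConnectorAnnotation(pair1, "не только... но и", "non seulement... mais aussi"),
    ConnectorAnnotation(pair2, "не только... но и", "et"),
]

counts = correspondence_counts(annotations, "не только... но и")
for correspondence, n in counts.items():
    print(f"{correspondence}: {n}")
```

Grouping annotations by connector in this way is one simple route to the contrastive question the abstract raises, namely which translation correspondences a given connector receives and how often.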
About this article
Title
REPRESENTATION OF CROSS-LINGUAL KNOWLEDGE ABOUT CONNECTORS IN SUPRACORPORA DATABASES
Journal
Informatics and Applications
2016, Volume 10, Issue 1, pp 106-118
Cover Date
2016-01-30
DOI
10.14357/19922264160110
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
cross-lingual studies; Russian connectors; representation of knowledge about connectors; supracorpora databases
Authors
I. M. Zatsman, O. Yu. Inkova, M. G. Kruzhkov, and N. A. Popkova
Author Affiliations
Institute of Informatics Problems, Federal Research Center "Computer Sciences and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
University of Geneva, 22 Bd des Philosophes, CH-1205 Geneva 4, Switzerland