Informatics and Applications
2024, Volume 18, Issue 3, pp 97-105
A MODEL FOR EXTRACTING KNOWLEDGE FROM PARALLEL TEXTS OF A LEXICOGRAPHIC INFORMATION SYSTEM
- D. O. Dobrovol’skij
- I. M. Zatsman
Abstract
The paper considers a problem-oriented model of extracting linguistic knowledge from parallel texts as
a key theoretical component in creating a lexicographic information system that integrates electronic
bilingual dictionaries and parallel corpora. The proposed approach to the integration problem takes into
account the emergence of new meanings of words and phrasemes, which results from the acquisition of new knowledge
by experts who discover these meanings through semantic analysis of regularly updated corpus data. The
proposed model describes human–computer interaction, including the search for fragments of parallel texts
as potential sources of new linguistic knowledge, the extraction of this knowledge from texts by experts, and its
representation in the lexicographic information system. The model builds on the spiral model of
knowledge generation proposed by Ikujiro Nonaka in 1991. The purpose of the paper is to describe
the stages of building the model for discovering linguistic knowledge that is used in designing the
lexicographic information system.
References (16)
- Nonaka, I. 1991. The knowledge-creating company. Harvard
Bus. Rev. 69(6):96–104.
- Nonaka, I., and H. Takeuchi. 1995. The knowledge-creating
company. Oxford, NY: Oxford University Press.
284 p.
- Dobrovol’skij, D.O., and Anna A. Zalizniak. 2018.
Nemetskie konstruktsii s modal’nymi glagolami i ikh
russkie sootvetstviya: proekt nadkorpusnoy bazy dannykh
[German constructions with modal verbs and their Russian
correlates: A supracorpora database project]. Computer
Linguistic and Intellectual Technologies: Conference
(International) “Dialog” Proceedings. Moscow: Russian
State University for the Humanities. 17(24):172–184.
- Zatsman, I.M. 2021. Problemno-oriyentirovannaya aktualizatsiya
slovarnykh statey dvuyazychnykh slovarey i meditsinskoy
terminologii: sopostavitel’nyy analiz [Problem-oriented
updating of dictionary entries of bilingual
dictionaries and medical terminology: Comparative analysis].
Informatika i ee Primeneniya — Inform. Appl.
15(1):94–101. doi: 10.14357/19922264210113. EDN: DMCMSK.
- Klein, W., and A. Geyken. 2010. Das Digitale Wörterbuch
der Deutschen Sprache (DWDS). Lexicographica
26(2010):79–96. doi: 10.1515/9783110223231.1.79.
- Dobrovol’skij, D.O. 2015. Korpus parallel’nykh tekstov
i sopostavitel’naya leksikologiya [The corpus of parallel
texts and contrastive lexicology]. Trudy Instituta russkogo
yazyka im. V. V. Vinogradova [Proceedings of the
V. V. Vinogradov Russian Language Institute] 6:413–449.
EDN: VJQBHP.
- Goncharov, A.A. 2023. Annotirovanie parallel’nykh korpusov:
podkhody i napravleniya razvitiya [Parallel corpus
annotation: Approaches and directions for development].
Informatika i ee Primeneniya—Inform. Appl. 17(4):81–87.
doi: 10.14357/19922264230411. EDN: GDKDOZ.
- Goncharov, A.A., I.M. Zatsman, and M.G. Kruzhkov.
2020. Evolyutsiya klassifikatsiy v nadkorpusnykh bazakh
dannykh [Evolution of classifications in supracorpora
databases]. Informatika i ee Primeneniya — Inform. Appl.
14(4):108–116. doi: 10.14357/19922264200415. EDN:
GKWBZT.
- Goncharov, A.A., I.M. Zatsman, and M.G. Kruzhkov.
2021. Predstavlenie novykh leksikograficheskikh znaniy
v dinamicheskikh klassifikatsionnykh sistemakh [Representation
of new lexicographical knowledge in dynamic
classification systems]. Informatika i ee Primeneniya—Inform.
Appl. 15(1):86–93. doi: 10.14357/19922264210112.
EDN: OPEFXW.
- Zatsman, I.M. 2024. Transformatsii ob”ektov pervogo
i vtorogo poryadka v leksikograficheskoy informatsionnoy
sisteme [Object transformations of the first and second
order in a lexicographic information system]. Informatika
i ee Primeneniya — Inform. Appl. 18(2):82–91. doi:
10.14357/19922264240211. EDN: VZTGVV.
- Ackoff, R. 1989. From data to wisdom. J. Applied Systems
Analysis 16(1):3–9.
- Zatsman, I.M. 2023. Dannye, informatsiya i znanie v nauchnoy
paradigme informatiki [On the scientific paradigm
of informatics: Data, information, and knowledge]. Informatika
i ee Primeneniya — Inform. Appl. 17(1):116–125.
doi: 10.14357/19922264230115. EDN: CWIROJ.
- Wierzbicki, A. P., and Y. Nakamori. 2006. Basic dimensions
of creative space. Creative space: Models of creative
processes for knowledge civilization age. Eds. A. P. Wierzbicki
and Y. Nakamori. Berlin: Springer Verlag. 59–90.
- Zatsman, I. 2023. Digital spiral model of knowledge
creation and encoding its dynamics. 18th Forum (International)
on Knowledge Asset Dynamics Proceedings. Matera,
Italy: Arts for Business Institute. 581–596. Available at:
https://www.researchgate.net/publication/371303696_Digital_Spiral_Model_of_Knowledge_Creation_and_Encoding_its_Dynamics
(accessed July 30, 2024).
- Bratianu, C. 2019. A strategic view on the knowledge
dynamics models used in knowledge management. 20th
European Conference on Knowledge Management Proceedings.
Reading, U.K.: Academic Publishing International
Ltd. 1:185–192.
- Goncharov, A.A., I.M. Zatsman, M.G. Kruzhkov, and
E.Yu. Loshchilova. 2021. Otrazhenie evolyutsii leksikograficheskikh
znaniy v dinamicheskikh klassifikatsionnykh
sistemakh [Capturing evolution of lexicographic
knowledge in dynamic classification systems]. Informatika
i ee Primeneniya — Inform. Appl. 15(4):41–49. doi:
10.14357/19922264210406. EDN: MGORMY.
About this article
Cover Date
2024-09-20
DOI
10.14357/19922264240312
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
lexicographic information system; parallel texts; spiral model of knowledge generation; problem-oriented model
Authors
D. O. Dobrovol’skij and I. M. Zatsman
Author Affiliations
Vinogradov Russian Language Institute of the Russian Academy of Sciences, 18/2 Volkhonka Str., Moscow 119019, Russian Federation
Institute of Linguistics of the Russian Academy of Sciences, 1-1 Bolshoy Kislovsky Lane, Moscow 125009, Russian Federation
Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation