Informatics and Applications

2024, Volume 18, Issue 3, pp 97-105

A MODEL FOR EXTRACTING KNOWLEDGE FROM PARALLEL TEXTS OF A LEXICOGRAPHIC INFORMATION SYSTEM

  • D. O. Dobrovol’skij
  • I. M. Zatsman

Abstract

The paper considers a problem-oriented model for extracting linguistic knowledge from parallel texts as a key theoretical component in creating a lexicographic information system that integrates electronic bilingual dictionaries and parallel corpora. The proposed approach to the integration problem takes into account the emergence of new meanings of words and phrasemes, which results from experts acquiring new knowledge as they discover these meanings through semantic analysis of regularly updated corpus data. The proposed model describes human–computer interaction, including the search for fragments of parallel texts as potential sources of new linguistic knowledge, the extraction of this knowledge from the texts by experts, and its representation in the lexicographic information system. The problem-oriented model is built on the spiral model of knowledge generation proposed by Ikujiro Nonaka in 1991. The purpose of the paper is to describe the stages of building the model for discovering linguistic knowledge used in the design of the lexicographic information system.
