Institute of Informatics Problems of the Russian Academy of Sciences

«Systems and Means of Informatics»
Scientific journal
Volume 29, Issue 3, 2019

Abstract and Keywords.

SELECTING THE DIMENSIONALITY FOR MIXTURE OF PROBABILISTIC PRINCIPAL COMPONENT ANALYZERS
  • M. P. Krivenko

Abstract: The article considers the problems of choosing structural parameters characterizing the model of a mixture of probabilistic principal component analyzers, namely, the number of elements of the mixture and the dimensions of these elements. Among the set of approaches used in practice for the task of classifying data, only sampling management methods are actually available.
To choose the dimensions, it is proposed to use a combination of known model selection methods. A mixture of probabilistic principal component analyzers makes it possible to model bulk data using a relatively small number of free parameters. The number of free parameters can be controlled by selecting the latent dimension of the data.

Keywords: probabilistic principal component analysis (PPCA); mixtures of PPCA; model selection criterion; bootstrap; cross-validation
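
To illustrate how the latent dimension controls the number of free parameters, here is a minimal sketch. It assumes the standard parameter count for a PPCA covariance model of the form W W^T + sigma^2 I, namely d*q + 1 - q*(q - 1)/2 for data dimension d and latent dimension q. The function names are hypothetical and are not taken from the article, which may count parameters differently.

```python
from typing import Sequence


def ppca_covariance_params(d: int, q: int) -> int:
    """Free parameters of a PPCA covariance W W^T + sigma^2 I.

    d is the data dimension, q the latent dimension (assumed 0 <= q < d).
    The term q*(q-1)/2 removes the rotational ambiguity of W.
    """
    if not 0 <= q < d:
        raise ValueError("latent dimension q must satisfy 0 <= q < d")
    return d * q + 1 - q * (q - 1) // 2


def mixture_free_params(d: int, latent_dims: Sequence[int]) -> int:
    """Free parameters of a PPCA mixture with one latent dimension per component."""
    k = len(latent_dims)
    means = k * d          # one mean vector per component
    weights = k - 1        # mixing weights sum to one
    covariances = sum(ppca_covariance_params(d, q) for q in latent_dims)
    return means + weights + covariances


def full_gaussian_mixture_params(d: int, k: int) -> int:
    """Free parameters of a k-component mixture with full covariance matrices."""
    return k * d + (k - 1) + k * d * (d + 1) // 2
```

For example, with d = 10 and two components of latent dimensions 2 and 3, `mixture_free_params(10, [2, 3])` gives 69, while a two-component mixture with full covariances has 131 free parameters.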

CONDITIONALLY OPTIMAL LINEAR ESTIMATION OF NORMAL PROCESSES IN VOLTERRA STOCHASTIC SYSTEMS
  • I. N. Sinitsyn
  • V. I. Sinitsyn

Abstract: On the basis of Pugachev's conditionally optimal estimation (filtering and extrapolation) and previous investigations by the present authors, two approximate conditionally optimal estimation methods are developed for normal stochastic processes in Volterra stochastic systems (VStS) reducible to linear stochastic systems (StS) with additive and parametric noises. Some approaches to the synthesis of Pugachev's filters and extrapolators by replacing parametric noises with equivalent additive noises are given. Test examples for one-dimensional VStS are presented. The theory and test examples can be readily generalized to VStS with autocorrelated noises and to VStS with hereditary and nonlinear interaction functions.

Keywords: Volterra stochastic systems (VStS); method of analytical modeling (MAM); method of canonical expansions (MCE); method of normal approximation (MNA); method of statistical linearization (MSL); stochastic system (StS); stochastic process (StP); Pugachev conditionally optimal filters and extrapolators; Kalman filters and extrapolators

ADVANTAGE INDEX IN BAYESIAN RELIABILITY AND BALANCE MODELS WITH BETA-POLYNOMIAL A PRIORI DENSITIES
  • A. A. Kudryavtsev
  • S. I. Palionnaia
  • O. V. Shestakov

Abstract: This work studies the probabilistic characteristics of the advantage index in Bayesian balance models in which the negative and positive factors affecting the functioning of the system have, a priori, a beta-distribution and a distribution with polynomial density (for example, a uniform or parabolic distribution). The results can be used to study the marginal reliability of complex modifiable information-communication systems and other advantage indexes, for example, the availability ratio and the probability of staying in working condition in reliability theory, or the probability that a call will not be lost in queueing theory. The method can also be applied to similar problem formulations for distributions with piecewise polynomial a priori densities, for example, the Simpson, Irwin-Hall, and Bates distributions.

Keywords: Bayesian method; mixed distributions; balance models; advantage index; reliability growth; beta-distribution

APPROXIMATION OF ANTENNA DIRECTIVITY GAIN FOR DIRECTIONAL DEAFNESS ANALYSIS IN THREE-DIMENSIONAL SPACE
  • O. V. Chukhno
  • N. V. Chukhno
  • Yu. V. Gaidamaka
  • S. Ya. Shorgin

Abstract: The paper deals with the problem of "directional deafness" that arises when a device cannot detect an occupied radio channel due to the highly directional communication link between other devices interacting at that time.
The "deafness" situation can arise between devices operating in the millimeter band, for example, during the carrier-sense multiple access stage, in particular, under the IEEE 802.11ad/ay protocols. An analytical expression is obtained for the "directional deafness" probability for several variants of device location in three-dimensional (3D) space and for the proposed linear approximation of the antenna directivity gain. The proposed formula for the lower bound of the deafness probability is investigated for three realistic antenna patterns and four variants of phased antenna arrays.

Keywords: mmWave; directional deafness; 3D; directional access

CLUSTERING METHOD OF NEWS MEDIA REPORTS BASED ON CONCEPTUAL ANALYSIS
  • V. N. Zakharov
  • R. R. Musabaev
  • A. M. Krasovitskiy
  • Y. D. Kozlovskaya
  • Al-dr A. Khoroshilov
  • Al-ey A. Khoroshilov

Abstract: The article describes a solution to the problem of clustering news media reports. The solution is based on a technique developed by the authors for automatically calculating a measure of semantic meaningfulness of the names of concepts in documents from their statistical, syntactic, and semantic features, and on technologies for automatically generating declarative means for clustering documents using methods of semantic-syntactic and conceptual analysis. Using the proposed technique and the software and declarative means created during the study, an experiment was conducted on a representative array of news media reports. Analysis of the results showed that using semantic correlating coefficients of concepts improves the accuracy of establishing semantic similarity between documents when the semantic meaningfulness of textual names of concepts is established automatically.

Keywords: text clustering; semantic-syntactic analysis; conceptual analysis; declarative means; statistical measure of meaningfulness of textual names of documents; semantic correlating coefficient; semantic similarity between documents

THE SCIENCE CONTEXTUAL CITATION INDEX
  • I. V. Galina
  • M. M. Charnine

Abstract: The paper considers a new indicator of the quality of a scientific article, the science contextual citation index (SCCI), and the relationship of the SCCI and the science citation index (SCI) to another indicator proposed by the authors: a semantic similarity measure of two arbitrary texts.
The results of experiments with these indicators are given; in particular, the correlation between the SCCI and the SCI, which depends on the value of the semantic similarity threshold, is studied. Based on modeling the values of independent variables and their regression coefficients, a predictive probabilistic model is proposed for the dependence of the number of direct citations on the number of implicit references and their parameters.

Keywords: automated systems; science contextual citation index; semantic similarity measure; explicit and implicit reference

SUPRACORPORA DATABASES IN LINGUISTIC PROJECTS
  • A. Yu. Egorova
  • I. M. Zatsman
  • O. S. Mamonova

Abstract: The paper considers the task of supporting linguistic studies with supracorpora databases that contain aligned parallel texts (each includes the original text and its translation) as well as bilingual annotations of the researched linguistic items and their translation equivalents formed on the basis of parallel texts. Each annotation, formed by a linguist, fixes a translation model of a linguistic item. The experience of implementing several linguistic projects at the Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences showed that not all translation models that linguists extract from parallel texts during linguistic annotation are described in bilingual dictionaries and handbooks. Thus, supracorpora databases allow researchers to create new knowledge about the translation equivalents of the researched linguistic items. This knowledge is extracted by linguists when comparing and annotating the sentences of the original text and their translations. The main aim of the paper is to describe the functions of supracorpora databases that provide linguists with new knowledge in the process of annotation.

Keywords: supracorpora database; linguistic annotation; linguistic unit; corpus linguistics; translation models

MACHINE TRANSLATION ERRORS: PROBLEMS OF CLASSIFICATION
  • A. A. Goncharov
  • N. V. Buntman
  • V. A. Nuriev

Abstract: The paper considers the problems of classifying machine translation errors. Its first part reviews some approaches to evaluating machine translation quality and to classifying the errors that machine translation systems tend to make. The second part describes an original, targeted taxonomy of machine translation errors. It has been devised specifically to classify the errors central to the translation of connectives (from Russian into French). To date, there have been no such studies for this pair of languages. The proposed classification includes two groups of errors: (i) grammatical/lexical errors in the translation of the text chunk where a given connective occurs; and (ii) errors in the translation of the connective itself. This study uses a parallel Russian-French corpus that stores Russian source texts and their reference translations into French made by professional human translators. The corpus totals 300 thousand sentences (about 4 million words). The source texts where connectives occur have been used to generate machine translations by two automated systems.

Keywords: classification; machine translation; quality of machine translation; machine translation errors

SEQUENTIAL SELF-TIMED CELL CHARACTERIZATION
  • Yu. A. Stepchenkov
  • Yu. G. Diachenko
  • N. V. Morozov
  • D. Yu. Stepchenkov
  • D. Yu. Diachenko

Abstract: The functional specificity of self-timed circuits imposes special requirements on their characterization procedure. This procedure should take into account the signal conditioning discipline for information and phase signals on the basis of user-defined attributes of the characterized cell's inputs and outputs. The paper describes a technique for adjusting the characterization process for sequential self-timed cells. It is based on vectors that set static values and transition directions for all inputs and outputs. The algorithmization and implementation of the suggested approach in a new version of the SAHIB characterization system have increased its efficiency and provided valid characterization of all sequential cell types in the self-timed cell library for the 65-nanometer standard CMOS (complementary metal-oxide-semiconductor) process. During characterization, Verilog constructions that analyze the order in which cell inputs change and report invalid sequences are automatically introduced into the sequential cell models, which accelerates and facilitates self-timed circuit design.

Keywords: self-timed circuit; timing parameters; characterization; simulation; sequential cell; initial state

THE METHOD OF SELECTING A VARIANT OF THE CONSTRUCTION OF INFORMATION AND TELECOMMUNICATION SYSTEMS
  • A. A. Zatsarinny
  • Yu. S. Ionenkov

Abstract: The article describes a method for choosing a variant of building an information and telecommunication system (ITCS). The authors discuss a general methodological approach to selecting solutions for building an ITCS by system integrators, taking into account its characteristics, principles, and conditions of construction. The method comprises two interrelated techniques: a methodology for assessing ITCS effectiveness and a procedure for selecting among ITCS variants. The authors describe the effectiveness assessment methodology developed in their previous publications. The selection procedure takes into account the contribution to the effectiveness of the relevant organizational system, technical feasibility, and the risks of development and application. A list of specific performance indicators is proposed for each of the three groups of generalized performance indicators (contribution to the efficiency of the organizational system, technical feasibility, and risks).

Keywords: information and telecommunication system; efficiency; indicator; criterion; technology

ABOUT THE PROBLEM OF INFORMATION RESOURCES INTEGRATION
  • S. K. Dulin
  • I. N. Rozenberg
  • V. I. Umanskiy

Abstract: The paper analyzes the processes typical of the integration of information, of analysts' knowledge, and of their joint actions in the information environment. Here, the integration of knowledge refers to the procedure of synthesizing existing knowledge in order to obtain new knowledge. Three stages of analytical activity and their features are identified and considered. It is proposed to integrate information resources based on dynamic restructuring of the knowledge base to maintain its structural consistency and to present it as a structured set of information resources in accordance with the requirements of interoperability. To solve such problems, the authors use a technique based on an inductive-combinatorial apparatus that compares the structures of the connections of an arbitrary set and one of the types of consistent sets. This technique was chosen by the authors as the theoretical basis for the implementation of the tasks under consideration.

Keywords: information resources; knowledge base restructuring; interoperability; information integration

MODELING OF AGENT CONFLICTS IN HYBRID INTELLIGENT MULTIAGENT SYSTEMS
  • S. V. Listopad
  • I. A. Kirikov

Abstract: Conflict management is an integral part of the problem-solving process of an expert team at a round table: it encourages conflicts that positively influence the course of solving the problem and prevents or resolves all others.
The existing models of hybrid intelligent multiagent systems have a significant drawback: they do not model agent conflicts, and a single agent makes the final decisions based on the recommendations of other agents. Modeling conflicts in hybrid intelligent multiagent systems will make it possible to manage the "discussion" process, activating various types of collective thinking depending on the nature and intensity of the conflict. This will ensure that such systems are relevant to small teams of experts who successfully solve problems that are underdetermined and characterized by high combinatorial complexity, heterogeneity, and other NON-factors. For this purpose, the paper proposes a model of problem- and process-oriented conflict in hybrid intelligent multiagent systems.

Keywords: conflict; hybrid intelligent multiagent system; expert team; round table

FUZZY STRING COMPARISON WHEN PROCESSING PERSONAL DATA
  • O. V. Bobyleva
  • I. S. Bekesheva
  • V. A. Bobylev
  • V. V. Charkova

Abstract: The article substantiates the need to develop a new fuzzy search method for comparing words in databases containing personal data. The advantages of the algorithm are demonstrated on specific examples from the field of health insurance. The mathematical model of the algorithm is developed on the basis of the cell structure of matrices.

Keywords: algorithm; fuzzy search; fuzzy comparison; matrix
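
For readers unfamiliar with fuzzy string comparison, the following minimal sketch shows one classic baseline: the Levenshtein edit distance computed with a dynamic-programming matrix, turned into a similarity score. This is an assumption made for illustration only. It is not the new method proposed in the article, and the function names and the normalization (trimming, lowercasing) are hypothetical choices.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means equal after trimming and lowercasing."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

In a personal-data setting, two surname spellings that differ in one letter out of six, such as "Ivanov" and "Ivanow", score 5/6 under this measure; a threshold on the score then decides whether records are treated as candidates for the same person.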

THE ERROR CORRECTION PROCESS IN THE SEMANTIC NETWORK AS A NONLINEAR DYNAMIC SYSTEM
  • I. M. Adamovich
  • O. I. Volkov

Abstract: This article continues the series of works devoted to modeling the errors of independent users in the formation of a semantic network which is the basis for the distributed technology to support specific historical investigations.
This article describes and substantiates an approach to modeling organizational measures for finding and correcting errors in subnetworks of the semantic network of technology copies. The specifics of this type of error are described and the need to study them is substantiated. The proposed approach analyzes, as a nonlinear dynamic system, the processes by which the number of semantic network errors changes and the efforts of users to counter its growth. Within these efforts, a separate subclass is singled out and described: volunteerism, characterized by voluntary and targeted actions of users to correct errors. Using this approach, the effectiveness of volunteers' actions is quantified, and on the basis of this estimate, recommendations for the community of technology users are formulated.

Keywords: semantic net; model; user errors; dynamic systems; error correction

FORMATION OF SITUATIONALLY DEPENDENT SYSTEMS OF REQUIREMENTS FOR SOLVING THE PROBLEMS OF COST PLANNING
  • A. V. Ilyin
  • V. D. Ilyin

Abstract: An approach to the expert formation of situationally dependent systems of requirements for solving cost planning problems is proposed. The statement of the linear problem of situational cost planning and methods for solving it are presented. Depending on the set of requirements, the problem is solved either by the method of prioritized interval allocation or by the method of target displacement of solution. Both methods find a cost plan that always meets the mandatory requirements and satisfies the orienting requirements as much as possible. At each step of the plan search in the computational experiment mode, the problem formulation is determined by the system of mandatory and orienting requirements, which is generated by the expert planner on the basis of an analysis of the situation portraits. It is possible to set several indicators of the solution quality. Presenting the input data and the planning result in the form of numerical segments makes it possible to take into account the accuracy of forecasting the amount of the allocated resource and the expected costs. The portraits of situations (target, starting, and achieved), formed with the help of digital twins, are represented by a formalized description of the key parameters characterizing the state of the sources of the consumed resource, its consumers, and the planning conditions. The characteristics of the active online cost planning service are given.

Keywords: situationally dependent systems of requirements; situational cost planning; method of prioritized interval allocation; method of target displacement of solution; situation portraits; online cost planning service

THE DATA EMBEDDING METHOD BASED ON THE SECRET SHARING SCHEME
  • Yu. V. Kosolapov

Abstract: Important characteristics of stegosystems are the relative length a of the message to be embedded and the relative effectiveness e of the embedding. Additional essential characteristics of such systems are the degree of freedom in choosing modifiable bits of the container and the ability to resist the loss of part of the stegocontainer blocks. This paper develops a stegosystem that, on the one hand, allows recovery of partially lost data and, on the other hand, gives freedom in choosing modifiable bits. On the basis of this system, stegosystems are constructed and investigated; for them, the characteristics a and e are calculated, and the degree of freedom and the number of block erasures that do not distort the disseminated data are estimated.

Keywords: information embedding; a secret sharing scheme
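
The abstract does not say which secret sharing scheme is used. As an illustration of how such a scheme tolerates the loss of some blocks, here is a minimal sketch of Shamir's threshold scheme over a prime field. The choice of Shamir's scheme, the prime 2^127 - 1, and all function names are assumptions made for this example and are not taken from the article.

```python
import secrets
from typing import List, Tuple

PRIME = 2**127 - 1  # a Mersenne prime; the field size is an illustrative assumption


def split_secret(secret: int, threshold: int, shares: int,
                 prime: int = PRIME) -> List[Tuple[int, int]]:
    """Split secret into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < prime:
        raise ValueError("secret must lie in [0, prime)")
    if not 1 <= threshold <= shares < prime:
        raise ValueError("need 1 <= threshold <= shares < prime")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule modulo prime
            acc = (acc * x + c) % prime
        return acc

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points: List[Tuple[int, int]], prime: int = PRIME) -> int:
    """Lagrange interpolation at x = 0 from at least `threshold` distinct points."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # Modular inverse of den via Fermat's little theorem (prime modulus).
        secret = (secret + yi * num * pow(den, prime - 2, prime)) % prime
    return secret
```

With a threshold of 3 out of 5 shares, any two shares can be erased and the remaining three still reconstruct the secret, which mirrors the ability to resist the loss of part of the blocks mentioned in the abstract.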

METHODS OF IDENTIFICATION OF "WEAK" SIGNS OF VIOLATIONS OF INFORMATION SECURITY
  • N. A. Grusho

Abstract: To ensure information security of information technologies in distributed information computing systems, a metadata mechanism implementing a permit system for establishing connections in a network has previously been proposed. If a host is captured by an adversary, there is a strategy for organizing attacks that are not detected at the traditional metadata level. A number of data errors that an adversary can generate during the implementation of information technology require the construction of the cause-and-effect chains preceding the error in order to identify its cause. At the same time, metadata implement a simplified model of cause-and-effect relations when solving problems during the implementation of information technology. This model can be used to find the specified errors. The author constructs a synergistic relationship between the solution of this information security problem and the work of an experienced system administrator in determining the causes of implicit errors. This relationship makes it possible to leverage the expertise of system administrators to make it easier to find a captured host and some strategies by which an adversary incorporates errors into the implementation of information technology. It also minimizes the network reconfiguration required to bypass the captured host.

Keywords: information security; metadata; cause-and-effect relationships; system administration; implicit failures and errors