Институт проблем информатики Российской Академии наук
Институт проблем информатики Российской Академии наук
Российская Академия наук

Институт проблем информатики Российской Академии наук




«INFORMATICS AND APPLICATIONS»
Scientific journal
Volume 12, Issue 2, 2018

Content | About  Authors

Abstract and Keywords.

ON THE ROBUSTNESS OF CONFIGURATION GRAPHS IN A RANDOM ENVIRONMENT
  • M. M. Leri   Institute of Applied Mathematical Research of the Karelian Research Centre of the Russian Academy of Sciences, 11 Pushkinskaya Str., Petrozavodsk 185910, Russian Federation
  • Yu. L. Pavlov   Institute of Applied Mathematical Research of the Karelian Research Centre of the Russian Academy of Sciences, 11 Pushkinskaya Str., Petrozavodsk 185910, Russian Federation

Abstract: The paper considers configuration graphs with vertex degrees being independent identically distributed random variables following the power-law distribution with a random parameter. The parameter of the vertex degree distribution follows the truncated gamma distribution. The authors study the robustness of such graphs to the two types of destruction processes: random and targeted. The graphs function in a random environment where the values of the vertex degree distribution parameter are chosen separately for each vertex. A comparative analysis of destruction effects on these models and on graphs with the degree distribution common for all vertices and induced by averaging over the distribution parameter has been performed. The conditions under which the study of the graphs' behavior in a random environment can be reduced to the study of the evolution of graphs with an averaged vertex degree distribution are discussed. A comparative analysis of destruction effects of the two types of destruction processes has been performed.

Keywords: configuration graphs; power-law distribution; gamma distribution; robustness; forest fire model; simulation

UNBIASED RISK ESTIMATE OF STABILIZED HARD THRESHOLDING IN THE MODEL WITH A LONG-RANGE DEPENDENCE
  • O.V. Shestakov   Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, GSP-1, Moscow 119991, Russian Federation, Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: De-noising methods for processing signals and images, based on the thresholding of wavelet decomposition coefficients, have become popular due to their simplicity, speed, and the ability to adapt to signal functions that have a different degree of regularity at different locations. An analysis of inaccuracies of these methods is an important practical task, since it makes it possible to evaluate the quality of both the methods themselves and the equipment used for processing. The present author considers the recently proposed stabilized hard thresholding method which avoids the main disadvantages of the popular soft and hard thresholding techniques. The statistical properties of this method are studied. In the model with an additive Gaussian noise, the author analyzes the unbiased risk estimate. Assuming that the noise coefficients have a long-range dependence, the author formulates the conditions under which strong consistency and asymptotic normality of the unbiased risk estimate take place. The results obtained make it possible to construct asymptotic confidence intervals for the threshold processing errors using only observable data.

Keywords: wavelets; thresholding; unbiased risk estimate; correlated noise; asymptotic normality

MINIMIZATION OF ERRORS OF CALCULATING WAVELET COEFFICIENTS WHILE SOLVING INVERSE PROBLEMS
  • A. A. Kudryavtsev  Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, GSP-1, Moscow 119991, Russian Federation
  • O.V. Shestakov   Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, GSP-1, Moscow 119991, Russian Federation, Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: Statistical inverse problems arise in many applied fields, including medicine, astronomy, biology, plasma physics, chemistry, etc. At the same time, there are always errors in the observed data due to imperfect equipment, background noise, data discretization, and other reasons. To reduce these errors, it is necessary to apply special regularization methods that allow constructing approximate stable solutions of inverse problems. The classical regularization methods are based on the use of windowed singular value decomposition. However, this approach takes into account only the type of operator involved in the formation of observable data and does not take into account the properties of the object of observation. For linear homogeneous operators, this problem is solved with the help of special methods of wavelet analysis, which allow adapting simultaneously to the form of the operator and local features of the function describing the object. In this paper, the authors consider the problem of inverting a linear homogeneous operator in the presence of noise in the observational data by thresholding the wavelet expansion coefficients of the observed function. The asymptotically optimal thresholds and orders of the loss function are calculated when minimizing the averaged probability of error of wavelet coefficient calculation.

Keywords: wavelets; thresholding; linear homogeneous operator; loss function

SUFFICIENT ERGODICITY CONDITIONS FOR PRIORITY QUEUES
  • A. V. Mistryukov  Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, GSP-1, Moscow 119991, Russian Federation
  • V. G. Ushakov  Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, GSP-1, Moscow 119991, Russian Federation, Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: Known results in ergodicity of priority queues are based on the assumption that interarrival times in each queue have exponential distribution. The aim of this paper is to relax this assumption, namely, to establish sufficient conditions of ergodicity for queues with two priority classes GI\GI 11 under assumption that interarrival times only in high priority class queue have exponentional distribution. Queues with nonpreemptive and different kinds of preemptive priority are considered. To formulate desired conditions, the authors use Lindley's recursion for waiting times of each priority class queue. Using Lyapunov-Foster criteria, the authors obtain sufficient conditions for a given recursion to be a Harris-ergodic Markov chain, meaning to have a unique invariant measure, to which its transition probabilities converge in total variation.

Keywords: head of the line priority; preemptive priority; ergodicity; the method of test functions; waiting time; Lindley recursion

A PROBABILITY MODEL OF THE INFLUENCE OF THE ORDER BOOK ON THE PRICE PROCESS
  • L. V. Nazarov  Faculty of Computational Mathematics and Cybernetics, M. V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, GSP-1, Moscow 119991, Russian Federation
  • V. V. Lavrentyev  Faculty of Computational Mathematics and Cybernetics, M. V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, GSP-1, Moscow 119991, Russian Federation
  • E. V. Bykovets  Faculty of Computational Mathematics and Cybernetics, M. V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, GSP-1, Moscow 119991, Russian Federation

Abstract: The Limit Order Book model is considered, with buy and sell orders arriving as two independent Cox processes. It includes the price impact model built on the basis of a physical model of perfectly elastic collision. Price is treated as a particle of some mass, moving along a straight line without friction. The incoming buy orders and outgoing sell orders hit the price giving it additional momentum in one direction, while incoming sell orders and outgoing buy orders do the same in the opposite direction. A functional limit theorem for the price process is obtained at a high intensity of incoming order flow, which allows approximation by some Levy process

Keywords: limit orders; perfectly elastic collision; limit order book model; price process; Cox process; functional limit theorem

MAXIMAL BRANDING PROCESSES IN RANDOM ENVIRONMENT
  • A. V. Lebedev  Department of Probability Theory, Faculty of Mechanics and Mathematics, M.V. Lomonosov Moscow State University, Main Building, 1 Leninskiye Gory, Moscow 119991, Russian Federation

Abstract: The work continues the author's long research in the theory of maximal branching processes that are obtained from classical branching processes by replacing the sum of offsping numbers by the maximum. One can say that the next generation is formed by the offspring of the most productive particle. Earlier, the author generalized processes with integer values up to processes with arbitrary nonnegative values, investigated their properties, and proved the limit theorems. Further, maximal branching processes with several types ofparticles were introduced and studied. In this paper, the author introduces the concept of maximal branching processes in random environment (with one type of particles) and an important case of the "power" random environment. In the latter case, the basic properties of maximal branching processes are studied and the ergodic theorem is proved. As an application, the author considers gated infinite-server queues.

Keywords: maximal branching processes; random environment; ergodic theorem; stable distributions; extreme value theory

HIERARCHICAL METHOD OF META DATA GENERATION FOR CONTROL OF NETWORK CONNECTIONS
  • A. A. Grusho  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • E. E. Timonina  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • S. Ya. Shorgin  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: An important class of threats for distributed information systems is the possibility of organization of illegal information interactions and, vice versa, prohibition of the allowed information interactions. For preventing these threats, creation of connections control with the help of meta data is proposed. Meta data are created on the basis of mathematical models of business processes which are provided in distributed information systems by a set of information technologies. In the paper, two methods of meta data creation for information technologies are considered. The suggested approach is based on hierarchical decomposition of information technologies and tasks.

Keywords: information security; distributed information systems; metadata; hierarchical decomposition; composite tasks

A MODEL OF RISK MANAGEMENT IN GAUSSIAN STOCHASTIC SYSTEMS
  • A. N. Tyrsin  Ural Federal University named after first President of Russia B. N. Yeltsin, 19 Mira Str., Ekaterinburg 620002, Russian Federation, Institute of Economics, Ural Branch of the Russian Academy of Sciences, 29 Moskovskaya Str., Yekaterinburg 620014, Russian Federation
  • A. A. Surina  Institute of Natural Sciences, South Ural State University, 87 Lenin Ave., Chelyabinsk454080, Russian Federation

Abstract: A new approach to research of risk of multidimensional stochastic systems is described. It is based on a hypothesis that the risk can be managed by changing probabilistic properties of a component of a multidimensional stochastic system. The case of Gaussian stochastic systems described by random vectors having the multidimensional normal distribution is investigated. Modeling has shown that multidimensionality of a system and relative correlation of components unaccounted in an explicit form, can lead to essential understating of risk factors. Results of calculation of the probability of a dangerous outcome depending on numerical characteristics of a multidimensional Gaussian random variable (a covariance matrix and a vector of mathematical expectations) are given. Approbation of the suggested model is executed by the example of the analysis of the risk of cardiovascular diseases in population. Models of risk management in the form of a minimization problem or achievement of the given level are described. Control variables are the numerical characteristics of a random vector covariance matrix and a vector of mathematical expectations. Approbation of the method of risk management was carried out by means of statistical model operation by the Monte-Carlo method.

Keywords: risk; model; stochastic system; random vector; control; normal distribution

A VISUALIZATION ALGORITHM FOR THE PLANE PROBABILITY MEASURE KERNEL
  • S. N. Vasil'eva  Moscow Aviation Institute (National Research University), 4 Volokolamskoe Shosse, Moscow 125993, Russian Federation
  • Yu. S. Kan  Moscow Aviation Institute (National Research University), 4 Volokolamskoe Shosse, Moscow 125993, Russian Federation

Abstract: The authors propose an algorithm for constructing a probability measure kernel polyhedral approximation for a two-dimensional random vector with independent components. The kernel is one of the important concepts used in algorithms for solving stochastic programming problems with probabilistic criteria. The kernel is most effectively used in cases when the statements of the indicated problems have the property of linearity with respect to random parameters. Because of linearity, the maximum in random parameters is determined by searching all vertices of the approximating polyhedron. The authors propose an algorithm for constructing a polyhedral approximation of the kernel of a probability measure for a two-dimensional random vector with independent components. The algorithm is based on construction of the intersection of a finite number of confidence half-spaces, the parameters of which are estimated by the Monte-Carlo method. The result of the proposed algorithm is the definition of the set of vertices of the approximating polyhedron. Approximation of the nucleus is their convex hull. The results of calculations for a number of typical continuous distribution laws are presented.

Keywords: quantile optimization problem; linearization method; probability measure kernel

MATHEMATICAL MODEL OF OPTIMAL TRIANGULATION
  • A. Batenkov  Orel Branch of the Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 137 Moskovskoe Shosse, Orel 302025, Russian Federation
  • Yu. Maniakov  Orel Branch of the Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 137 Moskovskoe Shosse, Orel 302025, Russian Federation
  • A. Gasilov  Orel Branch of the Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 137 Moskovskoe Shosse, Orel 302025, Russian Federation
  • O. Yakovlev  Orel Branch of the Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 137 Moskovskoe Shosse, Orel 302025, Russian Federation

Abstract: The problem of synthesis of optimal planar convex triangulation is formalized. This problem arises in different applications of informatics problems and is very actual for its sections such as computer graphics and geographical information systems. The mathematical model is represented as an extremum problem with infinite number of constraints, as a minimax problem with bound variables, and as an extremum problem with additional constraints on line segments intersections of triangulation with limited number of constraints. By putting idempotent limitations on Boolean variables, the initial integer-valued problem could be solved as a general mathematical programming problem on a continuum set of answers. In addition, the comparison of results obtained by the greedy algorithm based on the represented model and Delaunay triangulation is provided.

Keywords: mathematical model; triangulation; Delaunay triangulation

AUTOMATIC METADATA EXTRACTION FROM SCIENTIFIC PDF DOCUMENTS
  • A. V. Ogaltsov  National Research University Higher School of Economics, 20 Myasnitskaya Str., Moscow 101000, Russian Federation, Antiplagiat JSC, 33 Varshavskoe Shosse, Moscow 117105, Russian Federation
  • O. Y. Bakhteev  Antiplagiat JSC, 33 Varshavskoe Shosse, Moscow 117105, Russian Federation, Moscow Institute of Physics and Technology, 9 Institutskiy Per., Dolgoprudny, Moscow Region 141700, Russian Federation

Abstract: The authors investigate the task of metadata extraction. The authors consider scientific PDF documents in Russian. One of the features of PDF is a rich layout. It is difficult to extract metadata due to this fact. The authors propose a method based on considering blocks from pdf-parser as objects in a machine learning task. The feature space is constructed not only of text statistics but also of formatting and positioning features of the block.
The authors performed computational experiments and compared their approach with the baseline.

Keywords: metadata extraction; natural language processing; layout features; information retrieval; metadescriptions

SOLUTION TO THE PROBLEM OF OPTIMAL CONTROL OF A STOCHASTIC SEMI-MARKOV MODEL OF CONTINUOUS SUPPLY OF PRODUCT MANAGEMENT UNDER THE CONDITION OF CONSTANTLY HAPPENING CONSUMPTION
  • P. V. Shnurkov  National Research University Higher School of Economics, 34 Tallinskaya Str., Moscow 123458, Russian Federation
  • A. Y. Egorov  National Research University Higher School of Economics, 34 Tallinskaya Str., Moscow 123458, Russian Federation

Abstract: The solution of the optimal control problem in the semi-Markov model in question is theoretically justified. To achieve this goal, formal analytic transformations of the integral representations obtained by the authors earlier for the basic probabilistic characteristics of the model were carried out. These transformations made it possible to use the theorem on the analytical representation of the stationary value of the management effectiveness of a semi-Markov process in the form of a fractional-linear integral functional. In the sequel, the authors use the general theorem on the extremum of a fractional-linear integral functional, proved by P. V. Shnurkov. This theorem makes it possible to reduce the problem of optimal reserve management to the problem of investigating the global extremum of a given function from a finite number of real nonnegative variables that can be effectively solved in practice using the known numerical methods.

Keywords: inventory management; semi-Markov stochastic process; stationary value index; optimal control of stochastic systems; fractional-linear integral functional

THE INFLUENCE OF THE CONNECTIONS' DENSITY ON CLUSTERIZATION AND PERCOLATION THRESHOLD DURING INFORMATION DISTRIBUTION IN SOCIAL NETWORKS
  • D. O. Zhukov  Moscow Technological University (MIREA), 78 Vernadskogo Ave., Moscow 119454, Russian Federation
  • T. Yu. Khvatova  Peter the Great St. Petersburg Polytechnic University, 29 Polytechnicheskaya Str., St. Petersburg 195251, Russian Federation
  • S. A. Lesko  Moscow Technological University (MIREA), 78 Vernadskogo Ave., Moscow 119454, Russian Federation
  • A. D. Zaltsman  Moscow Technological University (MIREA), 78 Vernadskogo Ave., Moscow 119454, Russian Federation

Abstract: The paper is focused on applying new theoretical approaches to describing the processes of information transmission and transformation in sociotechnical systems and in social networks based on the percolation theory. Percolation threshold of a random network depends on its density. In networks with random structure, in both the task of bonds and the task of nodes, percolation thresholds reach saturation when the network's density is high. The saturation value of a percolation threshold is higher in the task of bonds. From the point of information influence of a random network, increasing the average connection's density within the network turns out to be more preferable than fostering a small number of separate 'central nodes' with numerous connections. The results obtained in this study can be applied in interdisciplinary research in such areas as informatics, mathematic modeling, and economics involving certain sociological survey data for forecasting behavior and managing groups of individuals in network communities. This research enhances and enlarges the scope of methods and approaches applied in classic informatics for describing social and sociotechnical systems, which can be useful for a wide range of researchers engaged into studying social network structures.

Keywords: percolation theory; social network structure; connections' density; network clusterisation; percolation threshold

DISCRETE ANALYSIS IN PARSING
  • Ya. M. Mirzabekov  Dagestan State University, 43-a Gadzhiyev Str., Makhachkala 367000, Republic of Dagestan, Russian Federation
  • Sh. B. Shihiev  Dagestan State University, 43-a Gadzhiyev Str., Makhachkala 367000, Republic of Dagestan, Russian Federation

Abstract: An informal definition of syntax is given in terms of discrete mathematics and graph theory. The main difficulty for numerous attempts to formalize a natural language is the semantics of the language. It is shown how it is possible to classify expressions on a semantic basis, using their syntactic features. Classification of expressions by the kind of questions they answer is the simplest way of grouping expressions on the semantic basis. The present authors describe an algorithm that recognizes place pointers, that is, expressions which answer the question "where?". On specific examples, the problem of analysis of expressions and the inverse problem of synthesis, more precisely, the applied problem of recognition of expressions are considered.

Keywords: natural language; discrete mathematics; graph theory; syntax; word forms; morphological parameters; consistent definitions; inconsistent definitions; vocabulary; semantics

MACHINE TRANSLATION OF RUSSIAN CONNECTIVES INTO FRENCH: ERRORS AND QUALITY FAILURES
  • V. Nuriev  Institute of Linguistics of the Russian Academy of Sciences, 1 bld. 1 Bolshoy Kislovsky Lane, Moscow 125009, Russian Federation, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • N. Buntmanv  M. V. Lomonosov Moscow State University, GSP-1, Leninskie Gory, Moscow 119991, Russian Federation
  • O. Inkova  University of Geneva, 24 du General-Dufour Str., Geneve 4 1211, Switzerland

Abstract: The paper shows what machine errors and quality failures may occur when translating connectives. To that end, a statistical machine translation (SMT) system has been used in order to generate translation samples. The opening part presents a brief retrospective on how machine translation has been developing for over 60 years; it sets out the necessary background and provides the context. Also, section 1 explains how an SMT system works. Then, the paper takes a closer look at the problem of evaluation of machine translation quality. Several approaches to classifying machine translation errors are considered to finally attempt a taxonomy that covers specifically the errors central to translation of connectives (from Russian into French). The closing section provides examples of these machine translation errors.

Keywords: statistical machine translation; corpus linguistics; machine translation errors; parallel texts