
“Informatics and Applications” scientific journal

Volume 13, Issue 4, 2019


DIGITAL MODEL OF THE AIRCRAFT'S WEIGHT PASSPORT
  • L. L. Vyshinsky  A. A. Dorodnicyn Computing Center, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 40 Vavilov Str., Moscow 119333, Russian Federation
  • M. K. Kuryansky  Department of Advanced Research-Scientific and Technical Center, United Aircraft Corporation, 5B Pionerskaya Str., Moscow 115054, Russian Federation
  • Yu. A. Flerov  A. A. Dorodnicyn Computing Center, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 40 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The paper is devoted to the problem of digital modeling of an aircraft's weight passport. A weight passport is developed at the stage of a new product's design and accompanies it through all subsequent stages of the life cycle. A digital weight passport plays its most important role when the released product is in operation.
A software implementation of the weight passport serves not only as a reference manual, but also as a tool for carrying out complex weight calculations during the preparation of flight tasks, maintenance, and repair work. The paper proposes the concept and a software implementation of an aircraft's digital weight passport.

Keywords: digital model; design automation; aircraft; weight design; weighting model; design tree; project generator

USING THE MODEL OF GAMMA DISTRIBUTION IN THE PROBLEM OF FORMING A TIME-LIMITED TEST IN A DISTANCE LEARNING SYSTEM
  • A. V. Bosov  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • A. V. Naumov  Moscow State Aviation Institute (National Research University), 4 Volokolamskoe Shosse, Moscow 125933, Russian Federation
  • G. A. Mkhitaryan  Moscow State Aviation Institute (National Research University), 4 Volokolamskoe Shosse, Moscow 125933, Russian Federation
  • A. P. Sapunova  Moscow State Aviation Institute (National Research University), 4 Volokolamskoe Shosse, Moscow 125933, Russian Federation

Abstract: For distance learning systems, consideration is given to the generation of individual tests with minimization of execution time. As a criterion, the convolution of two weighted normalized values is used, associated with the deviation of the complexity of the generated test from a specified level and with the quantile of the test execution time. The gamma distribution is used to model a student's random response time to a task. An algorithm is proposed for estimating the parameters of the gamma distribution for each task. It is assumed that task complexities are determined either by an expert or by corresponding algorithms based on the Rasch model. The results of a numerical experiment are presented.

Keywords: distance learning system; statistical analysis; adaptive systems; quantile optimization
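The two ingredients of the abstract, per-task gamma response-time models and a quantile-based criterion, can be sketched in a few lines. This is an illustrative sketch only, not the authors' algorithm: the method-of-moments fit, the Monte-Carlo quantile, and all names and parameters are assumptions made here.

```python
import random
import statistics

def fit_gamma_moments(times):
    # Hypothetical method-of-moments fit: shape k and scale theta
    # recovered from the sample mean and variance of response times.
    m = statistics.mean(times)
    v = statistics.variance(times)
    theta = v / m
    return m / theta, theta            # (shape k, scale theta)

def total_time_quantile(tasks, level=0.9, n_sim=20000, seed=0):
    # Monte-Carlo estimate of the level-quantile of the total test time,
    # modeled as a sum of independent gamma response times.
    rng = random.Random(seed)
    totals = sorted(
        sum(rng.gammavariate(k, theta) for k, theta in tasks)
        for _ in range(n_sim)
    )
    return totals[int(level * n_sim)]

def criterion(tasks, complexities, target, budget, w=0.5, level=0.9):
    # Convolution of two normalized terms: deviation of total complexity
    # from the target level, and the quantile of test execution time
    # normalized by a (hypothetical) time budget.
    dev = abs(sum(complexities) - target) / max(target, 1e-9)
    q = total_time_quantile(tasks, level) / budget
    return w * dev + (1.0 - w) * q
```

A test-assembly procedure would then search over task subsets to minimize `criterion`; the search strategy itself is outside this sketch.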

ON COMPARATIVE EFFICIENCY OF CLASSIFICATION SCHEMES IN AN ENSEMBLE OF DATA SOURCES USING AVERAGE MUTUAL INFORMATION
  • M. M. Lange  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: Given an ensemble of data sources and different fusion schemes, the accuracy of multiclass classification of collections of the source objects is investigated. Using the average mutual information between the datasets of the sources and the set of classes, a new approach is developed for comparing lower bounds on the error probability in two fusion schemes. The authors consider the WMV (Weighted Majority Vote) scheme, which uses a composition of class decisions on the objects of the individual sources, and the GDM (General Dissimilarity Measure) scheme, based on a composition of metrics in the datasets of the sources. For these fusion schemes, the mean values of the average mutual information per source are estimated. It is proved that the mean in the WMV scheme is less than the similar mean in the GDM scheme. As a corollary, the lower bound on the error probability in the WMV scheme exceeds the similar bound in the GDM scheme. This theoretical result is confirmed by experimental error rates in face recognition of HSI color images, which yield the ensemble of H, S, and I sources.

Keywords: multiclass classification; ensemble of sources; fusion scheme; composition of decisions; composition of metrics; average mutual information; error probability
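The two fusion schemes compared above can be illustrated with minimal sketches. These are generic textbook versions under assumptions made here (additive weighting, a shared class set across sources), not the paper's exact constructions.

```python
from collections import defaultdict

def weighted_majority_vote(decisions, weights):
    # WMV fusion: each source casts its class decision with a weight;
    # the class with the largest total weight wins.
    score = defaultdict(float)
    for cls, w in zip(decisions, weights):
        score[cls] += w
    return max(score, key=score.get)

def gdm_classify(dissimilarities, weights):
    # GDM fusion: per-source dissimilarities to each class are combined
    # into a single measure; the class with the smallest combined
    # dissimilarity wins.
    # dissimilarities: list over sources of {class: distance} dicts.
    classes = dissimilarities[0].keys()
    combined = {c: sum(w * d[c] for d, w in zip(dissimilarities, weights))
                for c in classes}
    return min(combined, key=combined.get)
```

Note the structural difference the paper exploits: WMV fuses hard decisions (one label per source), while GDM fuses the underlying metric information before any decision is taken.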

DATA MODEL SELECTION IN MEDICAL DIAGNOSTIC TASKS
  • M. P. Krivenko  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: Effective solution of medical diagnostics tasks requires the use of complex probabilistic models which adequately describe real data and permit the use of analytical methods of supervised learning classification. Choosing a model of a mixture of normal distributions solves the posed problems but leads to the curse of dimensionality. The transition to a model of a mixture of probabilistic principal component analyzers makes it possible to formally pose the task of choosing its structural parameters. It is proposed to carry out the search by combining information criteria for forming initial approximations with subsequent refinement of the resulting estimates. Using the example of experiments on diagnosing liver diseases and predicting the chemical composition of urinary stones, the capabilities of the described data analysis procedures are demonstrated. The proposed solutions provide a means of improving classification accuracy and give experts in the subject area an impetus to clarify the essence of the underlying processes.

Keywords: medical diagnostics; mixture of probabilistic principal component analyzers; model selection criterion; cross validation

RESEARCH OF THE POSSIBILITY TO FORECAST CHANGES IN FINANCIAL STATE OF A CREDIT ORGANIZATION ON THE BASIS OF PUBLIC FINANCIAL STATEMENTS
  • Yu. I. Zhuravlev  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation, M. V. Lomonosov Moscow State University, 1-52 Leninskie Gory, GSP-1, Moscow 119991, Russian Federation
  • O. V. Sen'ko  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • N. N. Bondarenko  M. V. Lomonosov Moscow State University, 1-52 Leninskie Gory, GSP-1, Moscow 119991, Russian Federation
  • V. V. Ryazanov  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • A. A. Dokukin  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • A. P. Vinogradov  Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: A mathematical model for forecasting license revocation of a credit organization within a 6-month period on the basis of public financial statements is considered. The model represents an ensemble of combinatorial and logical methods and decision trees of different types. Its effectiveness estimated by ROC AUC (area under the receiver operating characteristic curve) is 0.74. The model allows distinguishing groups of credit organizations with higher and lower license revocation risks. Also, a ranking of different financial statement indicators has been performed, which highlighted the importance of liquid and highly liquid assets.

Keywords: forecasting; algorithm ensembles; financial state; credit organization
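The ROC AUC figure quoted above can be computed from model scores via the rank (Mann-Whitney) formulation. A minimal sketch, unrelated to the authors' specific ensemble:

```python
def roc_auc(scores, labels):
    # AUC as the probability that a randomly chosen positive example
    # (label 1) is scored above a randomly chosen negative one
    # (label 0), with ties counted as one half.
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.74, as reported, means a revoked-license bank outscores a surviving bank about 74% of the time under the model's risk score.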

THEORETICAL FOUNDATIONS OF CONTINUOUS VaR CRITERION OPTIMIZATION IN THE COLLECTION OF MARKETS
  • G. A. Agasandyan  A. A. Dorodnicyn Computing Center, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 40 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The work continues the study of the use of the continuous VaR criterion (CC-VaR) in financial markets. The application of CC-VaR in a collection of theoretical markets of different dimensions that are mutually connected by their underliers is considered. In a typical model of a collection consisting of one two-dimensional market and two one-dimensional markets, the most general case of their joint functioning is examined.
A rule for constructing a combined portfolio that is optimal on CC-VaR in these markets is presented. This rule is based on the imbalance of relative returns between markets while maintaining optimality on CC-VaR. The optimal combined portfolio with three components is constructed from basis instruments of all markets, using ideas of randomization in their composition. Idealistic and surrogate versions of this combined portfolio are also presented; they are useful for testing all algorithmic calculations and for graphic illustration of the portfolio's payoff functions.
The model can be extended without fundamental difficulty to markets of higher dimensions. Two truncated variants of the problem setting, excluding either one of the one-dimensional markets or the two-dimensional market, are also fully justified.

Keywords: underliers; risk preferences function; continuous VaR criterion; cost and forecast densities; return relative function; Neyman-Pearson procedure; combined portfolio; randomization; surrogate portfolio; idealistic portfolio

THE OUTPUT STREAMS IN THE SINGLE SERVER QUEUEING SYSTEM WITH A HEAD OF THE LINE PRIORITY
  • V. G. Ushakov  Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, Moscow 119991, GSP-1, Russian Federation, Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • N. G. Ushakov  Institute of Microelectronics Technology and High-Purity Materials of the Russian Academy of Sciences, 6 Academician Osipyan Str., Chernogolovka, Moscow Region 142432, Russian Federation, Norwegian University of Science and Technology, 15A S. P. Andersensvei, Trondheim 7491, Norway

Abstract: The paper studies a single server queueing system with two types of customers, head of the line priority, and an infinite number of positions in the queue. The arrival stream of customers of each type is a Poisson stream.
Each type has its own generally distributed service time. The main result is the Laplace-Stieltjes transform of the one- and two-dimensional stationary distribution functions of the interdeparture time for each type of customers. The analysis of the output process is carried out by the method of embedded Markov chains. As embedded times, the successive moments of service completion of customers of the same type are selected. From the practical perspective, an accurate characterization of the interdeparture time process is necessary when studying open networks of queues.

Keywords: output stream; head of the line priority; embedded Markov chain; single server

THE MEAN SQUARE RISK OF NONLINEAR REGULARIZATION IN THE PROBLEM OF INVERSION OF LINEAR HOMOGENEOUS OPERATORS WITH A RANDOM SAMPLE SIZE
  • O. V. Shestakov  Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, Moscow 119991, GSP-1, Russian Federation, Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The problems of constructing estimates from observations which represent a linear transformation of the initial data arise in many application areas, such as computed tomography, optics, plasma physics, and gas dynamics. In the presence of noise in the observations, as a rule, it is necessary to apply regularization methods. Recently, methods of threshold processing of wavelet expansion coefficients have become popular, because such methods are simple, computationally efficient, and able to adapt to functions which have different degrees of regularity in different areas. The analysis of the errors of these methods is an important practical task, since it allows assessing the quality of both the methods themselves and the equipment used. When using threshold processing methods, it is usually assumed that the number of expansion coefficients is fixed and the noise distribution is Gaussian. This model is well studied in the literature, and optimal threshold values have been calculated for different classes of signal functions. However, in some situations, the sample size is not known in advance and has to be modeled by a random variable. In this paper, the author considers a model with a random number of observations containing Gaussian noise and estimates the order of the mean-square risk with an increasing sample size.

Keywords: wavelets; threshold processing; linear homogeneous operator; random sample size; mean square risk
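The kind of threshold processing discussed above can be illustrated with a pure-Python sketch: an orthonormal Haar decomposition plus soft thresholding at the universal level sigma*sqrt(2 ln n). This is the standard textbook scheme, assumed here for illustration; it is not tied to the paper's random-sample-size model or to its operator-inversion setting.

```python
import math

def haar_transform(x):
    # Full orthonormal Haar decomposition of a signal of length 2**J.
    # Returns [approximation coefficient, detail coefficients ...].
    detail_all, approx = [], list(x)
    while len(approx) > 1:
        s2 = math.sqrt(2.0)
        pairs = range(len(approx) // 2)
        detail = [(approx[2*i] - approx[2*i+1]) / s2 for i in pairs]
        approx = [(approx[2*i] + approx[2*i+1]) / s2 for i in pairs]
        detail_all = detail + detail_all
    return approx + detail_all

def soft_threshold(coeffs, sigma, n):
    # Universal threshold t = sigma * sqrt(2 ln n), applied to every
    # detail coefficient; the approximation coefficient is kept as is.
    t = sigma * math.sqrt(2.0 * math.log(n))
    return [coeffs[0]] + [math.copysign(max(abs(c) - t, 0.0), c)
                          for c in coeffs[1:]]
```

Because the Haar transform is orthonormal, it preserves signal energy, and Gaussian noise stays Gaussian in the coefficient domain, which is what makes the thresholding risk analysis tractable.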

MIXED POLICIES FOR ONLINE JOB ALLOCATION IN ONE CLASS OF SYSTEMS WITH PARALLEL SERVICE
  • M. G. Konovalov  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • R. V. Razumchik  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation, Peoples' Friendship University of Russia (RUDN University), 6 Miklukho-Maklaya Str., Moscow 117198, Russian Federation

Abstract: Consideration is given to the problem of efficient job allocation in the class of systems with parallel service on independently working single-server stations each equipped with the infinite capacity queue. There is one dispatcher which routes jobs, arriving one by one, to servers. The dispatcher does not have a queue to store the jobs and, thus, the routing decision must be made on the fly. No jockeying between servers is allowed and jobs cannot be rejected. For a job, there is the soft deadline (maximum waiting time in the queue). If the deadline is violated, a fixed cost is incurred and the job remains in the system and must be served. The goal is to find the job allocation policy which minimizes both the job's stationary response time and probability of job's deadline violation. Based on simulation results, it is demonstrated that the goal may be achieved (to some extent) by adopting a mixed policy, i. e. a proper dispatching rule and the service discipline in the server.

Keywords: parallel service; dispatching policy; service discipline; sojourn time; deadline violation
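A toy version of the dispatching setting can be simulated to see why the routing rule matters. This sketch is an assumption-laden illustration (Poisson arrivals, exponential service, FCFS stations, full workload information at the dispatcher), not the authors' model or their mixed policy; only the structural constraints from the abstract are kept: jobs are routed on the fly, no jockeying, no rejection, and waiting beyond a soft deadline is counted as a cost event.

```python
import random

def simulate(policy, n_jobs=20000, n_servers=3, lam=1.0, mu=0.5,
             deadline=5.0, seed=1):
    # free_at[k] is the time at which server k clears its backlog.
    rng = random.Random(seed)
    free_at = [0.0] * n_servers
    t = total_resp = 0.0
    violations = 0
    for _ in range(n_jobs):
        t += rng.expovariate(lam)            # next arrival epoch
        k = policy(free_at, rng)             # on-the-fly routing decision
        start = max(t, free_at[k])
        wait = start - t
        service = rng.expovariate(mu)
        free_at[k] = start + service         # FCFS at the station
        total_resp += wait + service
        violations += wait > deadline        # soft-deadline cost event
    return total_resp / n_jobs, violations / n_jobs

def random_policy(free_at, rng):
    return rng.randrange(len(free_at))

def least_loaded(free_at, rng):
    return min(range(len(free_at)), key=free_at.__getitem__)
```

With load spread blindly (`random_policy`) versus by remaining workload (`least_loaded`), both the mean response time and the deadline-violation frequency differ markedly, which is the trade-off the mixed policies in the paper are designed to balance.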

DISCRETE-TIME Geo/G/1/∞ LIFO QUEUE WITH RESAMPLING POLICY
  • L. A. Meykhanadzhyan  Financial University under the Government of the Russian Federation, 49 Leningradsky Prosp., Moscow 125993, Russian Federation
  • R. V. Razumchik  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation, Peoples' Friendship University of Russia (RUDN University), 6 Miklukho-Maklaya Str., Moscow 117198, Russian Federation

Abstract: Consideration is given to the problem of estimating the true stationary mean response time in a discrete-time single-server queue of infinite capacity, with Bernoulli input, round-robin scheduling, and inaccurate information about the service time distribution, which is assumed to be general arithmetic. It is shown that an upper bound for the true value may be provided by the mean response time in the discrete-time single-server queue with the LIFO (last in, first out) service discipline and resampling policy. The latter implies that a customer arriving at a nonidle system assigns a new remaining service time to the customer in the server. For the case when the true service time distribution is geometric and the error in the service times is of multiplicative type, conditions are provided which, when satisfied, guarantee that the proposed method yields the upper bound across all possible values of the system's load.

Keywords: discrete time; inverse service order; inaccurate service time; round robin scheduling; resampling policy
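The LIFO-with-resampling mechanism described above can be simulated directly in discrete time. This is a sketch under assumptions made here (Bernoulli arrivals, geometric service, a particular within-slot event order), not the paper's analytical model. One useful observation: resampling a geometric remaining service time leaves its distribution unchanged by memorylessness, which is part of what makes the geometric case tractable.

```python
import random

def geometric(rng, q):
    # Service time in slots: trials until first success, success prob q.
    n = 1
    while rng.random() >= q:
        n += 1
    return n

def lifo_resampling_mean_response(p, q, n_slots=200000, seed=7):
    # Discrete-time sketch of a Geo/Geo/1 LIFO queue with resampling:
    # each slot brings an arrival w.p. p; an arrival to a nonidle system
    # draws a fresh remaining service time for the customer in service.
    rng = random.Random(seed)
    stack = []                 # waiting customers (arrival slots), LIFO
    in_service = None          # (arrival_slot, remaining_slots)
    total = done = 0
    for t in range(n_slots):
        if rng.random() < p:                       # arrival
            if in_service is not None:             # resampling policy
                in_service = (in_service[0], geometric(rng, q))
            stack.append(t)
        if in_service is None and stack:
            in_service = (stack.pop(), geometric(rng, q))
        if in_service is not None:
            a, r = in_service
            r -= 1
            if r == 0:
                total += t + 1 - a                 # response time
                done += 1
                in_service = None
            else:
                in_service = (a, r)
    return total / max(done, 1)
```

At load p/q = 0.4 the observed mean response time settles near the Geo/Geo/1 value, consistent with the stability of the sketch.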

NUMERICAL SCHEMES OF MARKOV JUMP PROCESS FILTERING GIVEN DISCRETIZED OBSERVATIONS I: ACCURACY CHARACTERISTICS
  • A. V. Borisov  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: This paper is the first in a series devoted to the numerical realization of optimal state filtering of Markov jump processes given indirect observations corrupted by additive and/or multiplicative Wiener noise. The problem is solved by time discretization of the observations with their subsequent processing.
Both the optimal and suboptimal estimates are expressed in terms of multiple integrals of Gaussian densities with some mixing distributions. The author investigates the influence of various numerical integration schemes on the accuracy of the approximating estimates. The problem reduces to characterizing the distance between stochastic sequences generated by certain recursions. The paper introduces a pseudometric describing this distance and presents a proposition determining the influence of this characteristic on both the local and global accuracy of the filtering estimate approximation.

Keywords: Markov jump process; optimal filtering; additive and multiplicative observation noises; stochastic differential equation; analytical and numerical approximation

ON THE REPRESENTATION OF GAMMA-EXPONENTIAL AND GENERALIZED NEGATIVE BINOMIAL DISTRIBUTIONS
  • A. A. Kudryavtsev  Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, M. V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, GSP-1, Moscow 119991, Russian Federation

Abstract: For more than a century and a half, gamma-type distributions have proved their adequacy in modeling real processes and phenomena. Over time, constructions using distributions from the gamma family have become more complex in order to improve the applicability of mathematical models to the relevant aspects of life. The paper presents a number of results both generalizing and simplifying some classical forms used in the analysis of scale and structural mixtures of generalized gamma laws. The gamma-exponential distribution is introduced and its characteristics are described. An explicit form for integral representations of the partial probabilities of the generalized negative binomial distribution is given. The results are formulated in terms of the gamma-exponential function. The obtained results can be widely used in models that employ scale and structural mixtures of distributions with positive unbounded support to describe processes and phenomena.

Keywords: gamma exponential function; generalized gamma distribution; generalized negative binomial distribution; gamma-exponential distribution; mixed distributions
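For reference, the generalized gamma law underlying the mixtures discussed above is usually written in Stacy's standard form; the paper's own parametrization may differ:

```latex
% Stacy's generalized gamma density (standard form): shape d > 0,
% power p > 0, scale a > 0.
f(x) = \frac{p}{a^{d}\,\Gamma(d/p)}\, x^{d-1} e^{-(x/a)^{p}}, \qquad x > 0.
```

Setting p = 1 recovers the ordinary gamma density, and d = p recovers the Weibull density, which is why this family serves as a common umbrella for scale and structural mixing.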

CONCEPTS FORMING ON THE BASIS OF SMALL SAMPLES
  • A. A. Grusho  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • M. I. Zabezhailo  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • N. A. Grusho  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • E. E. Timonina  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: Monitoring systems for the information security of information systems obtain information in the form of chains of short messages, which can be considered as chains of small samples. Often, owing to the inertia of information systems, these chains reflect close statuses of the computing system or network. In the paper, it is supposed that the work of the system can be represented as a finite set of modes, which are called concepts. Violations of security are detected by means of anomalies associated with the emergence of new concepts. The known technologies of anomaly identification are based on creating a model of normal system behavior. Concepts correspond to normal types of system behavior. In the paper, the problem of forming concepts by machine learning from chains of small samples is considered. An algorithm for concept formation is constructed and its efficiency is proved.

Keywords: information security monitoring; small samples; small sample learning; concepts forming

USING METADATA TO IMPLEMENT MULTILEVEL SECURITY POLICY REQUIREMENTS
  • A. A. Grusho  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • N. A. Grusho  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • E. E. Timonina  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: A distributed information computing system whose objects contain both valuable information (or are themselves valuable) and open (non-valuable) information is considered. To protect valuable information, a multilevel security (MLS) policy is used that prohibits information flows from objects with valuable information to objects with open information. Objects with valuable information form a class of high-level objects, and objects with open information form a class of low-level objects. Metadata is created to manage network connections. Metadata is a simplification of mathematical models of business processes and is the basis of a permission system for host connections in a distributed information computing system. The paper constructs MLS security policy rules and, based on the metadata-related infrastructure, shows that this security policy can be implemented in the distributed information computing system. The only trusted process required to implement the MLS security policy is at the connection management level. This layer is unrelated to the data plane and can be isolated to ensure its information security.

Keywords: MLS security policy; information flows; metadata

TEMPORAL DATA IN LEXICOGRAPHIC DATABASES
  • A. A. Goncharov  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • I. M. Zatsman  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • M. G. Kruzhkov  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The paper describes an approach to the design of a Lexicographic Knowledge Base (LKB), which aims to fulfill two interrelated tasks: (i) goal-oriented development of linguistic typologies; and (ii) creation and updating of electronic bilingual dictionaries based on the developed typologies. In the LKB, some of the fields assigned by lexicographers to represent new knowledge on word meanings (concepts) and translations are temporal. The content of these fields is time-dependent because lexicographers can change the descriptions of concepts over time.
These changes may involve not only changes to existing concepts and the appropriate fields with their descriptions (definitions), but also changes to the structure of individual dictionary entries of a bilingual dictionary. One of the LKB's goals is to provide a method to describe these changes consistently. Although the LKB's initial design relies on the existing structure of dictionary entries, this structure may evolve along with the appropriate concepts as the underlying linguistic typologies and dictionary entries evolve with time. The goal of this paper is to describe the temporal structure of a dictionary entry and the approach to the design of an LKB with temporal data that will be able to fulfill both of the specified tasks. The proposed approach is illustrated by an example of the temporal dictionary entry structure of a German-Russian dictionary.

Keywords: lexicographic knowledge base; temporal structure of a dictionary entry; bilingual dictionaries; linguistic typologies; parallel texts; evolution of concepts

DIGITAL ENCODING OF CONCEPTS
  • I. M. Zatsman  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The tasks of encoding concepts of human knowledge in the digital medium of computers and networks are of particular relevance in connection with the widespread use of artificial intelligence systems. As the scope of their applications expands, the range of categories of encoded concepts is increasing.
In addition to conventional concepts, which have stable forms of presentation, for example, as words of natural languages, it is often necessary to encode personal and collective concepts in the digital medium. Moreover, it is sometimes necessary to take into account the degree of their socialization (Wierzbicki and Nakamori's term) and to reflect the dynamics of their change over time, as well as the stages of their transformation into conventional concepts. In the time dimension, the spectrum of scales for describing the dynamics of concepts of human knowledge has expanded. While the dynamics of concepts was earlier measured on scales of hundreds or tens of years (less often, with accuracy up to a year or a month), personal and collective concepts require a scale that captures their dynamics down to days, and sometimes hours and minutes.
The goal of the paper is to describe the asymmetry problem encountered in the process of encoding concepts in the digital medium. This asymmetry significantly complicates the representation of human knowledge in artificial intelligence systems. To solve the problem, it is proposed to encode simultaneously both the concepts of the listed categories and the forms of their expression in the digital medium. The proposed approach is illustrated by the example of an intelligent vocabulary system that encodes both concepts and words, the latter being verbal forms of concept representation.

Keywords: knowledge encoding; polyadic computing; digital medium; artificial intelligence; categories of concepts; socialization of knowledge concepts

UNDERSTANDING OF COMPLEX SYSTEMS USING THE LAWS OF SYNERGETICS AND INFORMATICS
  • R. B. Seyful-Mulyukov  Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The author shows how the laws of informatics and synergetics can be used to explain the genesis and evolution of such a complex natural system as petroleum. When the matrix of hydrocarbon molecules is constructed using the laws of informatics, these laws imply ambiguity in the quantum behavior of the electrons. This dynamic and static uncertainty comes into play during the oil field location process. Consideration is given to the laws of synergetics, which demonstrate the self-organization ability of the molecules. A new type of molecules is formed in the hydrocarbon fluid near the bifurcation points, associated with variations in the thermodynamics, structure, and composition of the geological environment. In the analysis of the petroleum formation process, consideration is also given to the notion of an attractor. It serves as the basin of attraction for all hydrocarbon molecules, in which the exact molecular composition of the petroleum is formed.

Keywords: synergetics; petroleum formation; informatics and oil location; bifurcation and composition of hydrocarbon molecules; attractor and petroleum molecular composition

 
