Institute of Informatics Problems, Russian Academy of Sciences

«INFORMATICS AND APPLICATIONS»
Scientific journal
Volume 8, Issue 4, 2014

Abstract and Keywords.

JOINT STATIONARY DISTRIBUTION OF THE NUMBER OF CUSTOMERS IN THE SYSTEM AND REORDERING BUFFER IN THE MULTISERVER REORDERING QUEUE

  • A. V. Pechinkin  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • R. V. Razumchik  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The paper considers a continuous-time multiserver queueing system with a buffer of infinite capacity and reordering. A Poisson flow of customers arrives at the system. Service times of customers at each server are exponentially distributed with the same parameter. Each customer obtains a sequential number upon arrival. The order of customers upon arrival should be preserved upon departure from the system. Customers whose service has finished but whose departure would violate the order are kept in a reordering buffer of infinite capacity. The joint stationary distribution of the number of customers in the buffer, at the servers, and in the reordering buffer is obtained in terms of a computational algorithm and a generating function. A numerical example is provided.

Keywords:  queueing system; reordering; infinite capacity; joint distribution
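
The reordering rule in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm for the stationary distribution; it only shows, for a given list of service completion times, when each customer may leave and how many finished customers are waiting in the reordering buffer at a given time. The function names and the sample times are hypothetical.

```python
def release_times(service_done):
    """service_done[i] is the time at which customer i (in arrival order)
    finishes service. A customer may depart only after every earlier
    customer has finished, so its departure time is a running maximum."""
    released, latest = [], float("-inf")
    for done in service_done:
        latest = max(latest, done)
        released.append(latest)
    return released


def reordering_buffer_size(service_done, t):
    """Number of customers that have finished service by time t
    but are still held back because an earlier customer is not done."""
    released = release_times(service_done)
    return sum(1 for done, out in zip(service_done, released) if done <= t < out)


# Hypothetical completion times for four customers in arrival order.
times = [5.0, 2.0, 3.0, 7.0]
print(release_times(times))
print(reordering_buffer_size(times, 4.0))
```

In this example the second and third customers finish early and wait in the reordering buffer until the first customer completes service.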

A MODIFIED GRID METHOD FOR STATISTICAL SEPARATION OF NORMAL VARIANCE-MEAN MIXTURES

  • V. Yu. Korolev  Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, Moscow 119991, GSP-1, Russian Federation, Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • A. Yu. Korchagin  Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, Moscow 119991, GSP-1, Russian Federation

Abstract: A modified two-stage grid method for statistical separation of normal variance-mean mixtures is described as an alternative to a pure EM (expectation-maximization) algorithm. At the first stage of this algorithm, a discrete approximation to the mixing distribution is constructed. At the second stage, the obtained discrete distribution is approximated by an absolutely continuous distribution from a predetermined family, for example, by a generalized inverse Gaussian distribution. The convergence of this two-stage procedure is discussed. The monotonicity of the grid procedure used at the first stage is proved. The problem of the optimal choice of the parameters of the method is discussed in detail. In particular, the problem of the optimal choice of the grid placed on the support of the mixing distribution is considered. Statistical estimators are proposed for the quantiles of the mixing law. The efficiency of the method is illustrated by examples of its application to the estimation of the parameters of generalized hyperbolic distributions.

Keywords:  mixture of probability distributions; normal variance-mean mixture; generalized hyperbolic distribution; EM-algorithm; grid method of separation of mixtures
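
As a rough illustration of the first stage, the sketch below estimates weights on a fixed grid of mixing values with standard EM updates. It is a generic grid EM step, not the modified method of the paper, and it makes several assumptions that are not in the abstract: the location and drift parameters `alpha` and `beta` are known, component j is the normal law N(alpha + beta*u_j, u_j), and the data and grid values are made up.

```python
import math


def normal_pdf(x, mean, var):
    """Density of the normal law with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def grid_em_weights(data, grid, alpha=0.0, beta=0.0, iters=50):
    """Estimate weights of a discrete mixing law supported on `grid`.

    Only the weights are updated; the grid points stay fixed.
    Returns the weights and the log-likelihood before each update.
    """
    k = len(grid)
    weights = [1.0 / k] * k
    # Component densities are computed once because the grid is fixed.
    dens = [[normal_pdf(x, alpha + beta * u, u) for u in grid] for x in data]
    history = []
    for _ in range(iters):
        totals = [sum(w * d for w, d in zip(weights, row)) for row in dens]
        history.append(sum(math.log(t) for t in totals))
        weights = [
            sum(weights[j] * row[j] / t for row, t in zip(dens, totals)) / len(data)
            for j in range(k)
        ]
    return weights, history


# Hypothetical sample and grid of variances.
sample = [-2.0, -0.5, 0.1, 0.3, 1.5, 4.0, -3.2, 0.0]
w, loglik = grid_em_weights(sample, [0.5, 1.0, 4.0])
print(w)
```

The recorded log-likelihood values should not decrease from one iteration to the next, which is the kind of monotonicity property the abstract refers to for the grid procedure.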

ON THE FORMALIZATION OF ORDER FLOW TOXICITY ON FINANCIAL MARKETS

  • A. V. Chertok  Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, GSP-1, Moscow 119991, Russian Federation, Euphoria Group LLC, 9, bld. 1, of. 6 Arkhangelsky Lane, Moscow 101000, Russian Federation

Abstract: The paper considers a microstructural order flow model for financial markets. The order flow imbalance process is used as an integral indicator of the current state of the limit-order book. The model of order flow imbalance is used to analyze the properties of the current limit-order book state, which is treated as a two-sided risk process with stochastic premiums. The concept of order flow toxicity on financial markets is studied. This concept is formalized through the probabilities that the order flow imbalance process crosses fixed levels. The paper introduces the concepts of the instantaneous toxicity profile and of Bayesian and quantile indicators of toxicity. These indicators are calculated for two model types of order flows: the first consists of unit-volume orders, and the second consists of orders whose random volume has an exponential distribution.

Keywords:  financial markets; limit-order book; order flow; order flow imbalance; adverse selection; order flow toxicity; Poisson process; compound Poisson process; two-sided risk process; risk process with stochastic premiums; ruin probability
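
A level-crossing probability of the kind mentioned in the abstract can be sketched for the simplest case. The example below is not the paper's toxicity indicator. It assumes unit-volume buy and sell orders arriving as independent Poisson flows, so that the imbalance is a simple random walk observed at order arrivals, and it applies the classical gambler's ruin formula. The function name and the rates are hypothetical.

```python
def prob_hit_upper_first(buy_rate, sell_rate, upper, lower):
    """Probability that the imbalance, started at 0, reaches +upper
    before -lower, where upper and lower are positive integers.

    Each arriving order is a buy with probability
    buy_rate / (buy_rate + sell_rate), so the classical
    gambler's ruin formula applies.
    """
    if buy_rate == sell_rate:
        return lower / (upper + lower)
    r = sell_rate / buy_rate  # ratio q/p of the embedded random walk
    return (1 - r ** lower) / (1 - r ** (upper + lower))


# Hypothetical rates: buys arrive twice as often as sells.
print(prob_hit_upper_first(2.0, 1.0, upper=3, lower=3))
```

With equal rates the result depends only on the distances to the two levels; a larger buy rate shifts the probability toward the upper level.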

ASYMPTOTIC PROPERTIES OF RISK ESTIMATE IN THE PROBLEM OF RECONSTRUCTING IMAGES WITH CORRELATED NOISE BY INVERTING THE RADON TRANSFORM

  • A. A. Eroshenko  Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, Moscow 119991, GSP-1, Russian Federation, Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • O. V. Shestakov  Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, Moscow 119991, GSP-1, Russian Federation

Abstract: In recent years, wavelet methods based on the decomposition of projections in a special basis followed by a thresholding procedure have become widely used for solving problems of tomographic image reconstruction. These methods are easily implemented through fast algorithms, which makes them very appealing in practice. In addition, they allow local parts of images to be reconstructed from incomplete projection data, which is essential, for example, in medical applications, where it is undesirable to expose the patient to an excessive radiation dose. Risk analysis of wavelet thresholding is an important practical task because it makes it possible to assess the quality of both the techniques themselves and the equipment used. The present paper considers the problem of estimating a function by inverting the Radon transform in a model of data with correlated noise. The asymptotic properties of the mean-square risk estimate of the wavelet-vaguelette thresholding technique are studied. Conditions under which the unbiased risk estimate is asymptotically normal are given.

Keywords:  linear homogeneous operator; Radon transform; thresholding; unbiased risk estimate; correlated noise; asymptotic normality
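
For orientation, the sketch below shows soft thresholding and the standard unbiased risk estimate for it in the simplest setting. It assumes independent Gaussian noise with known standard deviation `sigma`, which is not the correlated-noise, wavelet-vaguelette setting studied in the paper, and the abstract does not say whether the paper uses hard or soft thresholding. The coefficient values are made up.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t; values with |x| <= t become zero."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def sure_soft(coeffs, t, sigma=1.0):
    """Unbiased estimate of the mean-square risk of soft thresholding
    at level t, assuming independent noise with variance sigma**2."""
    risk = 0.0
    for x in coeffs:
        if abs(x) <= t:
            risk += x * x - sigma ** 2
        else:
            risk += sigma ** 2 + t * t
    return risk


# Hypothetical noisy coefficients.
coeffs = [0.5, 3.0, -0.2, -4.1]
print([soft_threshold(x, 1.0) for x in coeffs])
print(sure_soft(coeffs, 1.0))
```

The estimate uses only the observed coefficients, so it can be evaluated on real data without knowing the true signal.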

THE ANALYSIS OF TAGS IN COVERT CHANNELS

  • A. A. Grusho  Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, Moscow 119991, GSP-1, Russian Federation, Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • N. A. Grusho  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • E. E. Timonina  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: A class of covert channels constructed on the basis of tags is considered. It is assumed that the control subject detects a covert channel using exclusively statistical techniques. This means that linguistic constructions that occur rarely or often are indistinguishable to the control subject. For the control subject, it is important that the sequence transmitted through the channel does not contain bans that are inconsistent with the probability model of legal messages. The main difficulty in keeping such channels invisible is that embedding tags can produce bans of the probability measure describing a legal transmission. The paper proposes a method for creating tags that cannot be detected by the control subject. With tags created in this way, the covert channel becomes invisible.

Keywords: covert channels; information security; covert channel generated by tags; "invisibility" of tags; mathematical models of covert channels

SWITCHING ON OF NEW BANS IN RANDOM SEQUENCES

  • A. A. Grusho  Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, M.V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, Moscow 119991, GSP-1, Russian Federation, Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • N. A. Grusho  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • E. E. Timonina  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The problem of generating one probability measure on the space of infinite sequences over finite alphabets, with the σ-algebra generated by cylindrical sets, from another probability measure on this space is considered. The new probability measure is constructed so as to reduce the set of admissible trajectories of random sequences in a definite way. Inadmissibility of trajectories is defined in terms of specifications of the smallest bans. If a specification of the smallest bans is given, then the cardinalities of the supports of the projections of the new measure can be determined. This yields conditions for constructing several sets of functions. These functions and the projections of the initial measure define a set of measures on finite spaces, which in turn define a unique probability measure on the space of infinite sequences.

Keywords:  random sequences; bans of probability measures; generation of probability measures; statistical problems on random sequences

ABOUT OPTIMUM DELIVERY OF FREIGHTS BY THE VEHICLE TAKING INTO ACCOUNT DEPENDENCE OF COST OF TRANSPORTATIONS ON LOADING OF VEHICLES ON SEVERAL CYCLIC ROUTES

  • E. M. Bronshtein Ufa State Aviation Technical University, 12 K. Marx Str., Ufa 450000, Russian Federation
  • P. A. Zelyov  Ufa State Aviation Technical University, 12 K. Marx Str., Ufa 450000, Russian Federation

Abstract: The problem of constructing a route for freight delivery from one producer (a base or warehouse) to consumers by a vehicle at minimum transportation cost is considered. The dependence of the transportation cost on the vehicle's load and on road quality is taken into account. It is assumed that the vehicle can return to the base to take on additional load. The corresponding mathematical model is constructed; for the case of a linear dependence of cost on load, an integer linear model is obtained. To solve the problem, a modification of the well-known heuristic algorithm of Clarke and Wright is proposed alongside an exact algorithm. A computational experiment has been carried out.

Keywords:  heuristic algorithm; creation of a route; transportation; problem of routing
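
For reference, the sketch below implements the classic Clarke and Wright savings heuristic, which the paper takes as its starting point. It is not the authors' modification: it assumes a symmetric distance matrix, costs that do not depend on the load, and a single capacity limit. The function name and the sample data are hypothetical.

```python
def clarke_wright(dist, demand, capacity):
    """Classic savings heuristic. Node 0 is the base; nodes 1..n-1 are consumers.

    Starts with one route per consumer and merges routes at their endpoints
    in order of decreasing saving s(i, j) = d(0, i) + d(0, j) - d(i, j).
    """
    n = len(dist)
    routes = {i: [i] for i in range(1, n)}   # route id -> ordered consumers
    route_of = {i: i for i in range(1, n)}   # consumer -> route id
    load = {i: demand[i] for i in range(1, n)}
    savings = sorted(
        ((dist[0][i] + dist[0][j] - dist[i][j], i, j)
         for i in range(1, n) for j in range(i + 1, n)),
        reverse=True,
    )
    for s, i, j in savings:
        if s <= 0:
            break  # remaining merges would not shorten the routes
        ri, rj = route_of[i], route_of[j]
        if ri == rj or load[ri] + load[rj] > capacity:
            continue
        a, b = routes[ri], routes[rj]
        # i and j must both be endpoints so that the link i-j can be added.
        if a[-1] == i and b[0] == j:
            merged = a + b
        elif a[0] == i and b[-1] == j:
            merged = b + a
        elif a[0] == i and b[0] == j:
            merged = a[::-1] + b
        elif a[-1] == i and b[-1] == j:
            merged = a + b[::-1]
        else:
            continue
        routes[ri] = merged
        load[ri] += load[rj]
        for c in b:
            route_of[c] = ri
        del routes[rj], load[rj]
    return list(routes.values())


# Hypothetical instance: base at 0, consumers at 1, 2 and -1 on a line.
dist = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 3], [1, 2, 3, 0]]
print(clarke_wright(dist, demand=[0, 1, 1, 1], capacity=2))
```

A load-dependent cost, as in the paper, would change how the saving of a merge is evaluated, since the cost of a leg would then depend on what the vehicle still carries.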

A METHOD OF ENHANCING PROBABILISTIC VERIFICATION EFFICIENCY FOR COMPUTER AND TELECOMMUNICATION SYSTEMS

  • A. M. Mironov  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • S. L. Frenkel  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation, Moscow Institute of Radio, Electronics, and Automation (MIREA), 78 Prosp. Vernadskogo, Moscow 119454, Russian Federation

Abstract: The paper considers the problem of reducing probabilistic transition systems (PTS) in order to lower the complexity of model checking of such systems. The model checking problem for a PTS is to calculate the truth values of formulas of the temporal probabilistic computation tree logic (PCTL) in the initial state of the PTS. The paper introduces the concept of equivalence of states of a PTS and presents an algorithm for removing equivalent states. The result of this algorithm is a PTS whose properties expressed by PCTL formulas coincide with those of the original PTS.

Keywords:  verification; model checking; probabilistic transition systems; probabilistic temporal logic; reduction of probabilistic models
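
One common way to find states that can be merged is partition refinement for probabilistic bisimulation. The sketch below uses that notion as an assumption; the equivalence defined in the paper may differ. States are grouped first by label and then split until states in the same class send the same probability mass to every class. The function name and the example system are hypothetical.

```python
from fractions import Fraction


def bisimulation_classes(labels, trans):
    """Return a dict mapping each state to the id of its equivalence class.

    labels: state -> label (atomic propositions that hold in the state)
    trans:  state -> {successor: probability}
    """
    states = sorted(labels)
    block = {s: labels[s] for s in states}  # initial partition by label
    while True:
        signature = {}
        for s in states:
            mass = {}
            for t, p in trans.get(s, {}).items():
                mass[block[t]] = mass.get(block[t], 0) + p
            # Keeping the old block in the signature means classes only split.
            signature[s] = (block[s], frozenset(mass.items()))
        ids = {}
        new_block = {s: ids.setdefault(signature[s], len(ids)) for s in states}
        if len(set(new_block.values())) == len(set(block.values())):
            return new_block  # no class was split, so the partition is stable
        block = new_block


# Hypothetical four-state system; exact fractions avoid rounding issues.
half = Fraction(1, 2)
labels = {"a": "init", "b": "mid", "c": "mid", "d": "goal"}
trans = {"a": {"b": half, "c": half}, "b": {"d": 1}, "c": {"d": 1}, "d": {"d": 1}}
print(bisimulation_classes(labels, trans))
```

Here states `b` and `c` fall into the same class, so a reduced system could keep just one of them and redirect the incoming probability to it.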

FALSE TEXTS: CLASSIFICATION AND METHODS OF IDENTIFICATION OF TEXT DOCUMENTS WITH IMITATIONS AND SUBSTITUTION OF AUTHORSHIP

  • M. Yu. Mikheev Research Computer Center, M.V. Lomonosov Moscow State University (MGU NIVC), 1-52 Leninskiye Gory, GSP-1, Moscow 119991, Russian Federation, Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • N. V. Somin Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • I. V. Galina Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • O. V. Zolotaryev Russian New University, 22 Radio Str., Moscow 105005, Russian Federation
  • E. B. Kozerenko Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • Yu. I. Morozova Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • M. M. Charnine Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The modern textual space, including the Internet, is enormous and is constantly updated with new texts. All text documents can be divided into two large groups: "good texts" and what might be called "false texts." The production of false texts has become so massive that there is an urgent need to study this phenomenon and to develop effective methods for detecting such documents. The purpose of the paper is to give an adequate description of the concept of a false text as an informational and linguistic phenomenon and to suggest some approaches to the identification of such texts.

Keywords:  text generation; natural language processing; statistical analysis of language objects; plagiarism; typology of false texts

A VISUALIZATION OF ESTIMATORS IN THE METHOD OF MOVING SEPARATION OF MIXTURES

  • A. K. Gorshenin  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation, Moscow Institute of Radio, Electronics, and Automation (MIREA), 78 Prosp. Vernadskogo, Moscow 119454, Russian Federation

Abstract: The method of moving separation of mixtures (MSM method) is a powerful tool for analyzing different stochastic processes. By combining the MSM method with expert interpretation of the results obtained by iterative numerical procedures, a number of important results have been achieved in the physics of turbulent plasma, and several mathematical models of the functioning of financial markets have been refined. In most cases, research teams present results in a form that is convenient only for themselves, and it is difficult for experts to compare and interpret the results, especially when the model is tested on fundamentally dissimilar samples from different subject areas. The paper presents a visualization tool for displaying parameter estimates independently of the numerical methods used. The tool is convenient for researchers and experts.

Keywords:  method of moving separation of mixtures; user interface; normal mixtures; probabilistic models; data mining

REGARDING ERGONOMIC DEPENDENCES BETWEEN SITUATIONAL HALL PARAMETERS USING COLLECTIVE CURVED SCREEN

  • A. A. Zatsarinnyy Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • K. G. Chuprakov  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: The paper presents an approach to determining dependences between such parameters of a situational hall as the dimensions of the hall, the number of people working with the screen, the information capacity of the content (the number of symbols), and the screen width. These dependences make it possible to calculate an unknown parameter of a situational hall from known parameters that satisfy the requirements of Russian and international ergonomic standards. The presented formulas are applicable to curved screens through the angle of curvature β (for a flat screen, β = 0). This parameter may be interpreted as the angle between displays in a polyscreen or a videowall. It also makes it possible to evaluate the efficiency of curved screens as a collective screen compared to flat screens. The paper also suggests an approach to estimating the number of workplaces that can be accommodated for different arrangements of the workplaces.

Keywords:  collective curved screen; situational hall; dispatch room; ergonomic dependences; comfort observation area; curve angle; videowall; polyscreen; efficiency; price justification

METHODS OF ENTITY RESOLUTION AND DATA FUSION IN THE ETL-PROCESS AND THEIR IMPLEMENTATION IN THE HADOOP ENVIRONMENT

  • A. E. Vovchenko  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • L. A. Kalinichenko  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation, Faculty of Computational Mathematics and Cybernetics, M. V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, GSP-1, Moscow 119991, Russian Federation
  • D. Yu. Kovalev  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: Extracting entities, transforming them, and loading them into an integrated repository are the main problems of data integration. These actions are part of the ETL (extract-transform-load) process. An entity is a digital representation of a real-world object (for example, information about a person). Entity resolution covers duplicate detection, deduplication, record linkage, object identification, reference matching, and other ETL-related tasks. After the entity resolution step, related entities should be merged into one reference entity containing information from all of them. This data fusion is the final step in the data integration process. The paper gives an overview of entity resolution and data fusion methods. It also presents techniques for programming these methods in order to implement the ETL process in the Hadoop environment. This part of the paper uses the High-Level Integration Language (HIL), a declarative language that focuses on resolution and fusion of entities in the Hadoop infrastructure.

Keywords: data integration; ETL; entity resolution; data fusion; big data; Hadoop; Jaql; HIL
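
The two steps described in this abstract can be illustrated in plain Python. The sketch below is not HIL or Hadoop code and does not reproduce the methods surveyed in the paper. It assumes exact matching on normalized key fields for entity resolution and a "first non-empty value wins" rule for data fusion. The field names and records are made up.

```python
def normalize(text):
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def resolve_and_fuse(records, key_fields=("name", "birth_year")):
    """Group records that share normalized key fields (entity resolution),
    then merge each group into one reference entity (data fusion)."""
    groups = {}
    for rec in records:
        key = tuple(normalize(str(rec.get(f, ""))) for f in key_fields)
        groups.setdefault(key, []).append(rec)
    fused = []
    for recs in groups.values():
        entity = {}
        for rec in recs:
            for field, value in rec.items():
                # Keep the first non-empty value seen for each field.
                if value not in (None, "") and field not in entity:
                    entity[field] = value
        fused.append(entity)
    return fused


# Hypothetical person records from two sources.
records = [
    {"name": "Ivan  Petrov", "birth_year": 1980, "email": None},
    {"name": "ivan petrov", "birth_year": 1980, "email": "ip@example.org"},
    {"name": "Anna Smirnova", "birth_year": 1975, "email": ""},
]
print(resolve_and_fuse(records))
```

Real entity resolution usually needs approximate matching and conflict-handling rules; exact matching on normalized keys is used here only to keep the example short.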

CONCEPTUAL MODELING OF MULTIDIALECT WORKFLOWS

  • L. A. Kalinichenko  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation, Faculty of Computational Mathematics and Cybernetics, M. V. Lomonosov Moscow State University, 1-52 Leninskiye Gory, GSP-1, Moscow 119991, Russian Federation
  • S. Stupnikov  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • A. E. Vovchenko  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
  • D. Yu. Kovalev  Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation

Abstract: This paper contributes to the techniques for conceptual representation of data analysis algorithms and data integration facilities as well as processes to specify data and behavior semantics in one paradigm. The paper extends an investigation of a novel approach that applies a combination of semantically different platform-independent rule-based languages (dialects) to interoperable conceptual specifications over various rule-based systems (RSs), relying on the rule-based program transformation technique recommended by the W3C Rule Interchange Format (RIF). This approach is combined with facilities for semantic rule-based mediation intended for heterogeneous database integration. The paper extends the authors' previous research in the direction of workflow modeling for defining compositions of algorithmic modules in a process structure. A capability of multidialect workflow support is presented, in which each task is specified in the semantically different language best suited to its orientation. A practical workflow use case is introduced whose interoperating tasks are specified in several rule-based languages (RIF-CASPD, RIF-BLD, RIF-PRD). In addition, OWL 2 is used for the conceptual schema definition, and RIF-PRD is also used for the workflow orchestration. The use case implementation infrastructure includes a production rule-based system (IBM ILOG), a logic rule-based system (DLV), and a mediation system.

Keywords:  conceptual specification; workflow; RIF; production rule languages; database integration; mediators; PRD; multidialect infrastructure

AUTOMATION BEYOND WEB 2.0

  • A. Sorokin  IBM EE/A, 10 Presnenskaya Nab., Moscow 123317, Russian Federation

Abstract: This paper introduces a new approach to the analysis of information systems (IS) evolution based on a range of technological activities. The issue centres on the prospect that Web-driven IS will be expanded from business processes to other domains of activity. The classical approach, in which automation eliminates bottlenecks in business processes, does not work under these conditions. Current trends in information technologies (IT) increase the capability for Web integration, which leads to new types of virtual systems that will create a new Web architecture, provisionally named a Web "spiral." The spiral type of integration on the Web, supported by integrated cross-industry solutions, is more promising and effective than the "radial" type. The paper describes this new class of IT systems.

Keywords:  automation; business process reengineering; collaborative software; economies of scale; Internet topology; sociotechnical systems; systems of systems; virtual enterprises; Web 2.0