Institute of Informatics Problems of the Russian Academy of Sciences
Russian Academy of Sciences




«Systems and Means of Informatics»
Scientific journal
Volume 25, Issue 1, 2015


Abstract and Keywords.

GRID AND CLOUD SERVICES SIMULATION AS AN IMPORTANT STEP OF THEIR DEVELOPMENT.

  • V. V. Korenkov
  • A. V. Nechaevskiy
  • G. A. Ososkov
  • D. I. Pryakhina
  • V. V. Trofimov
  • A. V. Uzhinskiy

Abstract:  A new system for grid and cloud services simulation is described. It is focused on improving the efficiency of grid-cloud systems development by using work quality indicators of a real system to design and predict its evolution. For these purposes, the simulation program is combined with a real monitoring system of a grid-cloud service through a special database. The simulation principles and their implementation in the SyMSim software package are described. An example of using the program to simulate a general cloud structure is given.

Keywords:  simulation; distributed data storage; cloud computing; Big Data; optimization; monitoring

COMBINING CORPUS AND THESAURUS INFORMATION FOR EXTRACTING SENTIMENT WORDS.

  • N. V. Loukachevitch
  • I. I. Chetviorkin

Abstract:  The paper describes a combined approach to extraction of a domain-specific sentiment lexicon. At the first stage, an initial version of a domain-specific lexicon is obtained by applying a supervised model. At the second stage, the ordered list of sentiment words is refined using thesaurus information. This combined model is applied to several domains, and finally the domain-specific sentiment lexicons are merged to create an improved version of the Russian sentiment lexicon in the generalized domain of products.

Keywords:  sentiment analysis; domain adaptation; natural language processing; thesaurus

MULTICRITERIA METHOD FOR DETECTING NEAR-DUPLICATES IN A STREAM OF TEXT MESSAGES.

  • A. Andreev
  • D. Berezkin
  • I. Kozlov
  • K. Simakov

Abstract:  The problem of near-duplicate detection in a stream of text messages is considered. A model of a text document and a multicriteria duplicate identification method are proposed. The model provides flexible adjustment for different domains. The method is based on binary classification using a support vector machine. The paper also provides a method of candidate prefiltration in order to ensure high efficiency of the approach. Several experiments with data obtained from a stream of news articles were carried out. The results show the feasibility of the suggested approach.

Keywords:  near-duplicate detection; similarity measure; binary classification
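
As an illustration of the two steps named in the abstract above (candidate prefiltration, then a multicriteria comparison of each candidate pair), here is a minimal sketch. It is not the authors' method: the word-shingle document model, the inverted index, the three similarity criteria, and all names (`shingles`, `CandidateIndex`, `pair_features`) are assumptions made for the example. In a setup like the one the abstract describes, a feature vector of this kind would be passed to a trained binary classifier such as a support vector machine.

```python
from collections import defaultdict


def shingles(text, k=2):
    """Set of k-word shingles of a text (hypothetical document model)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class CandidateIndex:
    """Inverted index over shingles, used to prefilter candidate duplicates."""

    def __init__(self, min_shared=1):
        self.index = defaultdict(set)
        self.min_shared = min_shared

    def add(self, doc_id, text):
        for s in shingles(text):
            self.index[s].add(doc_id)

    def candidates(self, text):
        """Ids of stored documents sharing at least min_shared shingles."""
        hits = defaultdict(int)
        for s in shingles(text):
            for doc_id in self.index.get(s, ()):
                hits[doc_id] += 1
        return sorted(d for d, n in hits.items() if n >= self.min_shared)


def pair_features(a, b):
    """Assumed multicriteria feature vector for a pair of messages."""
    len_a, len_b = len(a.split()), len(b.split())
    length_ratio = min(len_a, len_b) / max(len_a, len_b) if max(len_a, len_b) else 1.0
    return [
        jaccard(shingles(a, 1), shingles(b, 1)),  # word overlap
        jaccard(shingles(a, 2), shingles(b, 2)),  # word-pair overlap
        length_ratio,                             # relative length
    ]
```

Only the pairs returned by `candidates` need the full feature computation, which is what keeps the per-message cost low on a stream.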

CONTROL FLOW BASED TEST SUITE GENERATION.

  • N. Voinov
  • P. Drobintsev
  • I. Nikiforov
  • V. Kotlyarov
  • I. Selin

Abstract:  The article describes an approach to test suite generation in accordance with standard structural coverage criteria based on the control flow model. The approach is based on automatic test generation using symbolic verification. Its main advantage is that analysis of control flow data reduces both the number of generated tests and the state space explored by the verification system. The article contains the main ideas of the approach, the formal model of control flow, and the tools for model analysis. The results of piloting the approach in a set of software development projects are also presented.

Keywords:  testing automation; formal model; coverage criterion

ANALYSIS OF UCM-MODEL COVERAGE BY TEST SCENARIOS.

  • N. Voinov
  • P. Drobintsev
  • I. Nikiforov
  • V. Kotlyarov

Abstract:  The article reviews approaches to analysis of UCM-model coverage by test scenarios generated on the basis of integral coverage criteria. Existing criteria for automatic generation of test scenarios from high-level UCM-specifications are reviewed. Two approaches to analysis of UCM-model coverage are proposed: the automatic one, which provides information about covered and uncovered elements, branches, and paths in one view, and the visual one, which allows the user to verify explicitly that a UCM-model is covered by test scenarios. The described approaches are implemented in an analysis tool which significantly reduces the time needed to create a test set which covers a UCM-model. Future plans for improving coverage analysis are also mentioned.

Keywords:  test generation criteria; test scenarios; UCM; specifications; analysis

GENERALIZED TABLE-BASED LL-PARSING.

  • S. V. Grigorev
  • A. K. Ragozina

Abstract:  Syntax analysis is an important step of code analysis. The problem is that the grammars have to be in a form which is deterministic, or at least near-deterministic, for the chosen parsing technique. Generalized parsing algorithms, namely Generalized LR and Generalized LL (GLL), make it possible to remove these restrictions. Abstract analysis makes it possible to parse embedded languages for supporting them in IDE, reengineering tasks, or finding vulnerabilities (SQL injection). Abstract syntax analysis is based on the classic table-based analysis. The generalized algorithm of top-down parsing without the use of predictive tables was described earlier in order to extend the class of languages processed by descent analyzers. This paper describes an approach to creation of a table-based GLL-analyzer based on the proposed algorithm, which will be used later for an abstract analyzer. This article describes the algorithm of generalized top-down analysis, its modifications, and the results of comparison with the generalized bottom-up parsing algorithm, which was implemented earlier.

Keywords:  generalized parsing; GLL; RNGLR; abstract parsing; string-embedded languages

SYNTHESIS OF STABLE LINEAR PUGACHEV FILTERS AND EXTRAPOLATORS FOR STOCHASTIC SYSTEMS WITH WIDE BAND MULTIPLICATIVE NOISES.

  • I. N. Sinitsyn
  • E. R. Korepanov

Abstract:  The article is dedicated to the analytical synthesis of continuous and discrete uniquely asymptotically stable conditionally optimal linear Pugachev filters and extrapolators (LPF and LPE) for stochastic systems (StS) with wide band multiplicative Gaussian noises. It is supposed that observation is part of the state and observation equations. The theorems serving as the basis for the algorithms of synthesis of continuous uniquely asymptotically stable LPF and LPE are proven. Continuous LPF and LPE for StS with wide band Gaussian autocorrelated noises are presented. Discrete LPF and LPE for continuous and discrete StS with wide band multiplicative Gaussian noises are considered. An illustrative example is given. Some generalizations are considered.

Keywords:  accuracy; continuous stochastic system; discrete stochastic system; linear Pugachev extrapolator; linear Pugachev filter; multiplicative noises; Riccati equation; unique asymptotical stability; wide band Gaussian

ON CONVERGENCE OF RANDOM SUMS OF INDEPENDENT RANDOM VECTORS TO MULTIVARIATE GENERALIZED VARIANCE-GAMMA DISTRIBUTIONS.

  • A. Yu. Korchagin

Abstract:  The purpose of this work is to describe the conditions for convergence of the distributions for sums of a random number of independent not necessarily identically distributed multivariate random variables to multivariate normal variance-mean mixtures, in particular, to multivariate generalized variance-gamma distributions.

Keywords:  random sum; multivariate normal variance-mean mixture; multivariate generalized hyperbolic distribution; multivariate generalized variance-gamma distribution; generalized inverse Gaussian distribution; generalized gamma distribution

LARGE CAPACITY OF RAILWAY CARGO TRANSPORTATION FORECASTING.

  • R. K. Gazizullina
  • M. M. Medvednikova
  • V. V. Strijov

Abstract:  The article studies an algorithm for nonparametric forecasting of railway cargo transportation capacity. The problem considered is forecasting the number of wagons with various goods following various routes. The topology of the railway network is given: for all possible pairs of railway lines, information is provided about all blocks of wagons which have moved from one line to another, including the number of wagons in a block, the type of cargo, and the date of the route. The algorithm used for prediction is based on convolution of the empirical density distribution of the time series values with the loss function. Previously, forecasting was carried out for each railway junction separately. It is proposed to improve the quality of forecasting by predicting for pairs of lines instead of predicting the departure of all wagons from a given junction. The algorithm is illustrated by daily data on transportation of 38 types of cargo collected during a year and a half.

Keywords:  forecasting; nonparametric method; railroad station occupancy; loss function; empirical distribution; compression
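
The abstract above mentions a forecast obtained by convolving the empirical density of past values with a loss function. One common reading of that idea, sketched here as an assumption rather than as the authors' algorithm, is to choose the forecast that minimizes the expected loss under the empirical distribution of the history; when the loss depends only on the difference between forecast and outcome, this expected loss is a convolution of the empirical density with the loss. The function names and the integer-valued candidate grid are hypothetical.

```python
from collections import Counter


def empirical_distribution(history):
    """Map each observed value to its relative frequency in the history."""
    counts = Counter(history)
    n = len(history)
    return {value: count / n for value, count in counts.items()}


def forecast(history, loss):
    """Return the integer forecast with the smallest expected loss.

    loss(c, x) is the penalty for forecasting c when x is observed.
    Candidates are all integers between the smallest and largest
    observed values (an assumption suited to wagon counts).
    """
    dist = empirical_distribution(history)

    def expected_loss(c):
        return sum(p * loss(c, x) for x, p in dist.items())

    return min(range(min(history), max(history) + 1), key=expected_loss)


def absolute_loss(c, x):
    return abs(c - x)


def squared_loss(c, x):
    return (c - x) ** 2
```

With the absolute loss the minimizer is a median of the history, and with the squared loss it is the grid point closest to the mean, so the choice of loss function directly controls how strongly rare large shipments pull the forecast.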

SOME APPROACHES TO FORMING THE REGULATORY AND TECHNICAL BASE FOR THE UNIFIED INFORMATION SPACE OF RUSSIA IN THE FIELD OF INFORMATION RESOURCES.

  • A. A. Zatsarinny
  • E. V. Kiselev

Abstract:  The methodical approaches to forming the regulatory and technical base for the unified information space of the Russian Federation (UIS RF) in the field of systematization and interaction of information resources are developed. It is proposed to create two centralized components of the UIS RF federal level as a mega system with their own information resources: the control center and the world information space interaction center. The authors suggest a generalized model of forming and interaction for secured information resources included in the unified information space. The general approach to creation, interaction, and usage of secured information resources at the site of a mega system participant is described as well. There is a participant-generalized model which includes a secured information resources general circuit. The circuit includes three independent circuits for open, confidential, and enclosed information resources. Some issues of design of profiles of open system environments for participants of the unified information space are determined. Finally, a model of the process of creation of an open system environment profile for a participant of the unified information system is suggested on the basis of the Russian guidance R 50.1.041-2002.

Keywords:  secured information resources; UIS RF control center; world information space interaction center; UIS RF secured interaction gateways; All-Russian System of Electronic Interaction; participant's secured information resources circuits; participant's databank; participant's information archive; participant's unified information (information and telecommunication) secured system; participant's open system environment profile design

TECHNOLOGY FOR PREVENTION OF DUPLICATION OF BIBLIOGRAPHIC DESCRIPTIONS IN THE SCIENTIFIC DATABASE BIAS IPI RAS.

  • M. Yu. Zaikin
  • V. S. Dolgopolov
  • O. L. Obuhova
  • I. V. Soloviev

Abstract:  The paper considers a technology developed to avoid duplication of bibliographic descriptions in the scientific database Bibliographic Information-Analytical System (BIAS) of IPI RAS. The causes of duplication are analyzed. The developed software consists of modules that determine similarity using fuzzy search methods based on the Oliver algorithm and modules that visualize the results, which are built into the system at the level of formation of the database content. The visualization modules give moderators of BIAS IPI RAS full information about the conflicts, so that they can decide on further action using additional information. The concept of the similarity index used in the similarity modules is introduced. The paper considers the formal data model underlying construction of the database, built on the principles of facet navigation. Application of the developed software made it possible to detect and remove duplicate bibliographic descriptions in the scientific database.

Keywords: similarity index; software modules of definition of similarity; method of fuzzy search on the Oliver algorithm; facet navigation
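
To make the notion of a similarity index concrete, the sketch below implements string similarity in the style of Oliver's algorithm as it is commonly known from PHP's `similar_text`: find the longest common substring, then recurse on the pieces to its left and right, and sum the matched lengths. The normalization to the range from 0 to 1 and the function names are assumptions for illustration; the abstract does not give the exact definition used in BIAS.

```python
def common_chars(a, b):
    """Number of matched characters: longest common substring plus recursion."""
    if not a or not b:
        return 0
    best_len = best_i = best_j = 0
    # prev[j] holds the length of the common suffix of a[:i-1] and b[:j].
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len = cur[j]
                    best_i, best_j = i - cur[j], j - cur[j]
        prev = cur
    if best_len == 0:
        return 0
    left = common_chars(a[:best_i], b[:best_j])
    right = common_chars(a[best_i + best_len:], b[best_j + best_len:])
    return best_len + left + right


def similarity_index(a, b):
    """Assumed index: share of matched characters, from 0.0 to 1.0."""
    if not a and not b:
        return 1.0
    return 2.0 * common_chars(a, b) / (len(a) + len(b))
```

A moderator-facing check could flag a new bibliographic description when its index against an existing record exceeds a chosen threshold; the threshold itself would have to be tuned on real records.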

CREATION OF PATIENT FLOWS CONTROL SYSTEM.

  • G. Y. Ilushin
  • V. I. Limansky

Abstract:  This article deals with the health care system organizational model for rendering primary medical care to the population and with the hierarchical three-level structure of medical institutions formed during its reform. The existing ways of patient routing are analyzed, including patient booking in medical institutions of the first level (the attached contingent) and the redirection of patients to medical institutions of the second and the third levels by means of coupons and electronic assignments issued by doctors. The advantages and disadvantages of these ways are presented. The electronic control systems of patient flows used for patient routing are examined as well as some problems of their implementation, in particular, implementation of EMIAS in Moscow. The analysis of electronic control systems of patient flows is given. Their main components are defined. The structure of this system based on the principles of a single point of distributed systems components interaction is suggested. The processes taking place in the system during realization of the main precedents are examined. The data flows between components of the system arising during these processes are defined. The paper also deals with approaches to organization of electronic document flow between medical institutions based on application of electronic appointments.

Keywords:  medical information systems (MIS); electronic control systems of patient flows; distributed information systems; sequence diagram

INFORMATION RESOURCES FOR CONTRASTIVE STUDIES:
TYPOLOGICAL DATABASES.

  • M. G. Kruzhkov

Abstract:  This article presents information resources used in contrastive linguistic studies and their principal features. There are two main types of such information resources: typological databases and electronic text corpora. The attention is focused on typological databases, which can be subdivided into general-purpose typological databases and specialized typological databases. General-purpose typological databases act as repositories of a wide range of data on a wide variety of languages. They may be utilized as reference resources and may also be helpful while dealing with language classification problems. Specialized typological databases are used for closer investigation of specific language phenomena in restricted sets of languages. They supply detailed models of such phenomena and include examples to illustrate their functioning in the considered languages. The paper also looks into problems related to development of typological databases and to integration of data from heterogeneous typological databases.

Keywords:  contrastive linguistic studies; databases; typological databases; electronic text corpora

ABOUT ACADEMICIAN I. A. MIZIN’S CONTRIBUTION TO THEORY AND PRACTICE OF DOMESTIC INFORMATION-TELECOMMUNICATION SYSTEMS CREATION:
TO THE 80th ANNIVERSARY.

  • I. A. Sokolov
  • A. A. Zatsarinny
  • V. N. Zakharov

Abstract:  The article is devoted to the 80th anniversary of academician I. A. Mizin, head of IPI RAS from 1989 to 1999, an outstanding scientist, designer, and engineer. A brief biography concerning his scientific work is presented. The scientific and practical contribution of I. A. Mizin to the theory and its applications in creation of domestic information-telecommunication systems (ITS) is considered in three directions of his work. The first direction is development and implementation of the data communication system for the Armed Forces ACS (automated control system), which was the first domestic network with packet commutation. The second direction is justification of information technologies for creation of the huge territorial system of data communication in the regions of Russia. The third direction is creation and development of information networks for government purposes with information protection requirements.

Keywords:  chief designer; academician; information telecommunication networks; information technologies; data communication system; packet commutation; methods of data communication; information protection; link channels; trial-run; government