Informatics and Applications scientific journal
Volume 13, Issue 3, 2019
METHODS OF IDENTIFICATION OF "WEAK" SIGNS OF VIOLATIONS OF INFORMATION SECURITY
Abstract: A new approach to identifying "weak" signs of information security violations is proposed. The initial information for identifying "weak" signs of violations committed by an insider-malefactor is the set of observed potential goals of that insider. The emergence of new valuable information in which the insider-malefactor is interested will cause a behavioral reaction of the insider in some information spaces. The purpose of this work is to develop methods for finding such reactions in various information spaces. A probabilistic model of an insider-malefactor's reaction to the repeated emergence of a goal is constructed.
Keywords: information security; information spaces; behavioral signs of a violator of information security
ON THE ASYMPTOTICS OF CLUSTERING COEFFICIENT IN A CONFIGURATION GRAPH WITH UNKNOWN DISTRIBUTION OF VERTEX DEGREES
Abstract: The author considers configuration graphs whose vertex degrees are independent identically distributed random variables. The degree of each vertex equals the number of incident half-edges, which are numbered in an arbitrary order. The graph is constructed by joining each half-edge to another one, chosen equiprobably, to form edges. Configuration graphs are widely used for modeling complex communication networks such as the Internet and social, transport, and telephone networks. The distribution of vertex degrees may be unknown. It is only assumed that this distribution either has a finite variance or satisfies some sufficiently weak constraints on the asymptotic behavior of its tail. The notion of the clustering coefficient and its properties in such graphs are discussed. The author proves a limit theorem for the clustering coefficient as the number of vertices tends to infinity. Conditions under which this coefficient increases indefinitely are found.
Keywords: random graphs; configuration graphs; clustering coefficient; limit theorems
ON THE BOUNDS OF THE RATE OF CONVERGENCE FOR SOME QUEUEING MODELS WITH INCOMPLETELY DEFINED INTENSITIES
Abstract: The authors consider some queuing systems with incompletely defined 1-periodic intensities under corresponding conditions. The systems studied are the Mt/Mt/S queue for any number of servers S and the Mt/Mt/S/S queue (the Erlang model). Estimates of the rate of convergence in the weakly ergodic situation are obtained by applying the method of the logarithmic norm of a linear operator function. Examples with exactly specified intensities and different variations of amplitude and frequency are considered; ergodicity conditions and estimates of the rate of convergence are obtained for each model; and plots are constructed that show how the amplitude and frequency of the intensity of incoming requests affect the limiting characteristics of the process. To build the plots, the authors use a general algorithm from their previous papers, which relies on solving the Cauchy problem for the forward Kolmogorov system on the corresponding interval.
Keywords: queuing systems; incompletely defined intensities; rate of convergence; ergodicity; logarithmic norm; Mt/Mt/S queue; Mt/Mt/S/S queue
NONTRANSITIVE TRIPLETS OF CONTINUOUS RANDOM VARIABLES AND THEIR APPLICATIONS
Abstract: The phenomenon of nontransitivity of the stochastic precedence relation for three independent random variables with distributions from some classes of continuous distributions is studied. Initially, this question was posed in connection with an application in strength theory. In paired comparisons of iron bars from three factories, a paradoxical situation may arise in which the bars from the first factory are "worse" than those from the second, the bars from the second are "worse" than those from the third, and the bars from the third are "worse" than those from the first. Later, the topic of nontransitivity gained popularity through the example of the so-called nontransitive dice; however, this narrowed it down to discrete random variables with finite sets of values. The paper shows that for mixtures of normal and exponential distributions, nontransitivity is possible in a wide range of parameters. Specific features of the mutual arrangement of the graphs of the distribution functions in these cases are indicated.
Keywords: nontransitivity; nontransitive dice; stochastic precedence; continuous distributions; mixtures of distributions
A PRIORI GENERALIZED GAMMA DISTRIBUTION IN BAYESIAN BALANCE MODELS
Abstract: The work is devoted to the study of Bayesian balance models, in which the parameters of a system are divided into two classes: positive factors, which support the functioning of the system, and negative factors, which interfere with it. The balance index, defined as the ratio of the negative factor to the positive factor, is considered.
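As a rough illustration of the balance index described in this abstract, the following Python sketch simulates the ratio of a negative factor to a positive factor by Monte Carlo. It is not the authors' method. It assumes that the two factors are independent and that each has a generalized gamma prior (suggested by the article title); the function names and all parameter values are hypothetical.

```python
import random
import statistics


def sample_generalized_gamma(rng, scale, d, p):
    """Draw one value from a generalized gamma distribution.

    Uses the standard transform: if Y ~ Gamma(shape=d/p, scale=1),
    then scale * Y**(1/p) has density proportional to
    x**(d - 1) * exp(-(x / scale)**p) for x > 0.
    """
    y = rng.gammavariate(d / p, 1.0)
    return scale * y ** (1.0 / p)


def simulate_balance_index(n, neg_params, pos_params, seed=0):
    """Simulate n values of the balance index (negative / positive factor).

    neg_params and pos_params are (scale, d, p) tuples; independence of
    the two factors is an assumption of this sketch.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(n):
        negative = sample_generalized_gamma(rng, *neg_params)
        positive = sample_generalized_gamma(rng, *pos_params)
        ratios.append(negative / positive)
    return ratios


if __name__ == "__main__":
    # Illustrative parameters only: with p = 1 both factors are ordinary
    # gamma variables, so the ratio follows a beta prime distribution
    # with mean 2 / (5 - 1) = 0.5.
    sample = simulate_balance_index(20000, (1.0, 2.0, 1.0), (1.0, 5.0, 1.0))
    print("mean balance index:", statistics.fmean(sample))
    print("share of runs with index > 1:", sum(r > 1 for r in sample) / len(sample))
```

An index above 1 corresponds to the negative factor outweighing the positive one, so the last line estimates how often that happens under the assumed priors.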
Keywords: Bayesian approach; generalized gamma distribution; gamma-exponential function; balance models; mixed distributions
HYBRID EXTREME GRADIENT BOOSTING MODELS TO IMPUTE THE MISSING DATA IN PRECIPITATION RECORDS
Abstract: The article compares the classical method of extreme gradient boosting implemented in the XGBoost (eXtreme Gradient Boosting) framework with the newer modification CatBoost (Categorical Boosting), which is rarely used in scientific research. Several hybrid classification-regression models are proposed to improve the accuracy of imputing missing values in real data from 14 meteorological stations in Germany. The achieved classification accuracy is up to 92%, and the root-mean-square errors are quite moderate. The hybrid methods outperformed both simple classification and regression models in prediction accuracy. The proposed approaches can be used for meteorological data analysis by machine learning methods as well as for improving the forecasting accuracy of physical models of atmospheric processes.
Keywords: data imputation; precipitation; classification; regression; gradient boosting; XGBoost; CatBoost
STOCHASTIC DIFFERENTIAL SYSTEM OUTPUT CONTROL BY THE QUADRATIC CRITERION. III. OPTIMAL CONTROL PROPERTIES ANALYSIS
Abstract: The investigation of the optimal control problem for the Ito diffusion process and linear controlled output with a quadratic quality criterion is continued. The properties of the optimal solution defined by the Bellman function of the form $V_t(y, z) = \alpha_t z^2 + \beta_t(y) z + \gamma_t(y)$, whose coefficients $\beta_t(y)$ and $\gamma_t(y)$ are described by linear parabolic equations, are studied. For these coefficients, alternative equivalent descriptions are given in the form of stochastic differential equations and a probability-theoretic representation of their solutions, known as the Kolmogorov equation. It is shown that the obtained differential representation is equivalent to the Feynman-Kac integral formula. In the future, the obtained description of the coefficients and, as a result, of the solutions of the original control problem can be used to implement an alternative numerical method that calculates them by computer simulation of the solution of a stochastic differential equation.
Keywords: stochastic differential equation; optimal control; Bellman function; linear differential equations of parabolic type; Kolmogorov equation; Feynman-Kac formula
ON THE SOLUTION OF THE OPTIMAL CONTROL PROBLEM OF INVENTORY OF A DISCRETE PRODUCT IN THE STOCHASTIC MODEL OF REGENERATION WITH CONTINUOUSLY OCCURING CONSUMPTION
Abstract: The article is the second and final part of a study of the optimal control problem for the inventory of a discrete product in a stochastic regeneration model. The main content of the work is the derivation of analytical representations for the mathematical expectation of the increment of the profit functional over the regeneration period. These expectations are determined under different conditions on the decisions made during the regeneration period. The obtained analytical representations make it possible to determine explicitly the stationary cost indicator of control efficiency, which was introduced in the first part of the study.
Keywords: inventory management of a discrete product; controlled regenerative process; stationary cost indicator of control efficiency
EVALUATION OF THE SIGNIFICANCE LEVEL IN SCHUIRMANN'S TEST FOR CHECKING THE BIOEQUIVALENCE HYPOTHESIS IN MISSING DATA CONDITIONS
Abstract: Testing the bioequivalence hypothesis is an important task in pharmacokinetics. It helps to decide whether a reproduced drug is equivalent to the reference drug. One of the problems of bioequivalence studies is the presence of missing data. Because the amount of data is small, observations with missing values cannot simply be deleted. It is therefore necessary to estimate the impact of missing data on bioequivalence testing and, in particular, on the significance level. The main method of testing the bioequivalence hypothesis is Schuirmann's two one-sided tests procedure. The article evaluates the significance level of this procedure in the case of missing data. The component of the evaluation that depends on the level of data completeness is given in explicit form.
Keywords: bioequivalence; significance level; type I error; missing data; Schuirmann's two one-sided tests procedure
FORMALIZATION OF THE ALTERNATIVES RANKING METHOD FOR GROUP DECISION MAKING IN SOCIAL NETWORKS
Abstract: The expansion and accessibility of Internet technology have prompted a new look at social networks. A few decades ago, such online services were mostly for entertainment. Today, however, with increasing transmission rates and the possibility of real-time communication, social networks, on which a poll or vote can easily be organized, have become a powerful mechanism for achieving consensus in the decision-making process. The paper offers an overview of the known models of group decision making (GDM) and a formal description of a decision-making algorithm that was developed on the basis of this overview and takes into account the large amount of data in a social network.
Keywords: group decision making; social network analysis; fuzzy logic; LTS
PERFORMANCE ESTIMATIONS FOR OPTIMAL-ON-CC-VaR PORTFOLIOS IN OPTION MARKETS
Abstract: The paper continues the author's investigation of the use of the continuous VaR criterion (CC-VaR) in financial markets. It considers the problem of projecting ideas and methods elaborated for investments in the ideal theoretical one-period market and its discrete scenario analog onto a discrete-in-strikes option market. The main focus is on methods for calculating the distribution functions of income and return relative, as well as their means, for option portfolios that are optimal on CC-VaR and for their randomized versions, both full and partial. A discrete optimization algorithm is suggested that results from projecting the theoretical algorithm based on the Neyman-Pearson procedure onto the scenario market. The optimal vector of weights derived from this algorithm is applied to the basis of normalized simplest butterflies. If randomized portfolios are admissible, special algorithms based on the ideas of the Monte Carlo method are suggested for determining the distribution functions of income and return relative.
Keywords: continuous VaR-criterion (CC-VaR); investor's risk-preferences function (r.p.f.); Neyman-Pearson procedure; scenarios; options; indicators; butterflies; full and partial randomizing; optimal portfolio; income; yield
THIRD-ORDER INTERFACES IN INFORMATICS
Abstract: The European strategy "Informatics for All," formally launched in Brussels in March 2018, distinguishes two tiers of teaching informatics in the system of secondary and higher education. The second tier, focused on the study of information transformations in artificial, living, and social systems, involves choosing an informatics paradigm for teaching and then developing it. Development is needed for two reasons: first, a significant expansion of the scope of applications of information technologies considered in educational processes, and second, the integration of methods and tools of informatics into curricula in other areas of knowledge, which expands the range of information transformations. In the absence of a dominant informatics paradigm and given several variants of it, the choice of a paradigm as the starting point of development is disputable. The "Informatics education in Europe" review, published in 2017 and preceding the development of the European strategy "Informatics for All," lists three paradigm variants, including the positioning of informatics as the fourth great domain of science, proposed by Denning and Rosenbloom in 2009. A detailed description of this variant under the name of polyadic computing was given by Rosenbloom in his 2013 book. The goal of the paper is to define a new concept of "third-order interface" based on the one-natured division of the domain of informatics as polyadic computing. The relevance of the concept is illustrated by the example of robotic arm control using brain-computer interfaces.
Keywords: third-order interface; polyadic computing; one-natured media of informatics domain; information transformations
ARCHITECTURE OF A MACHINE TRANSLATION SYSTEM
Abstract: The paper describes the architecture of a neural machine translation (NMT) system. The subject is brought up because NMT, i.e., translation using artificial neural networks, is now the leading machine translation paradigm.
Keywords: neural machine translation; artificial neural networks; recurrent neural networks; attention mechanism; architecture of a machine translation system; Google's Neural Machine Translation system
METHODS FOR IDENTIFICATION OF IMPLICIT LOGICAL-SEMANTIC RELATIONS IN TEXTS
Abstract: The paper presents methods for identifying implicit logical-semantic relations (LSRs) in parallel texts of the Supracorpora Database (SCDB) of Connectives. The stages of the search process are described using Russian-French translations: (i) selection of an LSR to be analyzed and creation of an array of annotations of the Russian connectives considered as prototypical means of expressing this LSR; (ii) analysis of the produced array of annotations and identification of common equivalents used to translate the Russian connectives into French; (iii) use of the bilingual search functions of the SCDB, excluding the Russian connectives annotated during the first stage and specifying the most frequent French language units identified during the second stage; (iv) annotation of the pairs of fragments of parallel texts found as a result of the third stage; and (v) analysis of the array of annotations produced at the fourth stage in order to identify and categorize instances of implicit LSRs. The proposed SCDB-based search methods make it possible to gather new data on implicit LSRs.
Keywords: identifying implicit information; connectives; contrastive linguistics; corpus linguistics; supracorpora databases; logical-semantic relations
PERSONAL COGNITIVE ASSISTANT: CONCEPT AND KEY PRINCIPALS
Abstract: The paper proposes the concept of a cognitive personal assistant. The cognitive assistant is a virtual intelligent agent that has its own sign-based world model and builds a world model of the user, whom it helps to solve various problems. The architecture of the cognitive assistant is described, the main functions that it should implement are considered, and the main methods and technologies used in the construction of such assistants are presented. Two subject areas in which the use of cognitive assistants is the most promising are considered.
Keywords: cognitive assistant; educational assistant; medical assistant; sign-based worldview; natural language processing; script; dialog system; planning
APPLICATION OF RECURRENT NEURAL NETWORKS TO FORECASTING THE MOMENTS OF FINITE NORMAL MIXTURES
Abstract: The article compares the application of feedforward and recurrent neural networks to forecasting continuous values of the expectation, variance, skewness, and kurtosis of finite normal mixtures. Fourteen neural network architectures are considered. To increase training speed, a high-performance computing cluster is used. It is demonstrated that the best forecasting results according to standard metrics (root-mean-square error, mean absolute error, and loss function) are achieved by two LSTM (Long Short-Term Memory) networks: one with 100 neurons in a single hidden layer and one with 50 neurons in each of three hidden layers.
Keywords: recurrent neural networks; forecasting; deep learning; high-performance computing; CUDA
METHODS OF MODELING AND VISUAL REPRESENTATION OF A CONFLICT IN A SMALL COLLECTIVE OF EXPERTS SOLVING PROBLEMS (REVIEW)
Abstract: Small collectives of experts, acting as a natural collective decision-support intellect (a heterogeneous collective), solve problems effectively. In addition, interaction between experts in the form of conflict generates positive changes in the collective, such as development of the group, diagnostics of relations, tension reduction, and consolidation of the group, and helps to preserve the collective. Preset sketches of standard situations play a huge role in human reasoning.
Keywords: small collective of experts; conflict; model of a conflict; visualization of a conflict
DEVELOPMENT OF A METHOD FOR THE FORMATION OF ATTRIBUTE SPACE AND A MODEL FOR THE ASSESSMENT AND PREDICTION OF ANTHROPOGENIC INFLUENCE ON THE ENVIRONMENT (ON THE EXAMPLE OF THE FOREST FUND OF THE OIL-PRODUCING REGION)
Abstract: The work is devoted to the development of a systematic method for assessing and predicting the influence of natural and anthropogenic impacts on the environment, including procedures for transforming the initial data store, forming the neural network model, and training and testing it. The method is used to analyze the consequences of anthropogenic impacts on the environment in the Khanty-Mansiysk Autonomous Okrug - Yugra.
Keywords: data analysis; machine learning; neural networks; spatial analysis; geographic information systems; risk-based approach; control and supervision
THE SCIENTIFIC RESULT AS THE INFORMATION OBJECT IN THE CONTEXT OF THE SCIENTIFIC SERVICES SYSTEM MANAGEMENT
Abstract: The article discusses the problem of formalization of one of the most important concepts for the system of scientific services - the scientific result, which is the quintessence of all scientific research and the main information object for the processes of formulation of scientific problems, formulation of scientific hypotheses, and monitoring of scientific and technical information (STI). The formalized concept of "scientific result" which integrates the entire information structure of scientific research is used in the organization of a significant number of scientific services: services of extraction of facts and knowledge (extraction of facts, concepts, connections, and formalization of factual data based on linguistic analysis of semistructured information); intelligent information search; thematic indexing; analysis of the front of research; communication services (services of work with STI and communication in the scientific community; intellectual analysis of specialized social networks and other means of scientific communication; and verification of borrowings). This is especially true for interdisciplinary research providing search and selection of relevant scientific provisions and tools in related fields of scientific knowledge.
Keywords: scientific result; information model; scientific services; identification algorithms; intellectual search; interdisciplinary research
Phone of the Center: +7 (499) 135-62-60
E-mail of the Center: ipiran@ipiran.ru