Informatics and Applications
2018, Volume 12, Issue 1, pp 49-54
INFLUENCE OF PRELIMINARY ESTIMATES ON THE SPEED OF SEARCH OF SIMILARITIES BY THE COUPLING MARKOV CHAIN
Abstract
At present, Data Mining expands usage of statistical machine learning methods. The similarity-based approach uses the probabilistic combinatorial formal method (VKF (variational Kalman filter) method). The main algorithm is based on a coupling Markov chain. The authors propose a mechanism to convert lengths of preliminary trajectories (before coalescence) to an upper bound on which it is necessary to stop excessively long successive runs. The main result claims that the change of probabilities is exponentially small with respect to total variation distance, if the chain uses sufficient number of preliminary runs. This proposal may be useful when there exists a small fraction of long trajectories with respect to the rest, because it provides a balance between the size of the bound and changes of probabilities.
[+] References (11)
- Finn, V. K. 2004. Ob intellektual'nom analize dannykh [On intelligent data analysis]. Novosti iskusstvennogo in- tellekta [Artificial Intelligence News] 3:3-18.
- Ganter, B., and R. Wille. 1999. Formal concept analysis. Berlin: Springer-Verlag. 284 p.
- Kuznetsov, S. 0.1993. Bystryy algoritmpostroeniyavsekh peresecheniy ob"ektov iz nizhney polureshetki [Fast algorithm for computation of all intersections of objects from a low-semilattice]. Nauchnaya i tekhnicheskaya informat- siya. Ser. 2 [Scientific and technical information. Ser. 2] 1:17-20.
- Skvortsov, D. P. 1983. O nekotorykh sposobakh postroeniya logicheskikh yazykov s kvantorami po kortezham [On some ways to construct logical languages with tuples quantifiers]. Semiotika i informatika [Semiotics and Informatics] 20:102-126.
- Vinogradov, D. V. 2000. Formalizing plausible arguments in predicate logic. Automatic Documentation Math. Linguistics 34(6):6-10.
- Vinogradov, D.V. 2015. The probability of encountering an accidental DSM similarity in the presence of counter examples. Automatic Documentation Math. Linguistics 49(2):43-46.
- Vinogradov, D.V. 2014. VKF-method of hypotheses generation. Comm. Comp. Inf. Sc. 436:237-248.
- Davey, B. A., and H. A. Priestley. 2002. Introduction to lattices and order. 2nd ed. Cambridge: Cambridge University Press. 298 p.
- Kuznetsov, S. O. 1989. Interpretatsiya na grafakh i slozh- nostnye kharakteristiki zadach poiska zakonomernostey opredelennogo vida [Graphs interpretation and complexity characteristics of tasks of search for patterns of special form]. Nauchnaya i tekhnicheskaya informatsiya. Ser. 2 [Scientific and technical information. Ser. 2] 1:23-28.
- Vinogradov, D. V. 2012. Random generation of hypotheses in the JSM method using simple Markov chains. Automatic Documentation Math. Linguistics 46(5):221-228.
- Kemeny, J. G., J. L. Snell, and A. W Knapp. 1966. De- numerable Markov chains. Princeton, NJ: Van Nostrand. 439 p.
[+] About this article
Title
INFLUENCE OF PRELIMINARY ESTIMATES ON THE SPEED OF SEARCH OF SIMILARITIES BY THE COUPLING MARKOV CHAIN
Journal
Informatics and Applications
2018, Volume 12, Issue 1, pp 49-54
Cover Date
2018-03-30
DOI
10.14357/19922264180106
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Additional Links
Key words
similarity; Markov chain; VKF candidate; total variation; coupling
Authors
D. V. Vinogradov
Author Affiliations
Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
|