Informatics and Applications

2018, Volume 12, Issue 1, pp 49-54

INFLUENCE OF PRELIMINARY ESTIMATES ON THE SPEED OF SEARCH OF SIMILARITIES BY THE COUPLING MARKOV CHAIN

  • D. V. Vinogradov

Abstract

At present, Data Mining expands usage of statistical machine learning methods. The similarity-based approach uses the probabilistic combinatorial formal method (VKF (variational Kalman filter) method). The main algorithm is based on a coupling Markov chain. The authors propose a mechanism to convert lengths of preliminary trajectories (before coalescence) to an upper bound on which it is necessary to stop excessively long successive runs. The main result claims that the change of probabilities is exponentially small with respect to total variation distance, if the chain uses sufficient number of preliminary runs. This proposal may be useful when there exists a small fraction of long trajectories with respect to the rest, because it provides a balance between the size of the bound and changes of probabilities.

[+] References (11)

[+] About this article