Informatics and Applications
2018, Volume 12, Issue 3, pp 2-13
FINDING CONTROL POLICY FOR ONE DISCRETE-TIME MARKOV CHAIN ON [0,1] WITH A GIVEN INVARIANT MEASURE
- M. G. Konovalov
- R. V. Razumchik
Abstract
A discrete-time Markov chain on the interval [0,1] with two possible transitions (left or right) at each step is considered. The probability of a transition towards 0 (and, complementarily, towards 1) is a function of the current value of the chain. Having chosen the direction, the chain moves to a randomly chosen point in the corresponding interval.
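The dynamics described above can be illustrated with a minimal simulation sketch. It rests on assumptions that the abstract does not state: the new point is drawn uniformly from the chosen interval, and the names `simulate_chain` and `p_left` are hypothetical, introduced only for this illustration.

```python
import random


def simulate_chain(p_left, n_steps, x0=0.5, seed=0):
    """Simulate a left/right Markov chain on [0, 1].

    p_left(x) is the probability of moving towards 0 from state x.
    ASSUMPTION: the next point is uniform on [0, x] (left move) or
    on [x, 1] (right move); the abstract only says "randomly chosen".
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(n_steps):
        if rng.random() < p_left(x):
            x = rng.uniform(0.0, x)   # move towards 0
        else:
            x = rng.uniform(x, 1.0)   # move towards 1
        path.append(x)
    return path


# Example: the probability of moving left equals the current state.
path = simulate_chain(lambda x: x, n_steps=10_000)
```

Any function with values in [0,1] can be passed as `p_left`; the uniform jump is the simplest choice and can be replaced by another distribution on the same interval.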
The authors assume that the transition probabilities depend on the current value of the chain only through a finite number of real-valued parameters. Under this assumption, they seek the transition probabilities that minimize the L2 distance between the stationary density of the Markov chain and the given invariant measure on [0,1]. Since there is no reward function in this problem, it does not fit into the MDP (Markov decision process) framework. The authors follow the sensitivity-based approach and propose a gradient- and simulation-based method for estimating the parameters of the transition probabilities. Numerical results are presented that show the performance of the method for various transition probabilities and invariant measures on [0,1].
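To make the objective concrete, the sketch below estimates the squared L2 distance between a density estimated from simulated states and a target density. This is only one possible estimator, not the authors' method: the histogram estimate, the midpoint quadrature, and the names `l2_distance_sq`, `target_density`, and `n_bins` are assumptions made for illustration.

```python
def l2_distance_sq(samples, target_density, n_bins=50):
    """Approximate the squared L2 distance on [0, 1] between a
    histogram density estimate of `samples` and `target_density`.

    ASSUMPTION: a fixed-width histogram stands in for the stationary
    density, and the integral is approximated by the midpoint rule.
    """
    width = 1.0 / n_bins
    counts = [0] * n_bins
    for x in samples:
        # Clamp so that x == 1.0 falls into the last bin.
        counts[min(int(x * n_bins), n_bins - 1)] += 1
    n = len(samples)
    total = 0.0
    for i, c in enumerate(counts):
        estimate = c / (n * width)      # histogram density on bin i
        midpoint = (i + 0.5) * width    # where the target is evaluated
        total += (estimate - target_density(midpoint)) ** 2 * width
    return total


# Evenly spread points compared with the uniform density on [0, 1].
grid = [(i + 0.5) / 1000 for i in range(1000)]
d_uniform = l2_distance_sq(grid, lambda x: 1.0)
d_linear = l2_distance_sq(grid, lambda x: 2.0 * x)
```

In a parameter search, a quantity of this kind would be evaluated on states simulated under each candidate parameter vector, and the parameters adjusted in the direction that decreases it.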
About this article
Cover Date
2018-08-30
DOI
10.14357/19922264180301
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
Markov chain; control; continuous state space; sensitivity-based approach; derivative estimation
Authors
M. G. Konovalov and R. V. Razumchik
Author Affiliations
Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
Peoples' Friendship University of Russia (RUDN University), 6 Miklukho-Maklaya Str., Moscow 117198, Russian Federation