Informatics and Applications
2022, Volume 16, Issue 2, pp 109-117
CONTROLLING A BOUNDED TWO-DIMENSIONAL MARKOV CHAIN WITH A GIVEN INVARIANT MEASURE
- M. G. Konovalov
- R. V. Razumchik
Abstract
Consideration is given to the two-dimensional discrete-time Markov chain (random walk) with the bounded continuous state space (rectangle). Upon each transition, depending on its current position and if not on the boundary, the chain moves in one of four possible directions (north, south, east, or west). Having selected a direction, the length of the jump within the admissible interval is determined by the random variable. Assuming that some (reference) distribution on the state space is given, one seeks to solve the inverse control problem, i. e., to find such a control strategy (probabilities of choosing either direction) which brings the stationary distribution of the chain close (in a certain sense) to the reference distribution. The solution based on the policy gradient method is proposed. Illustrative examples are provided.
[+] References (29)
- Konovalov, M., and R. Razumchik. 2018. Upravlenie sluchaynym bluzhdaniem s etalonnym statsionarnym raspredeleniem [Finding control policy for one discretetime Markov chain on [0,1] with a given invariant measure]. Informatika i ee Primeneniya - Inform. Appl. 12(3):2-13.
- Konovalov, M. G., I. N. Konovalova, and R. V. Razumchik. 2020. Upravlenie dvumernoy markovskoy tsep'yu s nepreryvnym ogranichennym mnozhestvom sostoyaniy, privodyashchee k zadannomu statsionarnomu raspredeleniyu [Control of a two-dimensional Markov chain with a continuous bounded set of states leading to a given stationary distribution]. Informatsionno- telekommunikatsionnye tekhnologii i matematicheskoe mod- elirovanie vysokotekhnologichnykhsistem [Informationand telecommunication technologies and mathematical modeling of high-tech systems]. Moscow: RUDN University Publs. 276-279.
- Elmaliach, Y., A. Noa, and G. Kaminka. 2009. Multi-robot area patrol under frequency constraints. Ann. Math. Artif. Intel. 57(3-4):293-320.
- Elor, Y., and A. Bruckstein. 2010. Autonomous multi-agent cycle based patrolling. Swarm intelligence. Eds.
M. Dorigo, M. Birattari, G. A. Di Caro, et al. Lecture notes in computer science ser. Springer. 6234:119-130.
- LaValle, S., and J. Hinrichsen. 2001. Visibility-based pursuit-evasion: The case of curved environments. IEEE T. Robotic. Autom. 17(2):196-202.
- Gerkey, B. P., S. Thrun, and G. Gordon. 2006. Visibility- based pursuit-evasion with limited field of view. Int. J. Robot. Res. 25(4):299-315.
- Fazli, P., A. Davoodi, P. Pasquier, and A. K. Mackworth.
2010. Complete and robust cooperative robot area coverage with limited range. Conference (International) on Intelligent Robots and Systems Proceedings. Piscataway, NJ: IEEE. 5577-5582.
- Jennings, J., G. Whelan, and W Evans. 1997. Cooperative search and rescue with a team of mobile robots. 8th Conference (International) on Advanced Robotics Proceedings. Piscataway, NJ: IEEE. 193-200.
- Fazli, P., A. Davoodi, and A. K. Mackworth. 2012. Multirobot repeated area coverage: Performance optimization under various visual ranges. 9th Conference on Computer and Robot Vision Proceedings. Piscataway, NJ: IEEE. 298-305.
- Carlsson, S., B. J. Nilsson, and S.C. Ntafos. 1993 Optimum guard covers and m-watchmen routes for restricted polygons. Int. J. Comput. Geom. Ap. 3(1):85-105.
- Toth, P., and D. Vigo. 2002. The vehicle routing problem. Philadelphia, PA: Society for Industrial Mathematics. 358 p.
- Machado, A., G. Ramalho, J. D. Zucker, and A. Drogoul. 2003. Multiagent patrolling: An empirical analysis of alternative architectures. Multi-agent-based simulation II. Eds. J. Simao Sichman, F Bousquet, and P. Davidsson. Lecture notes in computer science ser. Springer. 2158:155-170.
- Gasparri, A., B. Krishnamachari, and G. Sukhatme. 2008. A framework for multi-robot node coverage in sensor networks. Ann. Math. Artif. Intel. 52(2):281-305.
- Kalyaev, I. A., A. R. Gayduk, and S. G. Kapustyan. 2009. Modeli i algoritmy kollektivnogo upravleniya v gruppakh robotov [Models and algorithms of collective control in groups of robots]. Moscow: Fizmatlit. 280 p.
- Birk, A., B. Wiggerich, H. Bulow, M. Pfingsthorn, and
S. Schwertfeger. 2011. Safety, security, and rescue missions with an unmanned aerial vehicle (UAV). J. Intell. Robot. Syst. 64(1):57-76.
- Diveev, A.I., E.Yu. Shmalko, and D.A. Ryndin. 2017. Reshenie zadachi optimal'nogo upravleniya gruppoy robotov evolyutsionnymi algoritmami [Solution of optimal control problem for group of robots by evolutionary algorithms]. Informatsionnye imatematicheskie tekhnologii v nauke i upravlenii [Information and Mathematical Technologies in Science and Management] 3(7):109-126.
- Pshikhopov, V. Kh., and M. Yu. Medvedev. 2018. Gruppovoe upravlenie dvizheniem mobil'nykh robotov v neopredelennoy srede s ispol'zovaniem neustoychivykh rezhimov [Group control of autonomous robots motion in uncertain environment via unstable modes]. Trudy SPIIRAN[SPIIRAS Proceedings] 60:39-63.
- Arbanas, B., A. Ivanovic, M. Car, M. Orsag, T. Petrovic, and S. Bogdan. 2018. Decentralized planning and control for UAV-UGV cooperative teams. Auton. Robot. 42(8):1601-1618.
- Senotov, V. D., A. P. Aliseychik, E.V. Pavlovsky, A. V. Podoprosvetov, and I. A. Orlov. 2020. Algoritmy staynogo detsentralizovannogo upravleniya dvizheniem gruppy robotov s differentsial'nym privodom [Algorithms for swarm decentralized motion control ofgroup ofrobots with a differential drive]. Keldysh Institute Preprints 123. 39 p.
- Kibzun, A.I., and I. N. Sinitsyn. 2020. Sovremennye problemy teorii optimizatsii stokhasticheskikh sistem [Modern problems of optimization theory stochastic systems]. Automat. Rem. Contr. 11:3-10.
- Portugal, D., andR. Rocha. 2010. MSP algorithm: Multirobot patrolling based on territory allocation using balanced graph partitioning. Symposium on Applied Computing Proceedings. New York, NY: ACM. 1271-1276.
- Reif, J., and H. Wang. 1999. Social potential fields: A distributed behavioral control for autonomous robots. Robot. Auton. Syst. 27(3):171-194.
- Chu, H. N., A. Glad, O. Simonin, F. Sempe, A. Drogoul, and F. Charpillet. 2007. Swarm approaches for the patrolling problem, information propagation vs. pheromone evaporation. 19th Conference (International) on Tools with Artificial Intelligence Proceedings. Alamitos, CA: IEEE. 1:442-449.
- Shvets, E.A., and D. P. Nikolaev. 2015. Complex approach to long-term multi-agent mapping in low dynamic environments. Proc. SPIE 9875:98752A. 10 p. doi: 10.1117/12.2228708.
- Miller, B. M., K. V. Stepanyan, A. K. Popov, and A. B. Miller. 2017. UAV navigation based on videose-quences captured by the onboard video camera. Automat. Rem. Contr. 17:2211-2221.
- Ramli, M. A., and G. Leng. 2010. The stationary probability density of a class of bounded Markov processes. Adv. Appl. Probab. 42:986-993.
- Dotsenko, A., A. Diveev, and J. P. C. Cevallos. 2019. Collision avoidance at swarm regrouping using modified network operator method with various number of arguments. 14th Conference on Industrial Electronics and Applications Proceedings. Piscataway, NJ: IEEE. 768-773.
- Konovalov, M., and R. Razumchik. 2021. Dispetcherizatsiya v sisteme s parallel'nym obsluzhivaniem s pomoshch'yu raspredelennogo gradientnogo upravleniya markovskoy tsep'yu [Routing jobs to heterogeneous parallel queues using distributed policy gradient algorithm]. Informatika i ee Primeneniya - Inform. Appl. 15(3):41-50.
- Konovalov, M. G. 2007. Metody adaptivnoy obrabotki informatsii i ikh prilozheniya [Methods of adaptive information processing and their applications]. Moscow: IPI RAN. 212 p.
[+] About this article
Title
CONTROLLING A BOUNDED TWO-DIMENSIONAL MARKOV CHAIN WITH A GIVEN INVARIANT MEASURE
Journal
Informatics and Applications
2022, Volume 16, Issue 2, pp 109-117
Cover Date
2022-07-25
DOI
10.14357/19922264220214
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Additional Links
Key words
Markov chain control; continuous state space; policy gradient; unmanned air vehicles
Authors
M. G. Konovalov and R. V. Razumchik
Author Affiliations
Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
|