Abstract
This is the first paper to study the first-arrival CTMDP (continuous time Markovian decision programming) model, that is, the first arrival at a target set, without any discount factor. It gives conditions, none of them too strong, that establish the foundation of the problem. It then studies the existence and form of the solution to the optimality equation, and finally presents a method for searching for the optimal policy. These results are important for both the theory and the application of Markovian decision programming.
Cite this article
Wang, P. The first arrival model of continuous time Markovian decision programming — The discounted rate is 0. Japan J. Indust. Appl. Math. 16, 423–430 (1999). https://doi.org/10.1007/BF03167366