Applied Mathematics & Optimization, Volume 75, Issue 2, pp 317–341

Constrained Continuous-Time Markov Decision Processes on the Finite Horizon

Abstract

This paper studies constrained (nonhomogeneous) continuous-time Markov decision processes on a finite horizon. The performance criterion to be optimized is the expected total reward on the finite horizon, while N constraints are imposed on similar expected costs. Introducing an appropriate notion of occupation measures for this optimal control problem, we establish the following under suitable conditions: (a) the class of Markov policies is sufficient; (b) every extreme point of the space of performance vectors is generated by a deterministic Markov policy; and (c) there exists an optimal Markov policy that is a mixture of no more than \(N+1\) deterministic Markov policies.
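In schematic form, and using illustrative notation that is not taken from the paper itself (the reward rate \(r\), cost rates \(c_n\), constraint levels \(d_n\), controlled process \(\xi_t\), and actions \(a_t\) are placeholders), the constrained problem described above can be sketched as

\[
\max_{\pi}\; \mathbb{E}^{\pi}\!\left[\int_0^T r(t,\xi_t,a_t)\,dt\right]
\quad\text{subject to}\quad
\mathbb{E}^{\pi}\!\left[\int_0^T c_n(t,\xi_t,a_t)\,dt\right]\le d_n,\qquad n=1,\dots,N,
\]

and result (c) asserts the existence of an optimal policy of the form

\[
\pi^{*}=\sum_{k=1}^{N+1}\lambda_k\,\varphi_k,\qquad \lambda_k\ge 0,\quad \sum_{k=1}^{N+1}\lambda_k=1,
\]

where each \(\varphi_k\) is a deterministic Markov policy and the mixture is understood in the usual sense: one of the \(\varphi_k\) is selected at random with probability \(\lambda_k\) before the process starts and is then followed throughout the horizon.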

Keywords

Continuous-time Markov decision process · Constrained optimality · Finite horizon · Mixture of N + 1 deterministic Markov policies · Occupation measure

Mathematics Subject Classification

90C40 · 60J27

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. School of Mathematics and Computational Science, Sun Yat-Sen University, Guangzhou, China
  2. Department of Mathematical Sciences, University of Liverpool, Liverpool, UK