Optimality of Deterministic Policies for Certain Stochastic Control Problems with Multiple Criteria and Constraints

  • Eugene A. Feinberg

Summary

For single-criterion stochastic control and sequential decision problems, optimal policies, if they exist, are typically nonrandomized. For problems with multiple criteria and constraints, optimal nonrandomized policies may not exist and, if optimal policies exist, they are typically randomized. In this paper we discuss certain conditions that lead to optimality of nonrandomized policies. In the most interesting situations, these conditions do not impose convexity assumptions on the action sets and reward functions.
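As a rough illustration of why constraints can force randomization, consider a toy problem that is not taken from the paper: a single state, two hypothetical actions "a" (reward 1, cost 1) and "b" (reward 0, cost 0), and an assumed bound of 0.5 on the expected cost per step. The sketch below compares only stationary policies, each described by the probability p of choosing "a". It says nothing about nonstationary nonrandomized policies, which are among the policies the paper is concerned with.

```python
# Toy constrained problem (illustrative assumption, not from the paper):
# one state, action "a" has reward 1 and cost 1, action "b" has reward 0 and cost 0.
REWARD = {"a": 1.0, "b": 0.0}
COST = {"a": 1.0, "b": 0.0}
COST_BOUND = 0.5  # assumed upper bound on expected cost per step


def evaluate(p):
    """Expected (reward, cost) per step when 'a' is chosen with probability p."""
    reward = p * REWARD["a"] + (1 - p) * REWARD["b"]
    cost = p * COST["a"] + (1 - p) * COST["b"]
    return reward, cost


def best_policy(candidates):
    """Among candidate probabilities, pick the feasible one with the highest reward."""
    feasible = [p for p in candidates if evaluate(p)[1] <= COST_BOUND]
    return max(feasible, key=lambda p: evaluate(p)[0])


# Nonrandomized stationary policies: always "b" (p = 0) or always "a" (p = 1).
best_deterministic = best_policy([0.0, 1.0])

# Randomized stationary policies on a grid of probabilities.
best_randomized = best_policy([i / 100 for i in range(101)])

print("best nonrandomized p:", best_deterministic, evaluate(best_deterministic))
print("best randomized p:", best_randomized, evaluate(best_randomized))
```

In this toy setting, always choosing "a" violates the cost bound and always choosing "b" earns nothing, so the best stationary policy mixes the two actions.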

Keywords

Optimal Policy, Markov Decision Process, Reward Function, Average Reward, Total Reward
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Eugene A. Feinberg (1)
  1. State University of New York at Stony Brook, Stony Brook, USA
