Optimality of Deterministic Policies for Certain Stochastic Control Problems with Multiple Criteria and Constraints
For single-criterion stochastic control and sequential decision problems, optimal policies, if they exist, are typically nonrandomized. For problems with multiple criteria and constraints, optimal nonrandomized policies may not exist and, if optimal policies exist, they are typically randomized. In this paper we discuss certain conditions that lead to optimality of nonrandomized policies. In the most interesting situations, these conditions do not impose convexity assumptions on the action sets and reward functions.
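The contrast between randomized and nonrandomized policies under a constraint can be seen in a toy problem. The sketch below is not taken from the paper: the one-state model, the action names `a` and `b`, the reward and cost values, and the bound `COST_BOUND` are all hypothetical, chosen only to show a constrained problem in which the best feasible nonrandomized policy is strictly worse than a randomized one.

```python
# Hypothetical one-state, two-action problem (illustrative only, not from the paper).
# Each step yields a reward and a cost; the constraint caps the expected cost per step.
REWARD = {"a": 1.0, "b": 0.0}
COST = {"a": 1.0, "b": 0.0}
COST_BOUND = 0.5


def evaluate(p_a):
    """Expected per-step (reward, cost) when action 'a' is chosen with probability p_a."""
    reward = p_a * REWARD["a"] + (1 - p_a) * REWARD["b"]
    cost = p_a * COST["a"] + (1 - p_a) * COST["b"]
    return reward, cost


def best_feasible(probabilities):
    """Return (p_a, reward) maximizing reward subject to cost <= COST_BOUND, or None."""
    feasible = [
        (p, evaluate(p)[0]) for p in probabilities if evaluate(p)[1] <= COST_BOUND
    ]
    return max(feasible, key=lambda pair: pair[1]) if feasible else None


# Nonrandomized policies pick one action with probability 1.
best_deterministic = best_feasible([0.0, 1.0])
# Randomized policies are searched on a grid of mixing probabilities.
best_randomized = best_feasible([k / 100 for k in range(101)])

print("best nonrandomized (p_a, reward):", best_deterministic)
print("best randomized (p_a, reward):", best_randomized)
```

In this toy model the action set consists of two isolated points, so no single action attains the intermediate cost level that the constraint allows; mixing the two actions does.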
Keywords: Optimal Policy, Markov Decision Process, Reward Function, Average Reward, Total Reward