Finite Horizon Markov Decision Problems
In this chapter we solve finite horizon Markov decision problems. We describe a policy evaluation algorithm and the Bellman equations, which are necessary and sufficient optimality conditions for Markov decision problems. We then construct optimal policies from the solution of the Bellman equations. We will see that the class of Markov deterministic policies, which are easier to handle, contains optimal policies under assumptions that are often satisfied in practice. Finally, we describe how optimal policies can be calculated using a backward induction algorithm. This chapter is based on [Put94], [Whi93], and [Der70].
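As a rough illustration of the backward induction idea mentioned above, the following Python sketch computes optimal values and a Markov deterministic policy for a small problem. It is not the chapter's own formulation: it assumes finite state and action sets, stationary rewards and transition probabilities, and the expected total reward criterion. The function name `backward_induction`, its parameters, and the two-state toy model are all hypothetical and chosen only for this example.

```python
def backward_induction(states, actions, reward, transition, terminal_reward, horizon):
    """Sketch of backward induction for a finite horizon Markov decision problem.

    Assumptions (not from the source): finite states and actions, rewards and
    transitions that do not depend on the decision epoch, total reward criterion.
    Returns (values, policy), where values[t][s] is the best expected total
    reward obtainable from epoch t in state s, and policy[t][s] is a maximizing action.
    """
    values = [dict() for _ in range(horizon + 1)]
    policy = [dict() for _ in range(horizon)]

    # Terminal condition: only the terminal reward remains at the last epoch.
    for s in states:
        values[horizon][s] = terminal_reward(s)

    # Work backwards in time, applying the Bellman equation at each epoch.
    for t in range(horizon - 1, -1, -1):
        for s in states:
            best_action, best_value = None, float("-inf")
            for a in actions(s):
                expected_future = sum(
                    p * values[t + 1][s_next]
                    for s_next, p in transition(s, a).items()
                )
                q = reward(s, a) + expected_future
                if q > best_value:
                    best_action, best_value = a, q
            values[t][s] = best_value
            policy[t][s] = best_action
    return values, policy


# Hypothetical toy model: two states; staying in state 1 earns reward 1.
states = [0, 1]

def actions(s):
    return ["stay", "move"]

def reward(s, a):
    return 1 if (s == 1 and a == "stay") else 0

def transition(s, a):
    # Deterministic transitions, written as {next_state: probability}.
    return {s: 1.0} if a == "stay" else {1 - s: 1.0}

def terminal_reward(s):
    return 0

values, policy = backward_induction(
    states, actions, reward, transition, terminal_reward, horizon=2
)
print(values[0], policy[0])
```

In this toy model, starting in state 0 with two decision epochs, the computed policy first moves to state 1 and then stays there. The policy depends only on the current epoch and state, which is what makes it Markov deterministic.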
Keywords: Optimal Policy, Bellman Equation, Induction Algorithm, Card Game, Total Reward