Continuous Time Markov Decision Processes
This chapter discusses continuous time Markov decision processes (CTMDPs), where the state space and the action sets are all countable. First, we focus on the total reward criterion for a stationary model by applying the ideas and methods presented in Chapter 2 for DTMDPs; results similar to those in Chapter 2 are obtained. Then, we deal with a nonstationary model under the total reward criterion. By dividing the time axis into shorter intervals, we obtain the standard results, such as the optimality equation and the relationship between the optimality of a policy and the optimality equation. Finally, we study the average criterion for a stationary CTMDP model by transforming it into a DTMDP model, so that the results for DTMDPs can be applied directly to CTMDPs under the average criterion.
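The transformation mentioned for the average criterion can be illustrated by uniformization: choosing a constant Λ no smaller than every total transition rate, setting p(j|i,a) = q(j|i,a)/Λ for j ≠ i with the remaining probability as a self-loop, and keeping the reward function, which preserves the average reward. The sketch below is not from the chapter; the two-state repair model, its rates, and the relative value iteration routine are illustrative assumptions.

```python
# Hedged sketch: uniformize a tiny stationary CTMDP into a DTMDP, then
# solve the average-reward DTMDP by relative value iteration.
# All model data (states, rates, rewards) is made up for illustration.

def uniformize(rates, Lam):
    """Turn rates q(j|i,a) into DTMDP probabilities p(j|i,a) = q(j|i,a)/Lam
    for j != i, with the leftover probability mass as a self-loop at i."""
    probs = {}
    for (i, a), row in rates.items():
        p = {j: q / Lam for j, q in row.items()}
        p[i] = p.get(i, 0.0) + 1.0 - sum(row.values()) / Lam  # self-loop slack
        probs[(i, a)] = p
    return probs

def relative_value_iteration(states, actions, probs, reward, iters=2000):
    """Average-reward DP on the uniformized chain (unichain assumption).
    Returns the gain g, relative values h, and a greedy stationary policy."""
    h = {s: 0.0 for s in states}
    ref = states[0]
    g = 0.0
    for _ in range(iters):
        Th = {i: max(reward[(i, a)]
                     + sum(p * h[j] for j, p in probs[(i, a)].items())
                     for a in actions[i])
              for i in states}
        g = Th[ref] - h[ref]                      # gain estimate
        h = {i: Th[i] - Th[ref] for i in states}  # normalize h(ref) = 0
    policy = {i: max(actions[i],
                     key=lambda a: reward[(i, a)]
                     + sum(p * h[j] for j, p in probs[(i, a)].items()))
              for i in states}
    return g, h, policy

# Illustrative model: state 0 = machine working, state 1 = machine broken.
# Reward rates: earn 1 while working; fast repair costs 2, slow repair 0.5.
states = [0, 1]
actions = {0: ["run"], 1: ["fast", "slow"]}
rates = {(0, "run"): {1: 0.5},   # failure rate 0.5
         (1, "fast"): {0: 2.0},  # fast repair rate 2
         (1, "slow"): {0: 0.5}}  # slow repair rate 0.5
reward = {(0, "run"): 1.0, (1, "fast"): -2.0, (1, "slow"): -0.5}

Lam = 3.0  # any Lam >= max total rate works
probs = uniformize(rates, Lam)
g, h, policy = relative_value_iteration(states, actions, probs, reward)
print(g, policy)  # gain 0.4; fast repair is average-optimal
```

Here the optimality equation g + h(i) = max_a [r(i,a) + Σ_j p(j|i,a) h(j)] of the uniformized DTMDP is solved iteratively; with Λ strictly larger than some total rate, the self-loops make the chain aperiodic, so relative value iteration converges.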
Keywords: Discount Rate · Optimal Policy · Markov Decision Process · Optimality Equation · Average Criterion