The underlying stochastic processes in DTMDPs are discrete-time Markov chains, where the decision epochs occur periodically and the lengths of the intervals between adjacent decision epochs play no role. Those in CTMDPs are continuous-time Markov chains, where a decision may be made at any time point. In this chapter, we study a stationary semi-Markov decision process (SMDP) model, where the underlying stochastic processes are semi-Markov processes. Here, each decision epoch coincides with a state transition epoch, and the time between epochs is random. We transform the SMDP model into a stationary DTMDP model for either the total reward criterion or the average criterion, similarly to the stationary CTMDP model with the average criterion discussed in Section 4.3. Thus, the results for DTMDPs can be used directly for SMDPs under the discounted criterion, the total reward criterion, and the average criterion.
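To illustrate the kind of transformation described above, the following is a minimal sketch (not the chapter's construction) for the discounted criterion: when sojourn times are deterministic, an SMDP with continuous-time discount rate α reduces to a DTMDP with state/action-dependent per-transition discount factors exp(-α·τ), which can then be solved by ordinary value iteration. All numbers, state/action sizes, and array names here are illustrative assumptions.

```python
import numpy as np

# Hypothetical two-state, two-action SMDP (all data illustrative).
# P[a, s, s'] : embedded transition probabilities of the semi-Markov process
# tau[a, s]   : deterministic sojourn time in state s under action a
# r[a, s]     : lump-sum reward received at the transition epoch
P = np.array([[[0.7, 0.3], [0.4, 0.6]],
              [[0.2, 0.8], [0.9, 0.1]]])
tau = np.array([[1.0, 2.0], [0.5, 1.5]])
r = np.array([[5.0, -1.0], [2.0, 3.0]])

alpha = 0.1  # continuous-time discount rate
# Effective per-transition discount factor: with deterministic sojourn
# times, E[exp(-alpha * sojourn)] = exp(-alpha * tau), so the SMDP
# becomes a DTMDP with varying discount factors.
beta = np.exp(-alpha * tau)

# Standard value iteration on the transformed DTMDP.
V = np.zeros(2)
for _ in range(1000):
    Q = r + beta * (P @ V)   # Q[a, s]: one-step lookahead values
    V_new = Q.max(axis=0)    # greedy over actions
    if np.abs(V_new - V).max() < 1e-10:
        break
    V = V_new
```

Since every entry of `beta` is strictly less than one, the Bellman operator of the transformed DTMDP remains a contraction, so the iteration converges to the optimal discounted value of the original SMDP under these assumptions.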
Copyright information
© 2008 Springer Science+Business Media, LLC
Cite this chapter
(2008). Semi-Markov Decision Processes. In: Markov Decision Processes With Their Applications. Advances in Mechanics and Mathematics, vol 14. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-36951-8_5
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-36950-1
Online ISBN: 978-0-387-36951-8