Abstract
Chapter 3 deals with finite models, that is, continuous-time MDPs with a finite number of states and actions. The long-run expected average reward (AR) criterion and the n-bias (n=0,1,…) optimality criteria are introduced in Sect. 3.2. (Occasionally, we abbreviate expected average reward as EAR rather than expected AR.) For every n=0,1,…, formulas expressing the difference between the n-biases for any two policies are provided in Sect. 3.3. These formulas are used in Sect. 3.4 to characterize n-bias optimal policies. The policy iteration and the linear programming algorithms for computing optimal policies for each of the n-bias criteria are given in Sects. 3.5 and 3.6, respectively.
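As a rough illustration of the long-run expected average reward mentioned above, the sketch below evaluates one fixed stationary policy on a finite model. It assumes the policy induces an irreducible chain, so that a unique stationary distribution pi exists with pi Q = 0 and the average reward equals the pi-weighted reward rate. The names `stationary_distribution` and `average_reward`, the generator `Q`, and the reward vector `r` are made up for this example and are not the chapter's notation. The sketch does not implement the chapter's policy iteration or linear programming algorithms.

```python
from fractions import Fraction


def stationary_distribution(Q):
    """Solve pi Q = 0 with sum(pi) = 1 for a finite generator matrix Q.

    Assumption (not from the chapter): Q is the transition-rate matrix of an
    irreducible chain under one fixed stationary policy, so pi is unique.
    """
    n = len(Q)
    # Rows of A are the columns of Q (the equations pi Q = 0), with the last
    # equation replaced by the normalization sum(pi) = 1.
    A = [[Fraction(Q[j][i]) for j in range(n)] for i in range(n)]
    A[-1] = [Fraction(1)] * n
    b = [Fraction(0)] * (n - 1) + [Fraction(1)]

    # Gauss-Jordan elimination in exact rational arithmetic.
    for col in range(n):
        pivot = next(row for row in range(col, n) if A[row][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        inv = 1 / A[col][col]
        A[col] = [v * inv for v in A[col]]
        b[col] *= inv
        for row in range(n):
            if row != col and A[row][col] != 0:
                factor = A[row][col]
                A[row] = [x - factor * y for x, y in zip(A[row], A[col])]
                b[row] -= factor * b[col]
    return b


def average_reward(Q, r):
    """Long-run average reward of the fixed policy: sum_i pi(i) * r(i)."""
    pi = stationary_distribution(Q)
    return sum(p * Fraction(x) for p, x in zip(pi, r))


# Hypothetical two-state example: leave state 0 at rate 2, state 1 at rate 1;
# reward rate 3 is earned in state 0 and 0 in state 1.
Q = [[-2, 2], [1, -1]]
r = [3, 0]
print(stationary_distribution(Q))  # [Fraction(1, 3), Fraction(2, 3)]
print(average_reward(Q, r))        # 1
```

Exact fractions are used only to keep the small example free of rounding; a floating-point linear solver would be the usual choice for larger state spaces.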
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Guo, X., Hernández-Lerma, O. (2009). Average Optimality for Finite Models. In: Continuous-Time Markov Decision Processes. Stochastic Modelling and Applied Probability, vol 62. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02547-1_3
DOI: https://doi.org/10.1007/978-3-642-02547-1_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02546-4
Online ISBN: 978-3-642-02547-1
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)