Abstract
In applying the method of value iteration one frequently observes that the relative values for the n-th stage converge very rapidly with increasing n, whereas the absolute values converge slowly (discount factor β near one) or even diverge (β ≧ 1). This fact is used by MacQueen [10], [11] and others to give good bounds for the value of the infinite horizon problem and, in addition, to eliminate suboptimal actions in the early stages. This elimination can be improved by the use of an upper bound δ for the convergence rate. When δ < β the improvement has two effects: it reduces computing time, and it allows the method to be applied to the finite horizon case with some β ≧ 1.
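The baseline idea the abstract builds on can be sketched in a few lines of Python. This is only an illustration of the standard MacQueen-type bounds and suboptimality test for a maximizing, discounted problem with β < 1; it is not the improved procedure of this chapter, which additionally uses the bound δ and is not specified in the abstract. The function name, the data layout (`P[s][a]` as a list of transition probabilities, `r[s][a]` as a one-step reward), the tolerance, and the two-state example are all assumptions made for illustration.

```python
def value_iteration_with_elimination(P, r, beta, n_iter=50, tol=1e-12):
    """Value iteration with MacQueen-type bounds and an action test.

    Hypothetical helper: P[s][a] is a list of transition probabilities,
    r[s][a] the one-step reward. Assumes maximization and 0 <= beta < 1.
    """
    n_states = len(P)
    active = [set(range(len(P[s]))) for s in range(n_states)]
    v = [0.0] * n_states
    lower = upper = list(v)
    for _ in range(n_iter):
        # One value-iteration step over the actions not yet eliminated.
        v_new = [
            max(r[s][a] + beta * sum(p * v[t] for t, p in enumerate(P[s][a]))
                for a in active[s])
            for s in range(n_states)
        ]
        # The spread of the stage differences drives the bounds:
        # v_new + beta/(1-beta)*min(diff) <= v* <= v_new + beta/(1-beta)*max(diff)
        diff = [v_new[s] - v[s] for s in range(n_states)]
        scale = beta / (1.0 - beta)
        lower = [x + scale * min(diff) for x in v_new]
        upper = [x + scale * max(diff) for x in v_new]
        # An action is suboptimal if even an optimistic estimate of its
        # value stays below the lower bound for its state.
        for s in range(n_states):
            for a in list(active[s]):
                q_up = r[s][a] + beta * sum(
                    p * upper[t] for t, p in enumerate(P[s][a]))
                if q_up < lower[s] - tol:
                    active[s].discard(a)
        v = v_new
    return v, lower, upper, active


# Assumed toy problem: in each state, action 0 stays and action 1 switches.
P = [[[1.0, 0.0], [0.0, 1.0]],   # state 0
     [[0.0, 1.0], [1.0, 0.0]]]   # state 1
r = [[1.0, 0.0],                 # state 0: staying pays 1, switching pays 0
     [2.0, 0.0]]                 # state 1: staying pays 2, switching pays 0
v, lower, upper, active = value_iteration_with_elimination(P, r, beta=0.9)
print(lower, upper, active)
```

In this toy problem the stage differences become nearly equal across states after a few iterations, so the bounds close around the infinite-horizon value long before the absolute values `v` have converged, which is the effect the abstract describes.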
References
[1] Y. M. I. Dirickx: Deterministic discrete dynamic programming with discount factor greater than one: structure of optimal policies. Man. Sci. 20 (1973), 32–43.
[2] R. C. Grinold: Elimination of suboptimal actions in Markov decision problems. Operations Res. 21 (1973), 848–851.
[3] J. Hajnal: Weak ergodicity in non-homogeneous Markov chains. Proc. Cambr. Phil. Soc. 54 (1958), 233–246.
[4] N. A. J. Hastings, J. C. M. Mello: Tests for suboptimal actions in discounted Markov programming. Man. Sci. 19 (1973), 1019–1022.
[5] K. Hinderer: Estimates for finite state dynamic programs. J. Math. Anal. Appl. 55 (1976), 207–238.
[6] K. Hinderer, G. Hübner: An improvement of J. F. Shapiro’s turnpike theorem for the horizon of finite stage discrete dynamic programs. This volume 245–255.
[7] R. A. Howard: Dynamic programming and Markov processes. Wiley, New York 1960.
[8] W. S. Jewell: Markov renewal programming I + II. Operations Res. 11 (1963), 938–971.
[9] J. G. Kemeny, J. L. Snell: Finite Markov chains. Van Nostrand, Princeton, N. J. 1960.
[10] J. MacQueen: A modified dynamic programming method for Markovian decision problems. J. Math. Anal. Appl. 14 (1966), 38–43.
[11] J. MacQueen: A test for suboptimal actions in Markovian decision problems. Operations Res. 15 (1967), 559–561.
[12] Th. E. Morton: On the asymptotic convergence rate of cost differences for Markovian decision processes. Operations Res. 19 (1971), 244–248.
[13] J. L. Mott: Conditions for the ergodicity of non-homogeneous finite Markov chains. Proc. Roy. Soc. Edinburgh Sec. A 64 (1951), 369–380.
[14] E. L. Porteus: Some bounds for discounted sequential decision processes. Man. Sci. 18 (1971), 7–11.
[15] D. Reetz: A decision exclusion algorithm for a class of Markovian decision processes. Zeitschr. Operations Res. 20 (1976), 125–131.
[16] T. A. Sarymsakov: On inhomogeneous Markov chains. Doklady A.N.S.S.S.R. 120 (1958), 465–467.
[17] H. Schellhaas: Zur Extrapolation in Markoffschen Entscheidungsmodellen mit Diskontierung. Zeitschr. Operations Res. 18 (1974), 91–104.
[18] E. Seneta: Non-negative matrices. Allen & Unwin, London 1973.
[19] J. F. Shapiro: Turnpike planning horizons for a Markovian decision model. Man. Sci. 14 (1968), 292–300.
[20] D. J. White: Dynamic programming, Markov chains and the method of successive approximations. J. Math. Anal. Appl. 6 (1963), 373–376.
Copyright information
© 1977 ACADEMIA, Publishing House of the Czechoslovak Academy of Sciences, Prague
Cite this chapter
Hübner, G. (1977). Improved Procedures for Eliminating Suboptimal Actions in Markov Programming by the Use of Contraction Properties. In: Kožešnik, J. (ed.) Transactions of the Seventh Prague Conference on Information Theory, Statistical Decision Functions, Random Processes and of the 1974 European Meeting of Statisticians, vol 7A. Springer, Dordrecht. https://doi.org/10.1007/978-94-010-9910-3_27
DOI: https://doi.org/10.1007/978-94-010-9910-3_27
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-010-9912-7
Online ISBN: 978-94-010-9910-3
eBook Packages: Springer Book Archive