Abstract
In this chapter we study the dynamic programming algorithm DPA for the finite-horizon stochastic dynamic programming problem SDP introduced in Section 4.6. Roughly speaking, SDP describes the following decision process. There is a discrete-time stochastic system whose state evolves in a Markovian way. At each stage the current state is observed, and on the basis of this information the decision maker selects an action. The chosen action influences both the immediate cost of that stage and the probability distribution of the next state. The objective is to minimize the expected costs, summed over all stages. For details on the SDP problem we refer the reader to Section 4.6, where a formal definition of a “policy” is also given; a policy may be seen as a complete specification of all the choices the decision maker could make at any stage and in any state.
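The decision process sketched above lends itself to backward induction: starting from the final stage, the optimal expected cost-to-go is computed stage by stage. The following is a minimal illustrative sketch, not the chapter's formal treatment; the states, actions, cost function, and transition kernel are hypothetical placeholders standing in for the data specified in Section 4.6.

```python
def backward_induction(horizon, states, actions, cost, trans):
    """Finite-horizon backward induction (illustrative sketch).

    cost(t, s, a)  -> immediate cost at stage t in state s under action a
    trans(t, s, a) -> dict {next_state: probability} (the Markovian kernel)
    Returns value functions V[t][s] and an optimal policy pi[t][s].
    """
    # Terminal values: no cost is incurred after the last stage.
    V = [dict() for _ in range(horizon + 1)]
    pi = [dict() for _ in range(horizon)]
    for s in states:
        V[horizon][s] = 0.0
    # Recurse backwards: at each stage pick the action minimizing
    # immediate cost plus expected cost-to-go.
    for t in range(horizon - 1, -1, -1):
        for s in states:
            best_a, best_v = None, float("inf")
            for a in actions:
                v = cost(t, s, a) + sum(
                    q * V[t + 1][s2] for s2, q in trans(t, s, a).items()
                )
                if v < best_v:
                    best_a, best_v = a, v
            V[t][s], pi[t][s] = best_v, best_a
    return V, pi
```

A policy in the sense of the abstract is exactly the table `pi`: it prescribes a choice for every stage and every state the decision maker could encounter.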
© 1986 Springer-Verlag Berlin Heidelberg
Klein Haneveld, W.K. (1986). On the Behaviour of the Optimal Value Operator of Dynamic Programming. In: Duality in Stochastic Linear and Dynamic Programming. Lecture Notes in Economics and Mathematical Systems, vol 274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-51697-9_6
Print ISBN: 978-3-540-16793-8
Online ISBN: 978-3-642-51697-9