
On the Behaviour of the Optimal Value Operator of Dynamic Programming


Part of the book series: Lecture Notes in Economics and Mathematical Systems (LNE, volume 274)

Abstract

In this chapter we deal with the dynamic programming algorithm DPA for the finite-horizon stochastic dynamic programming problem SDP introduced in Section 4.6. Roughly speaking, SDP describes the following decision process. There is a discrete-time stochastic system whose state evolves in a Markovian way. At each stage the current state is observed, and on the basis of this information the decision maker selects an action. The chosen action influences both the immediate cost of that stage and the probability distribution of the next state. The objective is to minimize the expected costs, summed over all stages. For details on the SDP problem we refer the reader to Section 4.6, where a formal definition of a "policy" was also given; a policy can be seen as a complete specification of the choices the decision maker may make at every stage and in every state.
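The chapter itself treats SDP in a general measure-theoretic setting; purely as an illustration of the backward-induction idea behind the DPA, the sketch below computes the optimal values V_t(s) = min_a { c_t(s,a) + E[V_{t+1}(next state)] } for a finite-state, finite-action problem with horizon T. The finite-dimensional setup and all names (backward_induction, cost, trans) are illustrative assumptions, not taken from the text.

    import numpy as np

    def backward_induction(cost, trans, T):
        """Finite-horizon stochastic DP solved by backward induction.

        cost[t]  : array (S, A), immediate cost at stage t
        trans[t] : array (S, A, S); trans[t][s, a] is the probability
                   distribution of the next state given state s, action a
        Returns optimal values V (shape (T+1, S), with V[T] = 0) and a
        minimizing policy pi (shape (T, S), one action per stage/state).
        """
        n_states, n_actions = cost[0].shape
        V = np.zeros((T + 1, n_states))       # terminal cost taken to be 0
        pi = np.zeros((T, n_states), dtype=int)
        for t in range(T - 1, -1, -1):        # stages T-1, ..., 0
            # Q[s, a] = immediate cost + expected optimal cost-to-go
            Q = cost[t] + trans[t] @ V[t + 1]
            V[t] = Q.min(axis=1)              # the optimal value operator
            pi[t] = Q.argmin(axis=1)          # a minimizing action
        return V, pi

    # Tiny hypothetical example: 2 states, 2 actions, horizon 3
    rng = np.random.default_rng(0)
    T, S, A = 3, 2, 2
    cost = [rng.random((S, A)) for _ in range(T)]
    trans = [p / p.sum(axis=-1, keepdims=True)
             for p in (rng.random((S, A, S)) for _ in range(T))]
    V, pi = backward_induction(cost, trans, T)
    print(V[0], pi[0])   # optimal expected cost and first-stage actions

In this finite setting the minimum is attained trivially; the point of the chapter is precisely what happens to this value operator when state and action spaces are general and measurable-selection issues arise.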





Copyright information

© 1986 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Klein Haneveld, W.K. (1986). On the Behaviour of the Optimal Value Operator of Dynamic Programming. In: Duality in Stochastic Linear and Dynamic Programming. Lecture Notes in Economics and Mathematical Systems, vol 274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-51697-9_6


  • DOI: https://doi.org/10.1007/978-3-642-51697-9_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-16793-8

  • Online ISBN: 978-3-642-51697-9

