Using Markov decision processes to optimize a nonlinear functional of the final distribution, with manufacturing applications

  • Conference paper
Stochastic Modelling in Innovative Manufacturing

Part of the book series: Lecture Notes in Economics and Mathematical Systems (LNE, volume 445)

Abstract

We consider manufacturing problems which can be modelled as finite horizon Markov decision processes for which the effective reward function is either a strictly concave or strictly convex functional of the distribution of the final state. Reward structures such as these often arise when penalty factors are incorporated into the usual expected reward objective function. For convex problems there is a Markov deterministic policy which is optimal, but for concave problems we usually have to consider the larger class of Markov randomised policies. In the natural formulation these problems cannot be solved directly by dynamic programming. We outline alternative iterative schemes for solution and show how they can be applied in a specific manufacturing example.
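
The contrast between the convex and concave cases can be illustrated with a toy problem. The sketch below is not taken from the paper: it assumes a hypothetical one-stage process with a single initial state and two actions, where action "a" leads to final state 0 and action "b" leads to final state 1, so a randomised policy that picks "b" with probability q yields the final distribution (1 - q, q). The reward functionals `concave_reward` and `convex_reward` are illustrative choices, not the ones used by the author.

```python
import numpy as np

def final_distribution(q):
    """Final-state distribution when action 'b' is chosen with probability q."""
    return np.array([1.0 - q, q])

def concave_reward(p, target=(0.5, 0.5)):
    """Illustrative strictly concave functional: negative squared distance to a target."""
    return -np.sum((p - np.asarray(target)) ** 2)

def convex_reward(p):
    """Illustrative strictly convex functional: sum of squared probabilities."""
    return np.sum(p ** 2)

def best_mixing_probability(reward, grid_size=101):
    """Grid search over q in [0, 1]; returns the q with the highest reward."""
    qs = np.linspace(0.0, 1.0, grid_size)
    values = [reward(final_distribution(q)) for q in qs]
    return qs[int(np.argmax(values))]

if __name__ == "__main__":
    # Concave case: the maximiser is an interior q, i.e. a randomised policy.
    print("concave:", best_mixing_probability(concave_reward))
    # Convex case: the maximiser is an endpoint, i.e. a deterministic policy.
    print("convex:", best_mixing_probability(convex_reward))
```

In this toy setting the concave functional is maximised by mixing the two actions, while the convex functional is maximised at an endpoint of [0, 1], which matches the abstract's statement about which policy classes need to be considered.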

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Collins, E.J. (1997). Using Markov decision processes to optimize a nonlinear functional of the final distribution, with manufacturing applications. In: Christer, A.H., Osaki, S., Thomas, L.C. (eds) Stochastic Modelling in Innovative Manufacturing. Lecture Notes in Economics and Mathematical Systems, vol 445. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-59105-1_3

  • DOI: https://doi.org/10.1007/978-3-642-59105-1_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-61768-6

  • Online ISBN: 978-3-642-59105-1

  • eBook Packages: Springer Book Archive
