Using Markov decision processes to optimize a nonlinear functional of the final distribution, with manufacturing applications

Collins, E. J.

doi:10.1007/978-3-642-59105-1_3

Part of the book series: Lecture Notes in Economics and Mathematical Systems (LNE, volume 445)

Abstract

We consider manufacturing problems which can be modelled as finite horizon Markov decision processes for which the effective reward function is either a strictly concave or strictly convex functional of the distribution of the final state. Reward structures such as these often arise when penalty factors are incorporated into the usual expected reward objective function. For convex problems there is a Markov deterministic policy which is optimal, but for concave problems we usually have to consider the larger class of Markov randomised policies. In the natural formulation these problems cannot be solved directly by dynamic programming. We outline alternative iterative schemes for solution and show how they can be applied in a specific manufacturing example.
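
The contrast between convex and concave problems can be illustrated with a minimal sketch. It is not the chapter's algorithm or its manufacturing example: the three-state, one-period problem, the transition table `P`, the two functionals `concave` and `convex`, and the helper names `final_distribution` and `rule` are all hypothetical choices made here for illustration. The sketch propagates the state distribution forward under a Markov policy and compares the best deterministic rule with one randomised rule.

```python
from itertools import product

def final_distribution(p0, P, policy):
    """Propagate the state distribution forward under a Markov policy.

    p0[s]          : initial probability of state s
    P[a][s][s2]    : probability of moving from s to s2 under action a
    policy[t][s][a]: probability of choosing action a in state s at time t
    """
    dist = list(p0)
    n = len(p0)
    for decision_rule in policy:
        nxt = [0.0] * n
        for s, a, s2 in product(range(n), range(len(P)), range(n)):
            nxt[s2] += dist[s] * decision_rule[s][a] * P[a][s][s2]
        dist = nxt
    return dist

# Hypothetical toy problem: 3 states, horizon 1, start in state 0.
# Action 0 sends state 0 to state 1; action 1 sends state 0 to state 2.
P = [
    [[0, 1, 0], [0, 1, 0], [0, 0, 1]],
    [[0, 0, 1], [0, 1, 0], [0, 0, 1]],
]
p0 = [1.0, 0.0, 0.0]

def concave(d):
    """Strictly concave functional: negative squared distance to a target."""
    target = [0.0, 0.5, 0.5]
    return -sum((x - t) ** 2 for x, t in zip(d, target))

def convex(d):
    """Strictly convex functional: sum of squared probabilities."""
    return sum(x * x for x in d)

def rule(q):
    """Decision rule that picks action 1 with probability q in every state."""
    return [[1 - q, q]] * 3

# Only the choice in state 0 matters here, so q = 0 and q = 1 cover
# every distinct deterministic behaviour in this toy problem.
deterministic = [final_distribution(p0, P, [rule(q)]) for q in (0.0, 1.0)]
randomised = final_distribution(p0, P, [rule(0.5)])

best_det_concave = max(concave(d) for d in deterministic)
best_det_convex = max(convex(d) for d in deterministic)

print("concave: randomised", concave(randomised), "best deterministic", best_det_concave)
print("convex:  randomised", convex(randomised), "best deterministic", best_det_convex)
```

In this toy problem the randomised rule does strictly better than either deterministic rule on the concave functional, while a deterministic rule does best on the convex one, which matches the distinction drawn in the abstract. Because the objective depends on the whole final distribution rather than on an expected terminal reward, it does not decompose state by state, which is one way to see why the natural formulation resists direct dynamic programming.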

Author information

Authors and Affiliations

Department of Mathematics, University of Bristol, University Walk, Bristol, BS8 1TW, UK
E. J. Collins

Editor information

Editors and Affiliations

Department of Mathematics and Computer Science, University of Salford, Salford M5 4WT, Lancs., UK
Anthony H. Christer Ph.D.
Department of Industrial and Systems Engineering, Hiroshima University, Higashi-Hiroshima, 739, Japan
Shunji Osaki Ph.D.
Department of Business Studies, The University of Edinburgh, William Robertson Building 50 George Square, Edinburgh, EH8 9JY, UK
Lyn C. Thomas Ph.D.

Cite this paper

Collins, E.J. (1997). Using Markov decision processes to optimize a nonlinear functional of the final distribution, with manufacturing applications. In: Christer, A.H., Osaki, S., Thomas, L.C. (eds) Stochastic Modelling in Innovative Manufacturing. Lecture Notes in Economics and Mathematical Systems, vol 445. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-59105-1_3

DOI: https://doi.org/10.1007/978-3-642-59105-1_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61768-6
Online ISBN: 978-3-642-59105-1
eBook Packages: Springer Book Archive
