Abstract
Mixed criteria are linear combinations of standard criteria that cannot themselves be represented as standard criteria. Linear combinations of total discounted and average rewards, as well as linear combinations of total discounted rewards with different discount factors, are examples of mixed criteria. We discuss the structure of optimal policies, and algorithms for their computation, for problems with and without constraints.
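As an illustration of the second example above (a weighted discounted criterion with two discount factors), the value of a fixed stationary policy can be computed by solving the standard discounted evaluation equation separately for each discount factor and then combining the results linearly. The sketch below uses a made-up two-state, two-action MDP and weights of 0.5 each; all numerical data are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP (illustrative data only).
# P[a] is the transition matrix under action a; r[a] is the one-step reward vector.
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
     np.array([[0.5, 0.5], [0.6, 0.4]])]
r = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]

def discounted_value(policy, beta):
    """Total discounted value of a deterministic stationary policy:
    solve the linear evaluation equation v = r_pi + beta * P_pi v."""
    n = len(policy)
    P_pi = np.array([P[policy[s]][s] for s in range(n)])
    r_pi = np.array([r[policy[s]][s] for s in range(n)])
    return np.linalg.solve(np.eye(n) - beta * P_pi, r_pi)

# Mixed (weighted discounted) criterion:
#   V(policy) = lambda1 * V_{beta1}(policy) + lambda2 * V_{beta2}(policy)
policy = [0, 1]  # action chosen in each state
v_mixed = 0.5 * discounted_value(policy, 0.9) + 0.5 * discounted_value(policy, 0.5)
print(v_mixed)
```

Evaluation of a fixed policy decomposes this way precisely because the criterion is linear in the two discounted values; optimization, by contrast, generally requires the non-stationary or randomized policy structures discussed in the chapter.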
Copyright information
© 2002 Springer Science+Business Media New York
Cite this chapter
Feinberg, E.A., Shwartz, A. (2002). Mixed Criteria. In: Feinberg, E.A., Shwartz, A. (eds) Handbook of Markov Decision Processes. International Series in Operations Research & Management Science, vol 40. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0805-2_7
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-5248-8
Online ISBN: 978-1-4615-0805-2