Abstract
In this introductory section we consider Blackwell optimality in Controlled Markov Processes (CMPs) with finite state and action spaces; for brevity, we call them finite models. We introduce the basic definitions, the Laurent-expansion technique, the lexicographical policy improvement, and the Blackwell optimality equation, which were developed at the early stage of the study of sensitive criteria in CMPs. We also mention some extensions and generalizations obtained afterwards for the case of a finite state space. Chapter 2 presents the algorithmic approach to Blackwell optimality for finite models; we refer to that chapter for computational methods, especially the linear programming method, which we do not introduce here.
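To make the notion concrete, the following minimal numerical sketch (not taken from the chapter; the model and all names are illustrative) shows why discounted values near β = 1 are more sensitive than the average-reward criterion. Two stationary policies in a three-state model both earn average reward 0, yet one dominates the other for every discount factor β < 1 and is therefore the Blackwell optimal choice.

```python
import numpy as np

# Illustrative 3-state example (states 0, 1, 2; state 2 is absorbing, reward 0).
# In state 0 there are two actions:
#   action a: reward 2, jump straight to the absorbing state 2
#   action b: reward 1, pass through state 1 (reward 1), then absorb
# Both stationary policies have average reward 0, so the average criterion
# cannot distinguish them; the discounted values as beta -> 1 can.

def discounted_value(P, r, beta):
    """Solve v = r + beta * P v, i.e. (I - beta * P) v = r."""
    n = len(r)
    return np.linalg.solve(np.eye(n) - beta * P, r)

P_a = np.array([[0., 0., 1.],   # policy a: state 0 -> state 2
                [0., 0., 1.],
                [0., 0., 1.]])
r_a = np.array([2., 1., 0.])

P_b = np.array([[0., 1., 0.],   # policy b: state 0 -> state 1 -> state 2
                [0., 0., 1.],
                [0., 0., 1.]])
r_b = np.array([1., 1., 0.])

for beta in [0.5, 0.9, 0.99, 0.999]:
    va = discounted_value(P_a, r_a, beta)[0]   # equals 2 for every beta
    vb = discounted_value(P_b, r_b, beta)[0]   # equals 1 + beta
    print(f"beta={beta}: v_a={va:.4f}, v_b={vb:.4f}")
```

Here v_a(0) = 2 and v_b(0) = 1 + β, so policy a is strictly better for every β < 1: it is Blackwell optimal, even though the average-reward criterion rates the two policies as equal.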
© 2002 Springer Science+Business Media New York
Cite this chapter
Hordijk, A., Yushkevich, A.A. (2002). Blackwell Optimality. In: Feinberg, E.A., Shwartz, A. (eds) Handbook of Markov Decision Processes. International Series in Operations Research & Management Science, vol 40. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0805-2_8
Print ISBN: 978-1-4613-5248-8
Online ISBN: 978-1-4615-0805-2