Abstract
In this chapter we deal with certain aspects of average reward optimality. It is assumed that the state space X is denumerably infinite, and that for each x ∈ X, the set A(x) of available actions is finite. It is possible to extend the theory to compact action sets, but at the expense of increased mathematical complexity. Finite action sets are sufficient for digitally implemented controls, and so we restrict our attention to this case.
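The framework described above can be made concrete with a small simulation. The sketch below is an illustration of my own construction, not taken from the chapter: it models a controlled single-server queue on the denumerable state space X = {0, 1, 2, …}, with a finite action set A(x) = {0, 1} (slow or fast service) at every state, and estimates the long-run average reward of a stationary policy by simulating the chain. All rates, costs, and names are hypothetical.

```python
import random

# Hypothetical controlled queue on X = {0, 1, 2, ...}.
# Each slot: one arrival with probability P_ARRIVAL; the action a in
# A(x) = {0, 1} selects a slow or fast service rate. The reward is the
# negative of holding cost plus service charge.

P_ARRIVAL = 0.4
SERVICE_RATE = {0: 0.3, 1: 0.7}   # per-slot completion probability
SERVICE_COST = {0: 0.0, 1: 2.0}   # per-slot charge for the chosen rate
HOLDING_COST = 1.0                # per customer per slot

def step(x, a, rng):
    """One transition of the controlled chain; returns (next_state, reward)."""
    reward = -(HOLDING_COST * x + SERVICE_COST[a])
    departure = 1 if (x > 0 and rng.random() < SERVICE_RATE[a]) else 0
    arrival = 1 if rng.random() < P_ARRIVAL else 0
    return x - departure + arrival, reward

def average_reward(policy, horizon=200_000, seed=0):
    """Estimate the long-run average reward of a stationary policy
    by averaging rewards along one long sample path from state 0."""
    rng = random.Random(seed)
    x, total = 0, 0.0
    for _ in range(horizon):
        a = policy(x)
        x, r = step(x, a, rng)
        total += r
    return total / horizon

# Two stationary policies: a threshold rule, and "always serve slowly".
fast_above_2 = lambda x: 1 if x > 2 else 0
always_slow = lambda x: 0
```

Under these (assumed) parameters the slow rate 0.3 is below the arrival rate 0.4, so the always-slow policy is unstable and its sample average reward diverges to minus infinity, while the threshold policy keeps the queue stable; comparing `average_reward(fast_above_2)` with `average_reward(always_slow)` illustrates why average reward optimality questions on denumerable state spaces hinge on stability of the controlled chain.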
© 2002 Springer Science+Business Media New York
Cite this chapter
Sennott, L.I. (2002). Average Reward Optimization Theory for Denumerable State Spaces. In: Feinberg, E.A., Shwartz, A. (eds) Handbook of Markov Decision Processes. International Series in Operations Research & Management Science, vol 40. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0805-2_5
DOI: https://doi.org/10.1007/978-1-4615-0805-2_5
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-5248-8
Online ISBN: 978-1-4615-0805-2
eBook Packages: Springer Book Archive