The functional equations of undiscounted denumerable state Markov renewal programming

  • Elke Mann

Abstract

In this paper we establish conditions for the existence of a finite solution of the functional equations
$$ g(i) \;=\; \max_{a \in A(i)} \;\sum_{j \in I} p(i,a,j)\, g(j), \qquad i \in I, $$
(1)
$$ v(i) \;=\; \max_{a \in A_0(i)} \left[\, r(i,a) - t(i,a)\, g(i) + \sum_{j \in I} p(i,a,j)\, v(j) \right], \qquad i \in I, $$
(2)
where $I$ is a denumerable set, $A(i)$ is a compact metric space for every $i \in I$, $0 < t(i,a) < \infty$, $-\infty < r(i,a) < \infty$, $p(i,a,j) \ge 0$ for all $i, j \in I$, $a \in A(i)$, and $\sum_{j \in I} p(i,a,j) = 1$ for all $i \in I$, $a \in A(i)$.
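In the usual reading, g(i) is the maximal gain rate and v(i) a relative value (bias) function, with A_0(i) taken as the set of actions attaining the maximum in (1). As a purely illustrative aid, not part of the paper, the following Python sketch checks whether a candidate pair (g, v) satisfies (1) and (2) on a toy model with two states and finitely many actions; the model data and the helper name `check_functional_equations` are hypothetical, and the paper itself treats denumerable I and compact metric A(i).

```python
# A minimal sketch (assumed toy data, not the paper's construction): check
# whether a candidate pair (g, v) satisfies equations (1) and (2) on a model
# with finitely many states and actions.

states = [0, 1]                      # finite stand-in for the state set I
actions = {0: ["a", "b"], 1: ["a"]}  # A(i): admissible actions in state i

# p[(i, a)][j]: transition probability to j, r[(i, a)]: expected one-step
# reward, t[(i, a)]: expected holding time (0 < t(i,a) < infinity).
p = {(0, "a"): [1.0, 0.0], (0, "b"): [0.5, 0.5], (1, "a"): [0.0, 1.0]}
r = {(0, "a"): 1.0, (0, "b"): 2.0, (1, "a"): 0.0}
t = {(0, "a"): 1.0, (0, "b"): 2.0, (1, "a"): 1.0}


def check_functional_equations(g, v, tol=1e-9):
    """Return True iff the candidate (g, v) satisfies (1) and (2) up to tol."""
    for i in states:
        # Equation (1): g(i) = max_{a in A(i)} sum_j p(i,a,j) g(j).
        rhs1 = {a: sum(p[(i, a)][j] * g[j] for j in states) for a in actions[i]}
        if abs(g[i] - max(rhs1.values())) > tol:
            return False
        # A_0(i) is read here as the set of maximizers in (1) (the usual
        # convention; the paper gives the precise definition).
        A0 = [a for a in actions[i] if abs(rhs1[a] - g[i]) <= tol]
        # Equation (2): v(i) = max_{a in A_0(i)} [r(i,a) - t(i,a)g(i) + sum_j p v(j)].
        rhs2 = max(
            r[(i, a)] - t[(i, a)] * g[i] + sum(p[(i, a)][j] * v[j] for j in states)
            for a in A0
        )
        if abs(v[i] - rhs2) > tol:
            return False
    return True


# State 0 under action "a" earns reward 1 per unit time, while state 1 is an
# absorbing zero-reward state, so the gain rates are g = (1, 0); with these
# gains the relative-value equation (2) holds e.g. with v = (0, 0).
print(check_functional_equations(g=[1.0, 0.0], v=[0.0, 0.0]))  # True
print(check_functional_equations(g=[0.0, 0.0], v=[0.0, 0.0]))  # False
```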

Keywords

Decision Rule · Markov Decision Process · Reward Function · Optimality Equation · Gain Rate

Copyright information

© Springer Science+Business Media New York 1986

Authors and Affiliations

  • Elke Mann
    1. Institut für Angewandte Mathematik, Universität Bonn, Bonn, Germany
