# The functional equations of undiscounted denumerable state Markov renewal programming

Elke Mann

## Abstract

This paper establishes conditions for the existence of a finite solution of the functional equations
$$g(i) = \max_{a \in A(i)} \sum_{j \in I} p(i,a,j)\, g(j), \quad i \in I, \tag{1}$$
$$v(i) = \max_{a \in A_o(i)} \left[ r(i,a) - t(i,a)\, g(i) + \sum_{j \in I} p(i,a,j)\, v(j) \right], \quad i \in I, \tag{2}$$
where I is a denumerable set, A(i) is a compact metric space for all i ∈ I, 0 < t(i,a) < ∞, −∞ < r(i,a) < ∞, p(i,a,j) ⩾ 0 for all i, j ∈ I, a ∈ A(i), and $$\sum_{j \in I} p(i,a,j) = 1$$ for all i ∈ I, a ∈ A(i).
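As a rough illustration of what a finite solution of the two equations looks like, the following sketch checks a candidate pair (g, v) on a toy model. It rests on several assumptions that are not taken from the paper: the state set is finite rather than denumerable, each A(i) is a finite list of actions, and A_o(i) is read as the set of actions attaining the maximum in the first equation (a common convention, but the abstract does not define it). The function name `equation_residuals` and the two-state data are hypothetical.

```python
from fractions import Fraction as F


def equation_residuals(actions, g, v):
    """Return (res1, res2): left side minus right side of each equation, per state.

    actions[i] is a list of (r, t, p) triples, where p maps successor
    states j to transition probabilities p(i, a, j).
    """
    res1, res2 = {}, {}
    for i, acts in actions.items():
        # Right side of the first equation: max over a of sum_j p(i,a,j) g(j).
        drift = [sum(q * g[j] for j, q in p.items()) for (_, _, p) in acts]
        best = max(drift)
        res1[i] = g[i] - best
        # ASSUMPTION: A_o(i) is the set of maximizers of the first equation.
        # Exact comparison is safe here only because Fractions are used.
        maximizers = [a for a, d in zip(acts, drift) if d == best]
        rhs = max(
            r - t * g[i] + sum(q * v[j] for j, q in p.items())
            for (r, t, p) in maximizers
        )
        res2[i] = v[i] - rhs
    return res1, res2


# Hypothetical toy model: state 0 has two actions, state 1 has one.
actions = {
    0: [(F(1), F(1), {1: F(1)}),   # reward 1, duration 1, move to state 1
        (F(0), F(1), {0: F(1)})],  # reward 0, duration 1, stay in state 0
    1: [(F(3), F(2), {0: F(1)})],  # reward 3, duration 2, move to state 0
}
# Candidate solution: the cycle 0 -> 1 -> 0 earns 4 over 3 time units.
g = {0: F(4, 3), 1: F(4, 3)}
v = {0: F(0), 1: F(1, 3)}

res1, res2 = equation_residuals(actions, g, v)
print(res1, res2)
```

Both residual dictionaries vanish for this candidate, so (g, v) solves the pair of equations on the toy model. With floating-point data the equality test that selects the maximizers would need a tolerance instead.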

## Keywords

Decision Rule · Markov Decision Process · Reward Function · Optimality Equation · Gain Rate
