Abstract
After a brief survey of iterative algorithms for general stochastic games, we concentrate on finite-step algorithms for two special classes of stochastic games: single-controller stochastic games and perfect information stochastic games. In single-controller games, the transition probabilities depend on the actions of the same player in all states. In perfect information stochastic games, in each state one of the players has exactly one action. Single-controller zero-sum games are efficiently solved by linear programming. Non-zero-sum single-controller stochastic games are reducible to linear complementarity problems (LCPs). In the discounted case they can be modified to fit into the so-called LCPs of Eaves's class L. In the undiscounted case the LCPs are reducible to Lemke's copositive-plus class. In either case Lemke's algorithm can be used to find a Nash equilibrium. For discounted zero-sum perfect information stochastic games, a policy improvement algorithm is presented. Many other classes of stochastic games with the orderfield property still await efficient finite-step algorithms.
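To make the perfect information structure concrete, the sketch below runs standard value iteration (successive approximation) on a discounted zero-sum game in which each state is controlled by a single player. It is an illustration of the iterative approach only, not the finite-step policy improvement algorithm presented in the chapter. The two-state game, the discount factor, and all names (`GAME`, `BETA`, `value_iteration`, `greedy_strategy`) are hypothetical and chosen for illustration.

```python
# Hypothetical toy game, not taken from the chapter.
# GAME[state] = (controller, [(reward, {next_state: probability}), ...])
# The controller is the only player with a real choice in that state;
# rewards are paid by the minimizer to the maximizer.
GAME = {
    0: ("max", [(1.0, {0: 1.0}), (2.0, {1: 1.0})]),
    1: ("min", [(0.0, {0: 1.0}), (3.0, {1: 1.0})]),
}
BETA = 0.5  # assumed discount factor


def action_value(reward, transitions, values, beta):
    """Immediate reward plus discounted expected continuation value."""
    return reward + beta * sum(p * values[t] for t, p in transitions.items())


def value_iteration(game, beta, tol=1e-12, max_iter=10_000):
    """Iterate the one-step operator until successive values differ by < tol.

    With perfect information the matrix game in each state has a single
    row or a single column, so its value is a plain max or min.
    """
    values = {s: 0.0 for s in game}
    for _ in range(max_iter):
        new_values = {}
        for s, (controller, actions) in game.items():
            candidates = [action_value(r, tr, values, beta) for r, tr in actions]
            new_values[s] = max(candidates) if controller == "max" else min(candidates)
        gap = max(abs(new_values[s] - values[s]) for s in game)
        values = new_values
        if gap < tol:
            break
    return values


def greedy_strategy(game, beta, values):
    """Pure stationary strategy: the controller's best action in each state."""
    strategy = {}
    for s, (controller, actions) in game.items():
        candidates = [action_value(r, tr, values, beta) for r, tr in actions]
        pick = max if controller == "max" else min
        strategy[s] = candidates.index(pick(candidates))
    return strategy


values = value_iteration(GAME, BETA)
strategy = greedy_strategy(GAME, BETA, values)
print(values, strategy)
```

In this toy game the iterates approach the values 8/3 in state 0 and 4/3 in state 1, with the maximizer choosing its second action in state 0 and the minimizer its first action in state 1. Value iteration only converges in the limit; a finite-step method would instead search over the finitely many pure stationary strategies.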
Copyright information
© 2003 Springer Science+Business Media New York
About this paper
Cite this paper
Raghavan, T.E.S. (2003). Finite-Step Algorithms for Single-Controller and Perfect Information Stochastic Games. In: Neyman, A., Sorin, S. (eds) Stochastic Games and Applications. NATO Science Series, vol 570. Springer, Dordrecht. https://doi.org/10.1007/978-94-010-0189-2_15
DOI: https://doi.org/10.1007/978-94-010-0189-2_15
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-1493-2
Online ISBN: 978-94-010-0189-2
eBook Packages: Springer Book Archive