Abstract
Entropy games and matrix multiplication games have been recently introduced by Asarin et al. They model the situation in which one player (Despot) wishes to minimize the growth rate of a matrix product, whereas the other player (Tribune) wishes to maximize it. We develop an operator approach to entropy games. This allows us to show that entropy games can be cast as stochastic mean payoff games in which some action spaces are simplices and payments are given by a relative entropy (Kullback-Leibler divergence). In this way, we show that entropy games with a fixed number of states belonging to Despot can be solved in polynomial time. This approach also allows us to solve these games by a policy iteration algorithm, which we compare with the spectral simplex algorithm developed by Protasov.
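A small numerical sketch may help to see why relative entropy payments over simplex-shaped action spaces arise naturally here. It checks the classical Gibbs variational identity, log Σ_j m_j exp(x_j) = max over probability vectors p of Σ_j p_j (x_j − log(p_j / m_j)), whose maximum is attained at p_j proportional to m_j exp(x_j). The function names and the sample data below are hypothetical and chosen only for illustration; how exactly this identity enters the reduction to stochastic mean payoff games is an assumption on our part, not a statement taken from the abstract.

```python
import math

def log_sum_exp(weights, x):
    """Return log(sum_j weights[j] * exp(x[j])) for positive weights, computed stably."""
    shift = max(x)
    return shift + math.log(sum(w * math.exp(v - shift) for w, v in zip(weights, x)))

def entropic_payoff(p, weights, x):
    """Return sum_j p[j] * (x[j] - log(p[j] / weights[j])), with 0 * log 0 taken as 0."""
    return sum(pj * (xj - math.log(pj / wj))
               for pj, wj, xj in zip(p, weights, x) if pj > 0)

def gibbs_distribution(weights, x):
    """Return the probability vector proportional to weights[j] * exp(x[j])."""
    shift = max(x)
    unnormalized = [w * math.exp(v - shift) for w, v in zip(weights, x)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical data: one row of a nonnegative matrix and a potential vector.
weights = [2.0, 1.0, 3.0]
x = [0.5, -1.0, 0.2]

value = log_sum_exp(weights, x)
p_star = gibbs_distribution(weights, x)
best = entropic_payoff(p_star, weights, x)                    # attains the maximum
uniform = entropic_payoff([1 / 3, 1 / 3, 1 / 3], weights, x)  # any other p does no better
```

The player choosing p acts on a simplex, and the term Σ_j p_j log(p_j / m_j) plays the role of a relative entropy penalty (a Kullback-Leibler divergence when the weights m sum to one).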
References
Anantharam, V., Borkar, V.S.: A variational formula for risk-sensitive reward. SIAM J. Control Optim. 55(2), 961–988 (2017). arXiv:1501.00676
Asarin, E., Cervelle, J., Degorre, A., Dima, C., Horn, F., Kozyakin, V.: Entropy games and matrix multiplication games. In: 33rd Symposium on Theoretical Aspects of Computer Science, STACS, Orléans, France, pp. 11:1–11:14 (2016)
Akian, M., Gaubert, S., Guterman, A.: Tropical polyhedra are equivalent to mean payoff games. Int. J. Algebra Comput. 22(1), 125001 (43 pages) (2012)
Akian, M., Gaubert, S., Grand-Clément, J., Guillaud, J.: The Operator Approach to Entropy Games. In: Vollmer, H., Vallée, B. (eds.) 34th Symposium on Theoretical Aspects of Computer Science (STACS 2017), volume 66 of Leibniz International Proceedings in Informatics (LIPIcs), pp. 6:1–6:14. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl (2017)
Akian, M., Gaubert, S., Nussbaum, R.: A Collatz-Wielandt characterization of the spectral radius of order-preserving homogeneous maps on cones. arXiv:1112.5968 (2011)
Andersson, D., Miltersen, P.B.: The complexity of solving stochastic games on graphs. In: Proceedings of ISAAC’09, number 5878 in LNCS, pp. 112–121. Springer (2009)
Borwein, J.M., Borwein, P.B.: On the complexity of familiar functions and numbers. SIAM Rev. 30(4), 589–601 (1988)
Baillon, J.B., Bruck, R.E.: Optimal rates of asymptotic regularity for averaged nonexpansive mappings. In: Tan, K. K. (ed.) Proceedings of the Second International Conference on Fixed Point Theory and Applications, pp. 27–66. World Scientific Press (1992)
Bolte, J., Gaubert, S., Vigeral, G.: Definable zero-sum stochastic games. Math. Oper. Res. 40(1), 171–191 (2014)
Bewley, T., Kohlberg, E.: The asymptotic theory of stochastic games. Math. Oper. Res. 1(3), 197–208 (1976)
Blondel, V.D., Nesterov, Y.: Polynomial-time computation of the joint spectral radius for some sets of nonnegative matrices. SIAM J. Matrix Anal. 31(3), 865–876 (2009)
Berman, A., Plemmons, R.J.: Nonnegative matrices in the mathematical sciences. Academic Press, New York (1994)
Chen, T., Han, T.: On the complexity of computing maximum entropy for Markovian models. In: 34th International Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2014, pp. 571–583, New Delhi (2014)
Crandall, M.G., Tartar, L.: Some relations between non expansive and order preserving maps. Proc. AMS 78(3), 385–390 (1980)
Donsker, M.D., Varadhan, R.: On a variational formula for the principal eigenvalue for operators with maximum principle. Proc. Nat. Acad. Sci. USA 72(3), 780–783 (1975)
Fleming, W.H., Hernández-Hernández, D.: Risk-sensitive control of finite state machines on an infinite horizon. I. SIAM J. Control Optim. 35(5), 1790–1810 (1997)
Fleming, W.H., Hernández-Hernández, D.: Risk-sensitive control of finite state machines on an infinite horizon. II. SIAM J. Control Optim. 37(4), 1048–1069 (electronic) (1999)
Gaubert, S., Gunawardena, J.: A non-linear hierarchy for discrete event dynamical systems. In: Proceedings of the Fourth Workshop on Discrete Event Systems (WODES98), pp. 249–254. IEEE, Cagliari (1998)
Gaubert, S., Gunawardena, J.: The Perron-Frobenius theorem for homogeneous, monotone functions. Trans. AMS 356(12), 4931–4950 (2004)
Grötschel, M., Lovász, L., Schrijver, A.: The ellipsoid method and its consequences in combinatorial optimization. Combinatorica 1(2), 169–197 (1981)
Gaubert, S., Stott, N.: A convergent hierarchy of non-linear eigenproblems to compute the joint spectral radius of nonnegative matrices. In: Proceedings of the 23rd International Symposium on Mathematical Theory of Networks and Systems (MTNS2018), Hong Kong (2018)
Gaubert, S., Vigeral, G.: A maximin characterization of the escape rate of nonexpansive mappings in metrically convex spaces. Math. Proc. Camb. Phil. Soc. 152, 341–363 (2012)
Hoffman, A.J., Karp, R.M.: On nonterminating stochastic games. Manag. Sci. J. Inst. Manag. Sci. Appl. Theory Ser. 12, 359–370 (1966)
Howard, R.A., Matheson, J.E.: Risk-sensitive Markov decision processes. Manag. Sci. 18(7), 356–369 (1972)
Hansen, T.D., Miltersen, P.B., Zwick, U.: Strategy iteration is strongly polynomial for 2-player turn-based stochastic games with a constant discount factor. In: Innovations in Computer Science 2011, pp. 253–263. Tsinghua University Press (2011)
Ishikawa, S.: Fixed points and iteration of a nonexpansive mapping in a Banach space. Proc. Amer. Math. Soc. 59(1), 65–71 (1976)
Kingman, J.F.C.: A convexity property of positive matrices. Quart. J. Math. Oxford Ser. 2(12), 283–284 (1961)
Kozyakin, V.: Hourglass alternative and the finiteness conjecture for the spectral characteristics of sets of non-negative matrices. Linear Algebra Appl. 489, 167–185 (2016)
Krasnosel’skiĭ, M. A.: Two remarks on the method of successive approximations. Uspekhi Matematicheskikh Nauk 10, 123–127 (1955)
Kullback, S.: Information theory and statistics. Dover Publications, Inc., Mineola (1997). Reprint of the second (1968) edition
Lemmens, B., Lins, B., Nussbaum, R., Wortel, M.: Denjoy-Wolff theorems for Hilbert’s and Thompson’s metric spaces. J. d’Anal. Math. 134, 671–718 (2018)
Lothaire, M.: Applied combinatorics on words. Cambridge, New York (2005)
Mann, W.R.: Mean value methods in iteration. Proc. Amer. Math. Soc. 4, 506–510 (1953)
Mertens, J.-F., Neyman, A.: Stochastic games. Internat. J. Game Theory 10(2), 53–66 (1981)
Müller, J.M.: Elementary functions: algorithms and implementation. Birkhäuser, Cambridge (2005)
Neyman, A.: Stochastic games and nonexpansive maps. In: Stochastic games and applications (Stony Brook, NY, 1999), volume 570 of NATO Sci. Ser. C Math. Phys. Sci., pp. 397–415. Kluwer Acad. Publ., Dordrecht (2003)
Nussbaum, R.D.: Convexity and log convexity for the spectral radius. Linear Algebra Appl. 73, 59–122 (1986)
Protasov, V. Yu.: Spectral simplex method. Math. Program. 156(1-2, Ser. A), 485–511 (2016)
Puterman, M.L.: Markov decision processes. Wiley, New York (2005)
Rothblum, U.G.: Multiplicative Markov decision chains. Math. Oper. Res. 9(1), 6–24 (1984)
Rump, S.M.: Polynomial minimum root separation. Math. Comput. 33(145), 327–336 (1979)
Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)
Sladký, K.: On Dynamic Programming Recursions for Multiplicative Markov Decision Chains, pp. 216–226. Springer, Berlin (1976)
van den Dries, L.: Tame topology and o-minimal structures, volume 248 of London Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge (1998)
van den Dries, L.: o-minimal structures and real analytic geometry. In: Current developments in mathematics, 1998 (Cambridge, MA), pp. 105–152. Int. Press, Somerville (1999)
Vigeral, G.: A zero-sum stochastic game with compact action sets and no asymptotic value. Dyn. Games Appl. 3(2), 172–186 (2013)
Whittle, P.: Optimization over time, I. Wiley, New York (1982)
Wilkie, A.J.: Model completeness results for expansions of the ordered field of real numbers by restricted Pfaffian functions and the exponential function. J. Amer. Math. Soc. 9(4), 1051–1094 (1996)
Ye, Y.: The simplex and policy-iteration methods are strongly polynomial for the Markov decision problem with a fixed discount rate. Math. Oper. Res. 36(4), 593–603 (2011)
Zijm, W.H.M.: Asymptotic expansions for dynamic programming recursions with general nonnegative matrices. J. Optim. Theory Appl. 54(1), 157–191 (1987)
Acknowledgments
An announcement of the present results appeared in the proceedings of STACS [4]. We are very grateful to the referees of this STACS paper, and to the referees of the present extended version, for their detailed comments, which helped us to improve this manuscript.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article is part of the Topical Collection on Special Issue on Theoretical Aspects of Computer Science (STACS 2017)
The authors were partially supported by the ANR through the MALTHY INS project, and by the Gaspard Monge corporate sponsorship Program (PGMO) of EDF, Orange, Thales and Fondation Mathématique Jacques Hadamard.
About this article
Cite this article
Akian, M., Gaubert, S., Grand-Clément, J. et al. The Operator Approach to Entropy Games. Theory Comput Syst 63, 1089–1130 (2019). https://doi.org/10.1007/s00224-019-09925-z