Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence

  • Johanne Cohen
  • Amélie Héliou
  • Panayotis Mertikopoulos
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10504)


This paper examines the problem of multi-agent learning in \(N\)-person non-cooperative games. For concreteness, we focus on the so-called “hedge” variant of the exponential weights (EW) algorithm, one of the most widely studied algorithmic schemes for regret minimization in online learning. In this multi-agent context, we show that (a) dominated strategies become extinct (a.s.); and (b) in generic games, pure Nash equilibria are attracting with high probability, even in the presence of uncertainty and noise of arbitrarily high variance. Moreover, if the algorithm’s step-size does not decay too fast, we show that these properties occur at a quasi-exponential rate, that is, much faster than the algorithm’s \(\mathcal{O}(1/\sqrt{T})\) worst-case regret guarantee would suggest.
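As a rough illustration of the setting described above, the following minimal Python sketch runs a textbook form of the hedge update for two players in a prisoner's dilemma, where cooperation is strictly dominated. It is not taken from the paper: the function names `softmax` and `run_hedge`, the payoff matrix, the Gaussian noise model, and the step-size schedule \(1/t^{\beta}\) with \(\beta = 0.5\) are all assumptions chosen for illustration.

```python
import math
import random


def softmax(scores):
    """Logit map: turn cumulative scores into a mixed strategy."""
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def run_hedge(payoff, rounds=2000, noise_std=1.0, beta=0.5, seed=0):
    """Two players run hedge against each other in a symmetric game.

    payoff[a][b] is a player's payoff for playing a against b.
    Each player observes its full payoff vector, corrupted by
    zero-mean Gaussian noise (an assumed feedback model).
    """
    rng = random.Random(seed)
    n_actions = len(payoff)
    scores = [[0.0] * n_actions for _ in range(2)]
    for t in range(1, rounds + 1):
        step = 1.0 / t ** beta  # assumed step-size schedule
        strategies = [softmax(s) for s in scores]
        # Each player samples a pure action from its mixed strategy.
        actions = [rng.choices(range(n_actions), weights=x)[0]
                   for x in strategies]
        for i in range(2):
            opponent_action = actions[1 - i]
            for a in range(n_actions):
                noisy = payoff[a][opponent_action] + rng.gauss(0.0, noise_std)
                scores[i][a] += step * noisy  # exponential-weights score update
    return [softmax(s) for s in scores]


# Prisoner's dilemma: action 0 = cooperate, action 1 = defect.
# Defect strictly dominates cooperate (5 > 3 and 1 > 0).
PAYOFF = [[3.0, 0.0], [5.0, 1.0]]
final_strategies = run_hedge(PAYOFF)
print(final_strategies)
```

In this toy run the probability each player assigns to the dominated action shrinks quickly, which is the qualitative behavior the abstract describes; the sketch makes no claim about the paper's precise assumptions or rates.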


Keywords: Dominated strategies, Exponential weights, Nash equilibrium, No-regret learning



This work was partially supported by the French National Research Agency (ANR) under grant no. ANR–16–CE33–0004–01 (ORACLESS) and the Huawei HIRP FLAGSHIP project ULTRON.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Johanne Cohen (1)
  • Amélie Héliou (2, 3)
  • Panayotis Mertikopoulos (4, corresponding author)
  1. LRI-CNRS, Université de Paris-Sud, Université Paris-Saclay, Orsay, France
  2. AMIB Project, Inria Saclay, Palaiseau, France
  3. LIX CNRS UMR 7161, Ecole Polytechnique, Université Paris-Saclay, Palaiseau, France
  4. Univ. Grenoble Alpes, CNRS, Grenoble INP, Inria, LIG, Grenoble, France
