Economic Theory, Volume 68, Issue 4, pp 907–934

Convergence results on stochastic adaptive learning

  • Naoki Funai
Research Article

Abstract


We investigate an adaptive learning model which nests several existing learning models such as payoff assessment learning, valuation learning, stochastic fictitious play learning, experience-weighted attraction learning and delta learning with foregone payoff information in normal form games. In particular, we consider adaptive players each of whom assigns payoff assessments to his own actions, chooses the action which has the highest assessment with some perturbations and updates the assessments using observed payoffs, which may include payoffs from unchosen actions. Then, we provide conditions under which the learning process converges to a quantal response equilibrium in normal form games.
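
The learning process described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the paper's model or proof: the perturbed choice is taken to be the logit rule (which arises when the perturbations are i.i.d. Gumbel), the step size is assumed to be 1/t, and the names `logit_choice`, `simulate`, `lam`, `delta` and the matching-pennies payoffs are hypothetical choices made for illustration. The parameter `delta` weights updates from foregone payoffs of unchosen actions; `delta = 0` uses only realized payoffs.

```python
import math
import random

# Hypothetical example game (matching pennies), not taken from the paper.
# PAYOFFS[i][a0][a1] is the payoff to player i when player 0 plays a0
# and player 1 plays a1.
MATCHING_PENNIES = [
    [[1.0, -1.0], [-1.0, 1.0]],   # player 0 wants to match
    [[-1.0, 1.0], [1.0, -1.0]],   # player 1 wants to mismatch
]


def logit_choice(assessments, lam):
    """Choice probabilities proportional to exp(lam * assessment).

    Assumption: this logit form stands in for 'highest assessment with
    some perturbations'; lam controls how sharp the choice is.
    """
    top = max(assessments)  # subtract the max for numerical stability
    weights = [math.exp(lam * (q - top)) for q in assessments]
    total = sum(weights)
    return [w / total for w in weights]


def simulate(payoffs, lam=2.0, delta=1.0, periods=20000, seed=0):
    """Run the assessment-updating process for a two-player game.

    Returns the final assessments and the choice probabilities they imply.
    """
    rng = random.Random(seed)
    n_actions = [len(payoffs[0]), len(payoffs[0][0])]
    q = [[0.0] * n_actions[0], [0.0] * n_actions[1]]
    for t in range(1, periods + 1):
        probs = [logit_choice(q[i], lam) for i in range(2)]
        chosen = [
            rng.choices(range(n_actions[i]), weights=probs[i])[0]
            for i in range(2)
        ]
        step = 1.0 / t  # assumed decreasing step size
        for i in range(2):
            for a in range(n_actions[i]):
                # Payoff player i gets (or would have got) from action a
                # against the opponent's realized action.
                profile = list(chosen)
                profile[i] = a
                payoff = payoffs[i][profile[0]][profile[1]]
                weight = 1.0 if a == chosen[i] else delta
                q[i][a] += step * weight * (payoff - q[i][a])
    return q, [logit_choice(q[i], lam) for i in range(2)]


if __name__ == "__main__":
    final_q, final_probs = simulate(MATCHING_PENNIES)
    print("assessments:", final_q)
    print("choice probabilities:", final_probs)
```

In matching pennies the logit quantal response equilibrium has both players mixing equally, so the printed choice probabilities can be compared against 0.5 as an informal check. Setting `delta` below 1 or to 0 changes how much weight unchosen actions receive and may change how the simulated process behaves.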


Keywords

Adaptive learning · Normal form games · Asynchronous stochastic approximation · Quantal response equilibrium

JEL Classification

C72 · D83



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Economics, Ryutsu Keizai University, Chiba, Japan
