Journal of Mathematical Biology, Volume 79, Issue 6–7, pp 2237–2253

Adaptive learning in large populations

Misha Perepelitsa


Abstract

We consider the adaptive learning rule of Harley (J Theor Biol 89:611–633, 1981) for behavior selection in symmetric conflict games in large populations. This rule uses an organism's past, accumulated rewards as the predictor of its future behavior, and can be traced across many life forms, from bacteria to humans. We derive a partial differential equation for the distribution of agents over the space of stimuli to select particular strategies; this PDE describes the evolution of learning in heterogeneous populations. We analyze its solutions for symmetric \(2 \times 2\) games. It is found that in games with small residual stimuli, adaptive learning rules with a larger memory factor converge faster to the optimal outcome.
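The learning rule summarized above can be sketched in code. This is a minimal, illustrative reading of a relative-payoff-sum style update in the spirit of Harley (1981): each behavior carries a stimulus that decays by a memory factor \(m\) toward a residual value and is reinforced by the payoff of the behavior actually used, and the behavior is then chosen with probability proportional to its stimulus. The update form, function names, and default parameters here are assumptions for illustration, not the paper's exact model.

```python
import random

def rps_update(stimuli, chosen, payoff, m=0.9, residuals=(0.1, 0.1)):
    """One step of a relative-payoff-sum style update (sketch).

    Each stimulus decays by the memory factor m toward its residual
    value; the behavior that was actually used is reinforced by the
    payoff it earned this round.
    """
    new = []
    for i, s in enumerate(stimuli):
        s = m * s + (1 - m) * residuals[i]
        if i == chosen:
            s += payoff
        new.append(s)
    return new

def choose(stimuli, rng=random.random):
    """Pick a behavior with probability proportional to its stimulus."""
    total = sum(stimuli)
    u = rng() * total
    acc = 0.0
    for i, s in enumerate(stimuli):
        acc += s
        if u <= acc:
            return i
    return len(stimuli) - 1
```

With a small memory factor, old rewards are forgotten quickly; with \(m\) close to 1, the accumulated payoff history dominates, which is the regime in which the abstract reports faster convergence when residual stimuli are small.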


Keywords

Adaptive learning · Relative payoff sum · Symmetric games

Mathematics Subject Classification

35Q91 · 91A05 · 91A20



Acknowledgements

The author wishes to thank the anonymous referees for their patient reading of the manuscript and detailed comments, which helped to improve it in many ways.


References

  1. Benaim M, Hirsch MW (1999) Stochastic approximation algorithms with constant step size whose average is cooperative. Ann Appl Probab 9(1):216–241
  2. Börgers T, Sarin R (1997) Learning through reinforcement and replicator dynamics. J Econ Theory 77(1):1–14
  3. Bush RR, Mosteller F (1955) Stochastic models for learning. Wiley, New York
  4. Catania AC (1963) Concurrent performances: a baseline for the study of reinforcement magnitude. J Exp Anal Behav 6:299–300
  5. Chung S-H, Herrnstein RJ (1967) Choice and delay of reinforcement. J Exp Anal Behav 10:67–74
  6. Cross JG (1983) A theory of adaptive economic behavior. Cambridge University Press, Cambridge
  7. Domjan M, Burkhard B (1984) The principles of learning and behavior. Brooks/Cole Publishing Co., Monterey
  8. Erev I, Roth AE (1998) Predicting how people play games: reinforcement learning in experimental games with unique, mixed strategy equilibrium. Am Econ Rev 88:848–881
  9. Ferster CB, Skinner BF (1957) Schedules of reinforcement. Appleton-Century-Crofts, New York
  10. Fudenberg D, Levine D (1998) The theory of learning in games. MIT Press, Cambridge
  11. Fudenberg D, Takahashi S (2011) Heterogeneous beliefs and local information in stochastic fictitious play. Games Econ Behav 71:100–120
  12. Harley CB (1981) Learning the evolutionarily stable strategy. J Theor Biol 89:611–633
  13. Herrnstein RJ (1961) Relative and absolute strength of response as a function of frequency of reinforcement. J Exp Anal Behav 4:267–272
  14. Herrnstein RJ (1970) On the law of effect. J Exp Anal Behav 13:243–266
  15. Kahneman D, Tversky A (1984) Choices, values and frames. Am Psychol 39:341–350
  16. Luce RD (1959) Individual choice behavior: a theoretical analysis. Wiley, New York
  17. Maynard Smith J, Price GR (1973) The logic of animal conflict. Nature 246:15–18
  18. Mertikopoulos P, Sandholm WH (2016) Learning in games via reinforcement learning and regularization. Math Oper Res 41(4):1297–1324
  19. Nax HH, Perc M (2015) Directional learning and the provisioning of public goods. Sci Rep 5:8010
  20. Pareschi L, Toscani G (2014) Interacting multiagent systems: kinetic equations and Monte Carlo methods. Oxford University Press, Oxford
  21. Risken H (1992) The Fokker–Planck equation: methods of solution and applications. Springer, Berlin
  22. Roth AE, Erev I (1995) Learning in extensive-form games: experimental data and simple dynamics models in the intermediate term. Games Econ Behav 8:164–212
  23. Rustichini A (1999) Optimal properties of stimulus-response learning models. Games Econ Behav 29:230–244
  24. Sandholm WH (2010) Population games and evolutionary dynamics. MIT Press, Cambridge
  25. Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. MIT Press, Cambridge
  26. Taylor PD, Jonker LB (1978) Evolutionarily stable strategies and game dynamics. Math Biosci 40:145–156
  27. Thorndike EL (1898) Animal intelligence: an experimental study of the associative processes in animals. In: Baldwin JM, Cattell JM (eds) Psychological review. Series of monograph supplements, vol II, no 4 (whole no 8). The Macmillan Company, New York, London
  28. Traulsen A, Claussen JC, Hauert C (2005) Coevolutionary dynamics: from finite to infinite populations. Phys Rev Lett 95:238701
  29. Traulsen A, Claussen JC, Hauert C (2006) Coevolutionary dynamics in large, but finite populations. Phys Rev E 74:011901
  30. Zeeman EC (1981) Dynamics of the evolution of animal conflicts. J Theor Biol 89:249–270

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Mathematics, University of Houston, Houston, USA
