# Adaptive learning in large populations

## Abstract

We consider the adaptive learning rule of Harley (J Theor Biol 89:611–633, 1981) for behavior selection in symmetric conflict games in large populations. This rule uses an organism's past, accumulated rewards as the predictor of its future behavior, and can be traced in many life forms from bacteria to humans. We derive a partial differential equation for the distribution of agents over the space of stimuli to select a particular strategy; this equation describes the evolution of learning in heterogeneous populations. We analyze the solutions of the PDE model for symmetric \(2 \times 2\) games. We find that in games with small residual stimuli, adaptive learning rules with a larger memory factor converge faster to the optimal outcome.
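To make the terms "accumulated rewards", "memory factor" and "residual stimuli" concrete, the following Python sketch simulates two agents playing a symmetric \(2 \times 2\) game under a relative-payoff-sum style update. It is an illustration under stated assumptions, not the model analyzed in the paper: the update form (new stimulus = memory × old stimulus + (1 − memory) × residual + payoff received), the function names, and the parameter values are all hypothetical, and payoffs are assumed non-negative so that stimuli remain positive.

```python
import random


def choice_probabilities(stimuli):
    """Probability of each strategy is its stimulus divided by the total."""
    total = sum(stimuli)
    return [s / total for s in stimuli]


def rps_update(stimuli, chosen, payoff, memory, residuals):
    """Assumed update: decay old stimulus, add residual, credit the payoff
    only to the strategy that was actually played."""
    updated = []
    for i, s in enumerate(stimuli):
        reward = payoff if i == chosen else 0.0
        updated.append(memory * s + (1.0 - memory) * residuals[i] + reward)
    return updated


def simulate(payoff_matrix, memory=0.9, residuals=(0.1, 0.1), steps=1000, seed=0):
    """Two learners repeatedly play a symmetric 2x2 game.

    payoff_matrix[i][j] is the (non-negative) payoff to a player using
    strategy i against an opponent using strategy j.
    """
    rng = random.Random(seed)
    a = list(residuals)  # stimuli of player A start at the residual level
    b = list(residuals)  # stimuli of player B
    for _ in range(steps):
        i = rng.choices([0, 1], weights=a)[0]
        j = rng.choices([0, 1], weights=b)[0]
        a = rps_update(a, i, payoff_matrix[i][j], memory, residuals)
        b = rps_update(b, j, payoff_matrix[j][i], memory, residuals)
    return choice_probabilities(a), choice_probabilities(b)


if __name__ == "__main__":
    # Hypothetical coordination-type payoffs, chosen only for illustration.
    game = [[2.0, 0.0], [0.0, 1.0]]
    print(simulate(game))
```

In this sketch a memory factor closer to 1 means past rewards decay more slowly, and small residuals mean the choice probabilities are driven almost entirely by accumulated payoffs.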

## Keywords

Adaptive learning, Relative payoff sum, Symmetric games

## Mathematics Subject Classification

35Q91, 91A05, 91A20

## Notes

### Acknowledgements

The author wishes to thank the anonymous referees for patient reading of the manuscript and detailed comments that helped to improve it in so many ways.

## References

- Benaim M, Hirsch MW (1999) Stochastic approximation algorithms with constant step size whose average is cooperative. Ann Appl Probab 9(1):216–241
- Borgers T, Sarin R (1997) Learning through reinforcement and replicator dynamics. J Econ Theory 77(1):1–14
- Bush RR, Mosteller F (1955) Stochastic models for learning. Wiley, New York
- Catania AC (1963) Concurrent performances. A baseline for the study of reinforcement magnitude. J Exp Anal Behav 6:299–300
- Chung S-H, Herrnstein RJ (1967) Choice and delay of reinforcement. J Exp Anal Behav 10:67–74
- Cross JG (1983) A theory of adaptive economic behavior. Cambridge University Press, Cambridge
- Domjan M, Burkhard B (1984) The principles of learning and behavior. Brooks/Cole Publishing Co., Monterey
- Erev I, Roth AE (1998) Predicting how people play games: reinforcement learning in experimental games with unique, mixed strategy equilibrium. Am Econ Rev 88:848–881
- Ferster CB, Skinner BF (1957) Schedules of reinforcement. Appleton-Century-Crofts, New York
- Fudenberg D, Levine D (1998) The theory of learning in games. MIT Press, Cambridge
- Fudenberg D, Takahashi S (2011) Heterogeneous beliefs and local information in stochastic fictitious play. Games Econ Behav 71:100–120
- Harley CB (1981) Learning the evolutionarily stable strategy. J Theor Biol 89:611–633
- Herrnstein RJ (1961) Relative and absolute strength of response as a function of frequency of reinforcement. J Exp Anal Behav 4:267–272
- Herrnstein RJ (1970) On the law of effect. J Exp Anal Behav 13:243–266
- Kahneman D, Tversky A (1984) Choices, values and frames. Am Psychol 39:341–350
- Luce RD (1959) Individual choice behavior: a theoretical analysis. Wiley, New York
- Maynard Smith J, Price GR (1973) The logic of animal conflict. Nat Lond 246:15–18
- Mertikopoulos P, Sandholm WH (2016) Learning in games via reinforcement learning and regularization. Math Oper Res INFORMS 41(4):1297–1324
- Nax HH, Perc M (2015) Directional learning and the provisioning of public goods. Sci Rep 5:8010
- Pareschi L, Toscani G (2014) Interacting multiagent systems: kinetic equations and Monte Carlo methods. Oxford University Press, Oxford
- Risken H (1992) The Fokker–Planck equation: methods of solution and applications. Springer, Berlin
- Roth AE, Erev I (1995) Learning in extensive-form games: experimental data and simple dynamics models in the intermediate term. Games Econ Behav 8:164–212
- Rustichini A (1999) Optimal properties of stimulus-response learning models. Games Econ Behav 29:230–244
- Sandholm WH (2010) Population games and evolutionary dynamics. MIT Press, Cambridge
- Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. MIT Press, Cambridge
- Taylor PD, Jonker IN (1978) Evolutionarily stable strategies. Math Biosci 40:145–56
- Thorndike EL (1898) Animal intelligence: an experimental study of the associative processes in animals. In: Baldwin JM, Cattell JM, with the cooperation of [other] (eds) Psychological review. Series of monograph supplements, vol II, no 4 (whole no 8). The Macmillan Company, New York, London
- Traulsen A, Claussen JC, Hauert C (2005) Coevolutionary dynamics: from finite to infinite populations. Phys Rev Lett 95:238701
- Traulsen A, Claussen JC, Hauert C (2006) Coevolutionary dynamics in large, but finite populations. Phys Rev E 74:011901
- Zeeman EC (1981) Dynamics of the evolution of animal conflicts. J Theor Biol 89:249–70