Multiagent Learning Paradigms

  • K. Tuyls
  • P. Stone
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10767)


“Perhaps a thing is simple if you can describe it fully in several different ways, without immediately knowing that you are describing the same thing” – Richard Feynman

This article examines multiagent learning from several paradigmatic perspectives, aiming to bring them together within one framework. We aim to provide a general definition of multiagent learning and to lay out the essential characteristics of the various paradigms systematically by dissecting multiagent learning into its main components. We show how these paradigms are related and how they describe similar learning processes from different perspectives, e.g., an individual (cognitive) learner vs. a population of (simple) learning agents.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. DeepMind, Paris, France
  2. University of Liverpool, Liverpool, UK
  3. University of Texas, Austin, USA
