Abstract
In this paper, we show how the dynamics of Q-learning can be visualized and analyzed from the perspective of Evolutionary Dynamics (ED). More specifically, we show how ED can be used as a model for Q-learning in stochastic games. Analysis of the evolutionarily stable strategies and attractors of the ED derived from the Reinforcement Learning (RL) application then predicts the parameter settings needed for RL in Multi-Agent Systems (MASs) to reach Nash equilibria with high utility.
Secondly, we show how the fine-tuning of parameter settings derived from the ED can support application of the COllective INtelligence (COIN) framework. COIN is a proven engineering approach for learning cooperative tasks in MASs. We show that the derived link between ED and RL predicts the performance of the COIN framework and visualizes the incentives COIN provides toward cooperative behavior.
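To make the idea of using ED as a model for Q-learning concrete, the following is a minimal sketch and not the paper's own derivation. It assumes Boltzmann (softmax) action selection and a selection-mutation form of the replicator dynamics that is commonly given for that setting: a selection term scaled by alpha/tau plus an entropy-like mutation term scaled by alpha. The payoff matrices, the parameter values, and the function names `q_dynamics_step` and `simulate` are illustrative assumptions.

```python
import math


def q_dynamics_step(x, y, A, B, alpha, tau, dt):
    """One Euler step of an assumed two-player selection-mutation dynamics.

    x, y  : mixed strategies of the row and column player (interior points)
    A, B  : payoff matrices, A[i][j] for the row player, B[i][j] for the column player
    alpha : learning rate (assumed to only rescale time here)
    tau   : Boltzmann temperature (assumed exploration parameter)
    """

    def derivative(p, payoffs):
        avg = sum(pi * fi for pi, fi in zip(p, payoffs))
        d = []
        for i, pi in enumerate(p):
            # Selection: actions that beat the average payoff grow.
            selection = (alpha / tau) * pi * (payoffs[i] - avg)
            # Mutation: exploration pulls the strategy toward uniform.
            mutation = alpha * pi * sum(pj * math.log(pj / pi) for pj in p)
            d.append(selection + mutation)
        return d

    # Expected payoff of each row action against y, and each column action against x.
    fx = [sum(A[i][j] * y[j] for j in range(len(y))) for i in range(len(x))]
    fy = [sum(B[i][j] * x[i] for i in range(len(x))) for j in range(len(y))]

    new_x = [xi + dt * di for xi, di in zip(x, derivative(x, fx))]
    new_y = [yi + dt * di for yi, di in zip(y, derivative(y, fy))]
    return new_x, new_y


def simulate(x, y, A, B, alpha=1.0, tau=0.5, dt=0.01, steps=5000):
    """Iterate the Euler step and return the final strategy pair."""
    for _ in range(steps):
        x, y = q_dynamics_step(x, y, A, B, alpha, tau, dt)
    return x, y


# Hypothetical 2x2 coordination game: both players prefer to match on action 0.
A = [[2.0, 0.0], [0.0, 1.0]]
B = [[2.0, 0.0], [0.0, 1.0]]
x_final, y_final = simulate([0.6, 0.4], [0.6, 0.4], A, B)
print(x_final, y_final)
```

In this sketch the temperature `tau` sets the balance between exploitation and exploration: lowering it moves the rest point closer to a pure strategy, while raising it pulls the rest point toward the uniform strategy. Plotting the vector field returned by the derivative over a grid of `(x[0], y[0])` values is one way to visualize such attractors.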
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
’t Hoen, P.J., Tuyls, K. (2004). Analyzing Multi-agent Reinforcement Learning Using Evolutionary Dynamics. In: Boulicaut, JF., Esposito, F., Giannotti, F., Pedreschi, D. (eds) Machine Learning: ECML 2004. Lecture Notes in Computer Science, vol 3201. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30115-8_18
DOI: https://doi.org/10.1007/978-3-540-30115-8_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23105-9
Online ISBN: 978-3-540-30115-8
eBook Packages: Springer Book Archive