Reinforcement Learning of Coordination in Heterogeneous Cooperative Multi-agent Systems

  • Spiros Kapetanakis
  • Daniel Kudenko
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3394)


Most approaches to the learning of coordination in multi-agent systems (MAS) to date require all agents to use the same learning algorithm with similar (or even identical) parameter settings. In today's open, highly inter-connected networks, such an assumption is increasingly unrealistic: developers have less and less control over the agents that join a system and the learning algorithms those agents employ. This makes effective coordination and good learning performance extremely difficult to achieve, especially in the absence of learning agent standards. In this paper we investigate the problem of learning to coordinate with heterogeneous agents. We show that an agent employing FMQ, a recently developed multi-agent learning algorithm, is able to converge towards the optimal joint action when teamed up with one or more simple Q-learners. Specifically, we show such convergence in scenarios where simple Q-learners alone are unable to converge towards an optimum. Our results show that system designers may improve learning and coordination performance by adding a "smart" agent to the MAS.
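The FMQ heuristic referred to in the abstract (introduced by Kapetanakis and Kudenko, reference 6) biases an agent's action evaluation toward actions whose maximum observed reward occurs frequently, evaluating actions by EV(a) = Q(a) + c · freq(maxR(a)) · maxR(a) rather than by Q(a) alone. The sketch below pairs one FMQ learner with one plain Q-learner in the climbing game of Claus and Boutilier (reference 3), the kind of heterogeneous team the paper studies. The learning rate, the weight c, and the temperature schedule are illustrative assumptions, not the authors' experimental settings.

```python
import math
import random

# Climbing game payoff matrix (Claus and Boutilier, 1998); both agents
# receive the same reward. The optimal joint action is (0, 0), reward 11.
PAYOFF = [[11, -30, 0],
          [-30, 7, 6],
          [0, 0, 5]]

def boltzmann(values, temp):
    """Sample an action index from a Boltzmann distribution over values."""
    m = max(values)  # subtract the max to avoid overflow in exp()
    weights = [math.exp((v - m) / temp) for v in values]
    r = random.random() * sum(weights)
    for a, w in enumerate(weights):
        r -= w
        if r <= 0:
            return a
    return len(values) - 1

class QLearner:
    """Plain independent Q-learner for a single-stage game."""
    def __init__(self, n_actions=3, alpha=0.1):
        self.q = [0.0] * n_actions
        self.alpha = alpha

    def evals(self):
        return self.q

    def act(self, temp):
        return boltzmann(self.evals(), temp)

    def update(self, action, reward):
        self.q[action] += self.alpha * (reward - self.q[action])

class FMQLearner(QLearner):
    """FMQ learner: evaluates each action by Q(a) plus a bonus
    c * freq(maxR(a)) * maxR(a), where maxR(a) is the highest reward
    observed for a and freq its empirical frequency. c=10 is an
    illustrative choice."""
    def __init__(self, n_actions=3, alpha=0.1, c=10.0):
        super().__init__(n_actions, alpha)
        self.c = c
        self.max_r = [-math.inf] * n_actions
        self.count = [0] * n_actions      # times each action was taken
        self.max_count = [0] * n_actions  # times its max reward was seen

    def evals(self):
        return [q + self.c * (mc / n if n else 0.0) * (m if n else 0.0)
                for q, m, n, mc in zip(self.q, self.max_r,
                                       self.count, self.max_count)]

    def update(self, action, reward):
        super().update(action, reward)
        self.count[action] += 1
        if reward > self.max_r[action]:
            self.max_r[action] = reward
            self.max_count[action] = 1
        elif reward == self.max_r[action]:
            self.max_count[action] += 1

def run(agent1, agent2, moves=2000):
    """Play repeated climbing games; return the final greedy joint action."""
    for t in range(moves):
        temp = max(0.02, 50.0 * math.exp(-0.006 * t))  # decaying temperature
        a1, a2 = agent1.act(temp), agent2.act(temp)
        reward = PAYOFF[a1][a2]
        agent1.update(a1, reward)
        agent2.update(a2, reward)
    e1, e2 = agent1.evals(), agent2.evals()
    return e1.index(max(e1)), e2.index(max(e2))
```

Running `run(FMQLearner(), QLearner())` simulates the heterogeneous team from the paper; replacing the FMQ agent with a second plain `QLearner` gives the homogeneous baseline that tends to settle on the safer but suboptimal actions because of the -30 miscoordination penalties.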


Keywords (machine-generated, not supplied by the authors): Reinforcement Learning · Joint Action · Multiagent System · Baseline Experiment · Markov Game


References

  1. Boutilier, C.: Sequential optimality and coordination in multiagent systems. In: Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI 1999), pp. 478–485 (1999)
  2. Chalkiadakis, G., Boutilier, C.: Coordination in multiagent reinforcement learning: A Bayesian approach. In: Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems, Melbourne, Australia, pp. 709–716 (2003)
  3. Claus, C., Boutilier, C.: The dynamics of reinforcement learning in cooperative multiagent systems. In: Proceedings of the Fifteenth National Conference on Artificial Intelligence, pp. 746–752 (1998)
  4. Fudenberg, D., Levine, D.K.: The Theory of Learning in Games. MIT Press, Cambridge (1998)
  5. Kaelbling, L.P., Littman, M., Moore, A.W.: Reinforcement learning: A survey. Journal of Artificial Intelligence Research 4 (1996)
  6. Kapetanakis, S., Kudenko, D.: Reinforcement learning of coordination in cooperative multi-agent systems. In: Proceedings of the Eighteenth National Conference on Artificial Intelligence, Edmonton, Alberta, Canada, pp. 326–331 (2002)
  7. Kapetanakis, S., Kudenko, D., Strens, M.: Learning to coordinate using commitment sequences in cooperative multi-agent systems. In: Proceedings of the Third Symposium on Adaptive Agents and Multi-agent Systems (AAMAS 2003), Society for the Study of Artificial Intelligence and Simulation of Behaviour (2003)
  8. Lauer, M., Riedmiller, M.: An algorithm for distributed reinforcement learning in cooperative multi-agent systems. In: Proceedings of the Seventeenth International Conference on Machine Learning (2000)
  9. Littman, M.L.: Markov games as a framework for multi-agent reinforcement learning. In: Proceedings of the Eleventh International Conference on Machine Learning, pp. 157–163. Morgan Kaufmann, San Mateo (1994)
  10. Sen, S., Sekaran, M.: Individual learning of coordination knowledge. JETAI 10(3), 333–356 (1998)
  11. Verbeeck, K., Nowe, A., Tuyls, K.: Coordinated exploration in stochastic common interest games. In: Proceedings of the Third Symposium on Adaptive Agents and Multi-agent Systems, pp. 97–102. University of Wales, Aberystwyth (2003)
  12. Wang, X., Sandholm, T.: Reinforcement learning to play an optimal Nash equilibrium in team Markov games. In: Proceedings of the 16th Neural Information Processing Systems: Natural and Synthetic Conference, Vancouver, Canada (2002)

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Spiros Kapetanakis¹
  • Daniel Kudenko¹

  1. Department of Computer Science, University of York, Heslington, York, UK
