Baselines for Joint-Action Reinforcement Learning of Coordination in Cooperative Multi-agent Systems

  • Martin Carpenter
  • Daniel Kudenko
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3394)


A common assumption for the study of reinforcement learning of coordination is that agents can observe each other’s actions (so-called joint-action learning). We present in this paper a number of simple joint-action learning algorithms and show that they perform very well when compared against more complex approaches such as OAL [1], while still maintaining convergence guarantees. Based on the empirical results, we argue that these simple algorithms should be used as baselines for any future research on joint-action learning of coordination.
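The paper's references do not include its algorithmic details on this page, but the family of methods it evaluates is well known: a joint-action learner keeps Q-values over *joint* actions and a model of its partner's behaviour, in the style of Claus and Boutilier [3]. The sketch below is an illustrative reconstruction of that baseline idea, not the authors' exact algorithms; the payoff matrix, hyperparameters, and class names are assumptions for the example.

```python
import random
from collections import defaultdict

# Hypothetical 2-agent cooperative matrix game (payoffs chosen for
# illustration only; they are not taken from the paper).
PAYOFF = {
    (0, 0): 11, (0, 1): -30, (0, 2): 0,
    (1, 0): -30, (1, 1): 7,  (1, 2): 6,
    (2, 0): 0,   (2, 1): 0,  (2, 2): 5,
}
ACTIONS = [0, 1, 2]

class JointActionLearner:
    """Joint-action Q-learner in the style of Claus & Boutilier [3]:
    Q-values over joint actions plus an empirical frequency model of
    the partner's actions (both agents observe each other's moves)."""
    def __init__(self, alpha=0.1, epsilon=0.1):
        self.q = defaultdict(float)     # Q[(my_action, their_action)]
        self.counts = defaultdict(int)  # observed partner actions
        self.alpha = alpha
        self.epsilon = epsilon

    def expected_value(self, a):
        # Expected payoff of my action a, weighting Q by the observed
        # frequency of each partner action.
        total = sum(self.counts.values())
        if total == 0:
            return 0.0
        return sum(self.counts[b] / total * self.q[(a, b)]
                   for b in ACTIONS)

    def act(self):
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=self.expected_value)

    def update(self, my_action, their_action, reward):
        self.counts[their_action] += 1
        key = (my_action, their_action)
        self.q[key] += self.alpha * (reward - self.q[key])

random.seed(0)
a1, a2 = JointActionLearner(), JointActionLearner()
for _ in range(5000):
    x, y = a1.act(), a2.act()
    r = PAYOFF[(x, y)]          # both agents receive the shared reward
    a1.update(x, y, r)
    a2.update(y, x, r)

greedy = (max(ACTIONS, key=a1.expected_value),
          max(ACTIONS, key=a2.expected_value))
print("greedy joint action:", greedy)
```

With heavily penalised miscoordination (as in the matrix above), a plain joint-action learner of this kind often settles on a safe but suboptimal joint action; the paper's contribution is showing how far simple variants of this scheme can go against more complex methods such as OAL [1].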


Keywords: Reinforcement Learning, Joint Action, Multiagent System, Optimal Action, Stochastic Game



  1. Wang, X., Sandholm, T.: Reinforcement learning to play an optimal Nash equilibrium in team Markov games. In: Proceedings of the Sixteenth Conference on Neural Information Processing Systems (2002)
  2. Kapetanakis, S., Kudenko, D.: Reinforcement learning of coordination in cooperative multiagent systems. In: Proceedings of the Eighteenth National Conference on Artificial Intelligence (2002)
  3. Claus, C., Boutilier, C.: Dynamics of reinforcement learning in cooperative multiagent systems. In: Proceedings of the Fifteenth National Conference on Artificial Intelligence (1998)
  4. Chalkiadakis, G., Boutilier, C.: Coordination in multiagent reinforcement learning: A Bayesian approach. In: Proceedings of the Second International Conference on Autonomous Agents and Multiagent Systems (2003)
  5. Littman, M.L.: Markov games as a framework for multi-agent reinforcement learning. In: Proceedings of the Eleventh International Conference on Machine Learning (1994)
  6. Watkins, C.: Learning from Delayed Rewards. PhD thesis, King’s College, Cambridge University (1989)

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Martin Carpenter (1)
  • Daniel Kudenko (2)
  1. School of Informatics, University of Manchester, Manchester
  2. Department of Computer Science, University of York, United Kingdom
