Abstract
We report on an investigation of the learning of coordination in cooperative multi-agent systems. Specifically, we study solutions that are applicable to independent agents, i.e., agents that do not observe one another's actions. In previous research [5], we presented a reinforcement learning approach that converges to the optimal joint action even in scenarios with high miscoordination costs; however, that approach fails in fully stochastic environments. In this paper, we present a novel approach based on reward estimation with a shared action-selection protocol. The new technique is applicable in fully stochastic environments where mutual observation of actions is not possible. We demonstrate empirically that our approach leads the agents to converge almost always to the optimal joint action, even in difficult stochastic scenarios with high miscoordination penalties.
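To make the protocol concrete, the following is a minimal Python sketch of reward estimation under a shared, commonly known commitment schedule. Everything specific in it is an illustrative assumption rather than the paper's algorithm: the payoff matrix, the Gaussian reward noise, the fixed sequence length, and the epsilon-greedy action choice are all invented here, whereas the paper grows its commitment sequences over time and uses a more refined estimation scheme.

    import random

    # Common-payoff game with fully stochastic rewards. The payoff means and
    # noise level below are illustrative assumptions, not the paper's game.
    MEANS = [[11, -30, 0],
             [-30, 7, 6],
             [0, 0, 5]]

    def joint_reward(a, b):
        # Both agents receive the same noisy sample of the joint-action payoff.
        return random.gauss(MEANS[a][b], 5.0)

    class CommittedAgent:
        """Independent learner following a commonly known commitment schedule."""

        def __init__(self, n_actions=3, epsilon=0.1):
            self.n = n_actions
            self.epsilon = epsilon
            self.value = [0.0] * n_actions   # estimated reward per own action
            self.visits = [0] * n_actions
            self.current = 0
            self.buffer = []                 # rewards seen in current sequence

        def start_sequence(self):
            # All agents begin a new commitment sequence at the same time slot,
            # so the (unobserved) joint action stays fixed for the whole
            # sequence even though no agent sees the others' choices.
            if random.random() < self.epsilon:
                self.current = random.randrange(self.n)
            else:
                self.current = max(range(self.n), key=lambda a: self.value[a])
            self.buffer = []
            return self.current

        def observe(self, reward):
            self.buffer.append(reward)

        def end_sequence(self):
            # Averaging within a sequence filters the reward noise for one
            # fixed joint action before the running estimate is updated.
            mean_r = sum(self.buffer) / len(self.buffer)
            a = self.current
            self.visits[a] += 1
            self.value[a] += (mean_r - self.value[a]) / self.visits[a]

    agents = [CommittedAgent(), CommittedAgent()]
    SEQ_LEN = 20   # fixed here for simplicity; the paper grows sequence lengths
    for _ in range(2000):
        actions = [agent.start_sequence() for agent in agents]
        for _ in range(SEQ_LEN):
            r = joint_reward(*actions)
            for agent in agents:
                agent.observe(r)
        for agent in agents:
            agent.end_sequence()

    print("estimates:", [[round(v, 1) for v in a.value] for a in agents])

Synchrony comes for free in this single loop; in the truly decentralised setting, the commonly known schedule of time slots plays that role, which is what makes reward estimation viable without mutual observation of actions.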
References
Boutilier, C.: Sequential optimality and coordination in multiagent systems. In: Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI 1999), pp. 478–485 (1999)
Claus, C., Boutilier, C.: The dynamics of reinforcement learning in cooperative multiagent systems. In: Proceedings of the Fifteenth National Conference on Artificial Intelligence, pp. 746–752 (1998)
Fudenberg, D., Levine, D.K.: The Theory of Learning in Games. MIT Press, Cambridge (1998)
Hu, J., Wellman, M.P.: Multiagent Q-learning. Journal of Machine Learning Research (2002)
Kapetanakis, S., Kudenko, D.: Reinforcement learning of coordination in cooperative multi-agent systems. In: Proceedings of the Eighteenth National Conference on Artificial Intelligence, AAAI 2002 (2002)
Lauer, M., Riedmiller, M.: An algorithm for distributed reinforcement learning in cooperative multi-agent systems. In: Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000) (2000)
Nowé, A., Parent, J., Verbeeck, K.: Social agents playing a periodical policy. In: Proceedings of the Twelfth European Conference on Machine Learning (ECML 2001), Freiburg, Germany (2001)
Peshkin, L., Kim, K.-E., Meuleau, N., Kaelbling, L.: Learning to cooperate via policy search. In: Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (2000)
Sen, S., Sekaran, M., Hale, J.: Learning to coordinate without sharing information. In: Proceedings of the Twelfth National Conference on Artificial Intelligence, Seattle, WA, pp. 426–431 (1994)
Wang, X., Sandholm, T.: Reinforcement learning to play an optimal Nash equilibrium in team Markov games. In: Proceedings of the Sixteenth Conference on Neural Information Processing Systems (NIPS 2002), Vancouver, Canada (2002)
Weiss, G.: Learning to coordinate actions in multi-agent systems. In: Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, vol. 1, pp. 311–316. Morgan Kaufmann Publishers, San Francisco (1993)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kapetanakis, S., Kudenko, D., Strens, M.J.A. (2005). Learning to Coordinate Using Commitment Sequences in Cooperative Multi-agent Systems. In: Kudenko, D., Kazakov, D., Alonso, E. (eds) Adaptive Agents and Multi-Agent Systems II. AAMAS 2004, AAMAS 2003. Lecture Notes in Computer Science, vol. 3394. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-32274-0_7
DOI: https://doi.org/10.1007/978-3-540-32274-0_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25260-3
Online ISBN: 978-3-540-32274-0
eBook Packages: Computer Science (R0)