Learning to Coordinate Using Commitment Sequences in Cooperative Multi-agent Systems

Kapetanakis, Spiros; Kudenko, Daniel; Strens, Malcolm J. A.

doi:10.1007/978-3-540-32274-0_7


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3394)


Abstract

We report on an investigation of the learning of coordination in cooperative multi-agent systems. Specifically, we study solutions that are applicable to independent agents, i.e. agents that do not observe one another’s actions. In previous research [5] we presented a reinforcement learning approach that converges to the optimal joint action even in scenarios with high miscoordination costs. However, this approach failed in fully stochastic environments. In this paper, we present a novel approach based on reward estimation with a shared action-selection protocol. The new technique is applicable in fully stochastic environments where mutual observation of actions is not possible. We demonstrate empirically that our approach causes the agents to converge almost always to the optimal joint action, even in difficult stochastic scenarios with high miscoordination penalties.
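To make the difficulty posed by high miscoordination penalties concrete, the following minimal sketch computes what an independent agent could estimate about its own actions in a small cooperative game. It does not implement the commitment-sequence technique of the paper. The payoff values, the uniformly random partner, and the names `PAYOFF`, `PENALTY`, `independent_estimates` and `optimal_joint_action` are all illustrative assumptions.

```python
# Hypothetical 3x3 cooperative game: both agents receive the same reward.
# All values are illustrative assumptions, not taken from the paper.
PENALTY = -100
PAYOFF = [
    [10, 0, PENALTY],
    [0, 2, 0],
    [PENALTY, 0, 10],
]


def independent_estimates(payoff):
    """Average reward per own action when the partner acts uniformly at random.

    An independent agent cannot observe the partner's action, so an average
    of this kind is all it can estimate from its own (action, reward) samples.
    """
    n = len(payoff[0])
    return [sum(row) / n for row in payoff]


def optimal_joint_action(payoff):
    """Return one (row, column) pair with the highest shared reward."""
    cells = [
        (payoff[i][j], i, j)
        for i in range(len(payoff))
        for j in range(len(payoff[0]))
    ]
    best = max(cells)
    return best[1], best[2]


estimates = independent_estimates(PAYOFF)
# Action the agent would pick if it were greedy on its own averaged estimates.
greedy_action = max(range(len(estimates)), key=estimates.__getitem__)
best_row, best_col = optimal_joint_action(PAYOFF)
```

Under these assumptions the averaged estimates are -30 for the first and third actions and 2/3 for the second, so an agent that is greedy on them prefers the second action, whose best shared reward is 2, while the optimal joint actions yield 10. This is the kind of trap that motivates coordinating the agents' action selection rather than relying on averaged estimates alone.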


References

1. Boutilier, C.: Sequential optimality and coordination in multiagent systems. In: Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI 1999), pp. 478–485 (1999)
2. Claus, C., Boutilier, C.: The dynamics of reinforcement learning in cooperative multiagent systems. In: Proceedings of the Fifteenth National Conference on Artificial Intelligence, pp. 746–752 (1998)
3. Fudenberg, D., Levine, D.K.: The Theory of Learning in Games. MIT Press, Cambridge (1998)
4. Hu, J., Wellman, M.P.: Multiagent Q-learning. Machine Learning Research (2002)
5. Kapetanakis, S., Kudenko, D.: Reinforcement learning of coordination in cooperative multi-agent systems. In: Proceedings of the Eighteenth National Conference on Artificial Intelligence, AAAI 2002 (2002)
6. Lauer, M., Riedmiller, M.: An algorithm for distributed reinforcement learning in cooperative multi-agent systems. In: Proceedings of the Seventeenth International Conference on Machine Learning (2000)
7. Nowé, A., Parent, J., Verbeeck, K.: Social agents playing a periodical policy. In: Proceedings of the 12th European Conference on Machine Learning, Freiburg, Germany (2001)
8. Peshkin, L., Kim, K.-E., Meuleau, N., Kaelbling, L.: Learning to cooperate via policy search. In: Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (2000)
9. Sen, S., Sekaran, M., Hale, J.: Learning to coordinate without sharing information. In: Proceedings of the Twelfth National Conference on Artificial Intelligence, Seattle, WA, pp. 426–431 (1994)
10. Wang, X., Sandholm, T.: Reinforcement learning to play an optimal Nash equilibrium in team Markov games. In: Proceedings of the 16th Neural Information Processing Systems: Natural and Synthetic (NIPS) conference, Vancouver, Canada (2002)
11. Weiss, G.: Learning to coordinate actions in multi-agent systems. In: Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, vol. 1, pp. 311–316. Morgan Kaufmann Publ., San Francisco (1993)


Author information

Authors and Affiliations

Department of Computer Science, University of York, Heslington, York, YO10 5DD, UK
Spiros Kapetanakis & Daniel Kudenko
Guidance and Imaging Solutions, QinetiQ, Ively Road, Farnborough, Hampshire, GU14 0LX, UK
Malcolm J. A. Strens


Editor information

Editors and Affiliations

Department of Computer Science, University of York, YO10 5DD, York, UK
Daniel Kudenko
Artificial Intelligence Group, Department of Computer Science, University of York, Heslington, York, UK
Dimitar Kazakov
Department of Computing, City University, P.O. Box, EC1V 0HB, London, United Kingdom
Eduardo Alonso


About this paper

Cite this paper

Kapetanakis, S., Kudenko, D., Strens, M.J.A. (2005). Learning to Coordinate Using Commitment Sequences in Cooperative Multi-agent Systems. In: Kudenko, D., Kazakov, D., Alonso, E. (eds) Adaptive Agents and Multi-Agent Systems II. AAMAS 2004, AAMAS 2003. Lecture Notes in Computer Science (LNAI), vol 3394. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-32274-0_7


DOI: https://doi.org/10.1007/978-3-540-32274-0_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25260-3
Online ISBN: 978-3-540-32274-0
eBook Packages: Computer Science, Computer Science (R0)
