Abstract
Multi-agent systems draw together a number of significant trends in modern technology: ubiquity, decentralisation, openness, dynamism and uncertainty. As these trends develop, such systems face increasing challenges. Two particular challenges are decision making in uncertain and partially-observable environments, and coordination with other agents in such environments. Although uncertainty and coordination have been tackled as separate problems, formal models for an integrated approach are typically restricted to simple classes of problems and do not scale to problems with many agents and millions of states. We improve on these approaches by extending a principled Bayesian model into more challenging domains, using heuristics and exploiting domain knowledge in order to make approximate solutions tractable. We show the effectiveness of our approach on an ambulance coordination problem inspired by the RoboCup Rescue system.
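The abstract refers to Bayesian decision making in uncertain, partially-observable environments. As a point of reference only, the sketch below shows the standard discrete Bayes filter that underlies belief maintenance in such settings; the array names, shapes and toy numbers are illustrative assumptions, not the model developed in the chapter.

```python
import numpy as np

def belief_update(belief, action, observation, T, O):
    """One discrete Bayes filter step for a POMDP-style belief.

    belief      : array (n_states,), current distribution over hidden states
    action      : int, index of the action just taken
    observation : int, index of the observation just received
    T           : array (n_actions, n_states, n_states), T[a, s, s'] = P(s' | s, a)
    O           : array (n_actions, n_states, n_obs),    O[a, s', o] = P(o | s', a)
    """
    predicted = belief @ T[action]                   # predict: sum_s b(s) P(s' | s, a)
    updated = predicted * O[action][:, observation]  # correct: weight by P(o | s', a)
    return updated / updated.sum()                   # renormalise to a distribution

# Toy usage (illustrative numbers): two hidden states, one action, two observations.
T = np.array([[[0.9, 0.1], [0.2, 0.8]]])
O = np.array([[[0.8, 0.2], [0.3, 0.7]]])
b = np.array([0.5, 0.5])
b = belief_update(b, action=0, observation=1, T=T, O=O)
print(b)  # posterior belief after acting and observing
```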
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
Cite this chapter
Allen-Williams, M., Jennings, N.R. (2009). Bayesian Learning for Cooperation in Multi-Agent Systems. In: Mumford, C.L., Jain, L.C. (eds) Computational Intelligence. Intelligent Systems Reference Library, vol 1. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01799-5_10
Print ISBN: 978-3-642-01798-8
Online ISBN: 978-3-642-01799-5
eBook Packages: Engineering