Abstract
Multi-agent systems draw together a number of significant trends in modern technology: ubiquity, decentralisation, openness, dynamism and uncertainty. As these trends develop, such systems face increasing challenges. Two particular challenges are decision making in uncertain and partially-observable environments, and coordination with other agents in such environments. Although uncertainty and coordination have been tackled as separate problems, formal models for an integrated approach are typically restricted to simple classes of problems and do not scale to problems with many agents and millions of states. We improve on these approaches by extending a principled Bayesian model into more challenging domains, using heuristics and exploiting domain knowledge in order to make approximate solutions tractable. We show the effectiveness of our approach on an ambulance coordination problem inspired by the RoboCup Rescue system.
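The abstract refers to Bayesian decision making in uncertain, partially-observable environments. As a point of reference only, the sketch below shows the standard discrete Bayes filter that underlies belief maintenance in such settings; the array names, shapes and toy numbers are illustrative assumptions, not the model developed in the chapter.

```python
import numpy as np

def belief_update(belief, action, observation, T, O):
    """One discrete Bayes filter step for a POMDP-style belief.

    belief      : array (n_states,), current distribution over hidden states
    action      : int, index of the action just taken
    observation : int, index of the observation just received
    T           : array (n_actions, n_states, n_states), T[a, s, s'] = P(s' | s, a)
    O           : array (n_actions, n_states, n_obs),    O[a, s', o] = P(o | s', a)
    """
    predicted = belief @ T[action]                   # predict: sum_s b(s) P(s' | s, a)
    updated = predicted * O[action][:, observation]  # correct: weight by P(o | s', a)
    return updated / updated.sum()                   # renormalise to a distribution

# Toy usage (illustrative numbers): two hidden states, one action, two observations.
T = np.array([[[0.9, 0.1], [0.2, 0.8]]])
O = np.array([[[0.8, 0.2], [0.3, 0.7]]])
b = np.array([0.5, 0.5])
b = belief_update(b, action=0, observation=1, T=T, O=O)
print(b)  # posterior belief after acting and observing
```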
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
Cite this chapter
Allen-Williams, M., Jennings, N.R. (2009). Bayesian Learning for Cooperation in Multi-Agent Systems. In: Mumford, C.L., Jain, L.C. (eds) Computational Intelligence. Intelligent Systems Reference Library, vol 1. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01799-5_10
Print ISBN: 978-3-642-01798-8
Online ISBN: 978-3-642-01799-5
eBook Packages: Engineering