Q-Learning Policies for Multi-Agent Foraging Task

M., Yogeswaran; S.G., Ponnambalam

doi:10.1007/978-3-642-15810-0_25

Part of the book series: Communications in Computer and Information Science (CCIS, volume 103)

Included in the following conference series:

FIRA RoboWorld Congress

Abstract

The trade-off between exploitation and exploration in multi-agent learning has been a crucial area of research for the past few decades. A proper learning policy is necessary to address this trade-off so that agents can react rapidly and adapt in a dynamic environment. A family of core learning policies suitable for the non-stationary multi-agent foraging task modeled in this paper was identified in the open literature. The model is used to compare and contrast the identified learning policies, namely greedy, ε-greedy, and Boltzmann distribution. A simple random search is also included to justify the convergence of Q-learning. A number of simulation-based experiments were conducted, and the performance of the learning policies is discussed on the basis of the numerical results obtained.
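For readers unfamiliar with the policies named in the abstract, the following is a minimal sketch of how greedy, ε-greedy, and Boltzmann action selection are commonly defined over a tabular Q-function, together with the standard one-step Q-learning update and a uniform random baseline. It is not the authors' implementation: the function names, the parameters `epsilon`, `temperature`, `alpha`, and `gamma`, and their default values are assumptions chosen only for illustration.

```python
import math
import random


def random_search(q_values, rng=random):
    """Baseline: ignore the Q-values and pick an action uniformly at random."""
    return rng.randrange(len(q_values))


def greedy(q_values):
    """Pure exploitation: pick the highest-valued action (first one on ties)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore uniformly with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy(q_values)


def boltzmann_probabilities(q_values, temperature):
    """Softmax over Q-values; a higher temperature gives a flatter distribution."""
    q_max = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann(q_values, temperature, rng=random):
    """Sample an action in proportion to its Boltzmann probability."""
    probs = boltzmann_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update (alpha and gamma are assumed values)."""
    best_next = max(q_table[next_state])
    td_error = reward + gamma * best_next - q_table[state][action]
    q_table[state][action] += alpha * td_error
```

In this sketch, `epsilon_greedy` with `epsilon=0` reduces to `greedy`, and `boltzmann` approaches uniform random selection as `temperature` grows, which is one way to see the three policies as points on an exploration-exploitation spectrum.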

Author information

Authors and Affiliations

School of Engineering, Monash University, Sunway campus, 46150, Petaling Jaya, Selangor, Malaysia
Yogeswaran M. & Ponnambalam S.G.

Editor information

Editors and Affiliations

National University of Singapore, Singapore
Prahlad Vadakkepat & Tan Kok Kiong
Dept. of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Jong-Hwan Kim
Technische Universität Dortmund, Dortmund, Germany
Norbert Jesse
Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, 117576, Singapore
Abdullah Al Mamun
Department of Computer Science, University of Manitoba, R3T 2N2, Winnipeg, Manitoba, Canada
Jacky Baltes & John Anderson
Dept. of Education in Technology & Science, Technion - Israel Institute of Technology, 32000, Haifa, Israel
Igor Verner
Trinity College, Hartford, CT, USA
David Ahlgren

About this paper

Cite this paper

Yogeswaran M., Ponnambalam S.G. (2010). Q-Learning Policies for Multi-Agent Foraging Task. In: Vadakkepat, P., et al. Trends in Intelligent Robotics. FIRA 2010. Communications in Computer and Information Science, vol 103. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15810-0_25

DOI: https://doi.org/10.1007/978-3-642-15810-0_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15809-4
Online ISBN: 978-3-642-15810-0
eBook Packages: Computer Science, Computer Science (R0)
