
Learning coordinated behavior in a continuous environment

  • Learning, Cooperation and Competition
  • Conference paper
In: Distributed Artificial Intelligence Meets Machine Learning: Learning in Multi-Agent Environments (LDAIS 1996, LIOME 1996)

Abstract

Interesting efforts have been made to let multiple agents learn to interact appropriately, using various reinforcement-learning algorithms. In most of these cases, however, the state space of each agent is assumed to be discrete, and it remains unclear how effectively multiple reinforcement-learning agents can acquire appropriate coordinated behavior in continuous state spaces. The objective of this research is to explore the applicability of Q-learning in multi-agent continuous environments when it is combined with a generalization technique based on the CMAC. We consider a modified version of the multi-agent block pushing problem, in which two learning agents interact in a continuous environment to accomplish a common goal. To allow each agent to handle two-dimensional vector-valued inputs, we apply a CMAC-based Q-learning algorithm, a variant of L.-J. Lin's QCON algorithm. The aim is to incrementally elaborate a set of CMACs that approximates the action-value function under an optimal policy for the learning agent. The performance of our block pushing CMAC-based Q-learning agents is evaluated quantitatively and qualitatively through simulation runs. Although the task is not intended to model any particular real-world problem, the results are encouraging.
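
To make the learning architecture concrete, here is a minimal sketch of a QCON-style agent that uses a CMAC (tile coding) to approximate the action-value function over a continuous two-dimensional state. This is an illustration under stated assumptions, not the authors' implementation: the class names, the [0, 1]^2 state range, and all hyperparameters (eight offset tilings of 8 x 8 tiles, the learning rate, discount factor, and exploration rate) are hypothetical. Following Lin's QCON idea, one approximator is kept per discrete action, and a standard one-step Q-learning update adjusts the weights of the active tiles.

```python
import numpy as np

class CMAC:
    """One CMAC over [0, 1]^2: several offset grid tilings; the value of a
    state is the sum of the weights of the active tile in each tiling.
    (All hyperparameter defaults here are illustrative assumptions.)"""

    def __init__(self, n_tilings=8, n_tiles=8, lr=0.1):
        self.n_tilings = n_tilings
        self.n_tiles = n_tiles
        self.lr = lr / n_tilings  # spread the step size across tilings
        self.w = np.zeros((n_tilings, n_tiles, n_tiles))

    def _active(self, state):
        """Yield the index of the active tile in each (offset) tiling."""
        for t in range(self.n_tilings):
            off = t / (self.n_tilings * self.n_tiles)
            x = int(np.clip((state[0] + off) * self.n_tiles, 0, self.n_tiles - 1))
            y = int(np.clip((state[1] + off) * self.n_tiles, 0, self.n_tiles - 1))
            yield t, x, y

    def value(self, state):
        return sum(self.w[t, x, y] for t, x, y in self._active(state))

    def update(self, state, target):
        """Move the active weights toward a scalar TD target."""
        err = target - self.value(state)
        for t, x, y in self._active(state):
            self.w[t, x, y] += self.lr * err

class CMACQAgent:
    """Independent Q-learner with one CMAC per discrete action (QCON-style)."""

    def __init__(self, n_actions, gamma=0.9, eps=0.1):
        self.nets = [CMAC() for _ in range(n_actions)]
        self.gamma, self.eps = gamma, eps

    def act(self, state, rng=np.random):
        if rng.random() < self.eps:  # epsilon-greedy exploration
            return rng.randint(len(self.nets))
        return int(np.argmax([n.value(state) for n in self.nets]))

    def learn(self, s, a, r, s_next, done):
        # One-step Q-learning target: r + gamma * max_a' Q(s', a')
        target = r if done else r + self.gamma * max(n.value(s_next) for n in self.nets)
        self.nets[a].update(s, target)
```

In the block pushing task described above, each of the two agents would run such a learner independently, taking the continuous configuration of the block and goal (scaled into the unit square) as its state input.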


References

  1. Albus, J.S.: Brains, Behavior, and Robotics, Byte Books, Chapter 6, pp. 139–179, 1981.

  2. Drogoul, A., J. Ferber, B. Corbara, and D. Fresneau: A Behavioral Simulation Model for the Study of Emergent Social Structures, in F.J. Varela et al. (Eds.): Toward a Practice of Autonomous Systems: Proc. of the First European Conference on Artificial Life, The MIT Press, 1991.

  3. Lin, L.-J.: Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching, Machine Learning, Vol. 8, 1992.

  4. Ono, N., and A.T. Rahmani: Self-Organization of Communication in Distributed Learning Classifier Systems, in R.F. Albrecht et al. (Eds.): Artificial Neural Nets and Genetic Algorithms: Proc. of the International Conference on Artificial Neural Nets and Genetic Algorithms, Springer-Verlag, Wien/New York, 1993.

  5. Ono, N., T. Ohira, and A.T. Rahmani: Emergent Organization of Interspecies Communication in Q-learning Artificial Organisms, in F. Morán et al. (Eds.): Advances in Artificial Life: Proc. of the 3rd European Conference on Artificial Life, Springer, 1995.

  6. Ono, N., and K. Fukumoto: Collective Behavior by Modular Reinforcement-Learning Animats, in P. Maes et al. (Eds.): From Animals to Animats 4: Proc. of the 4th International Conference on Simulation of Adaptive Behavior, The MIT Press, 1996.

  7. Ono, N., and K. Fukumoto: Multi-agent Reinforcement Learning: A Modular Approach, Proc. of the 2nd International Conference on Multi-agent Systems, AAAI Press, 1996.

  8. Sen, S., M. Sekaran, and J. Hale: Learning to Coordinate without Sharing Information, Proc. of AAAI-94, 1994.

  9. Sen, S., and M. Sekaran: Multiagent Coordination with Learning Classifier Systems, in G. Weiß and S. Sen (Eds.): Adaption and Learning in Multi-agent Systems, Springer, 1996.

  10. Tan, M.: Multi-agent Reinforcement Learning: Independent vs. Cooperative Agents, Proc. of the 10th International Conference on Machine Learning, 1993.

  11. Yanco, H., and L.A. Stein: An Adaptive Communication Protocol for Cooperating Mobile Robots, in J.-A. Meyer et al. (Eds.): From Animals to Animats 2: Proc. of the 2nd International Conference on Simulation of Adaptive Behavior, The MIT Press, 1992.

  12. Watkins, C.J.C.H.: Learning from Delayed Rewards, Ph.D. thesis, Cambridge University, 1989.


Author information

N. Ono and Y. Fukuta

Editor information

Gerhard Weiß


Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ono, N., Fukuta, Y. (1997). Learning coordinated behavior in a continuous environment. In: Weiß, G. (ed.) Distributed Artificial Intelligence Meets Machine Learning: Learning in Multi-Agent Environments. LDAIS/LIOME 1996. Lecture Notes in Computer Science, vol 1221. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-62934-3_42


  • DOI: https://doi.org/10.1007/3-540-62934-3_42


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-62934-4

  • Online ISBN: 978-3-540-69050-4

