
Learning coordinated behavior in a continuous environment

  • Learning, Cooperation and Competition
  • Conference paper
In: Distributed Artificial Intelligence Meets Machine Learning: Learning in Multi-Agent Environments (LDAIS 1996, LIOME 1996)

Abstract

Interesting efforts have been made to let multiple agents learn to interact appropriately, using various reinforcement-learning algorithms. In most of these cases, however, the state space of each agent is assumed to be discrete, and it remains unclear how effectively multiple reinforcement-learning agents can acquire appropriate coordinated behavior in continuous state spaces. The objective of this research is to explore the applicability of Q-learning in multi-agent continuous environments when it is combined with a generalization technique based on the CMAC. We consider a modified version of the multi-agent block pushing problem, in which two learning agents interact in a continuous environment to accomplish a common goal. To allow each agent to handle two-dimensional vector-valued inputs, we apply a CMAC-based Q-learning algorithm, a variant of L.-J. Lin's QCON algorithm. The aim is to incrementally elaborate a set of CMACs that approximates the action-value function under an optimal policy for the learning agent. The performance of our block pushing CMAC-based Q-learning agents is evaluated quantitatively and qualitatively through simulation runs. Although the task is not intended to model any particular real-world problem, the results are encouraging.
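
To make the learning architecture concrete, here is a minimal sketch of a QCON-style agent that uses a CMAC (tile coding) to approximate the action-value function over a continuous two-dimensional state. This is an illustration under stated assumptions, not the authors' implementation: the class names, the [0, 1]^2 state range, and all hyperparameters (eight offset tilings of 8 x 8 tiles, the learning rate, discount factor, and exploration rate) are hypothetical. Following Lin's QCON idea, one approximator is kept per discrete action, and a standard one-step Q-learning update adjusts the weights of the active tiles.

```python
import numpy as np

class CMAC:
    """One CMAC over [0, 1]^2: several offset grid tilings; the value of a
    state is the sum of the weights of the active tile in each tiling.
    (All hyperparameter defaults here are illustrative assumptions.)"""

    def __init__(self, n_tilings=8, n_tiles=8, lr=0.1):
        self.n_tilings = n_tilings
        self.n_tiles = n_tiles
        self.lr = lr / n_tilings  # spread the step size across tilings
        self.w = np.zeros((n_tilings, n_tiles, n_tiles))

    def _active(self, state):
        """Yield the index of the active tile in each (offset) tiling."""
        for t in range(self.n_tilings):
            off = t / (self.n_tilings * self.n_tiles)
            x = int(np.clip((state[0] + off) * self.n_tiles, 0, self.n_tiles - 1))
            y = int(np.clip((state[1] + off) * self.n_tiles, 0, self.n_tiles - 1))
            yield t, x, y

    def value(self, state):
        return sum(self.w[t, x, y] for t, x, y in self._active(state))

    def update(self, state, target):
        """Move the active weights toward a scalar TD target."""
        err = target - self.value(state)
        for t, x, y in self._active(state):
            self.w[t, x, y] += self.lr * err

class CMACQAgent:
    """Independent Q-learner with one CMAC per discrete action (QCON-style)."""

    def __init__(self, n_actions, gamma=0.9, eps=0.1):
        self.nets = [CMAC() for _ in range(n_actions)]
        self.gamma, self.eps = gamma, eps

    def act(self, state, rng=np.random):
        if rng.random() < self.eps:  # epsilon-greedy exploration
            return rng.randint(len(self.nets))
        return int(np.argmax([n.value(state) for n in self.nets]))

    def learn(self, s, a, r, s_next, done):
        # One-step Q-learning target: r + gamma * max_a' Q(s', a')
        target = r if done else r + self.gamma * max(n.value(s_next) for n in self.nets)
        self.nets[a].update(s, target)
```

In the block pushing task described above, each of the two agents would run such a learner independently, taking the continuous configuration of the block and goal (scaled into the unit square) as its state input.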


References

  1. Albus, J.S.: Brains, Behavior, and Robotics, Byte Books, Chapter 6, pp. 139–179, 1981.

  2. Drogoul, A., J. Ferber, B. Corbara, and D. Fresneau: A Behavioral Simulation Model for the Study of Emergent Social Structures, in F.J. Varela et al. (Eds.): Toward a Practice of Autonomous Systems: Proc. of the First European Conference on Artificial Life, The MIT Press, 1991.

  3. Lin, L.-J.: Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching, Machine Learning, Vol. 8, 1992.

  4. Ono, N., and A.T. Rahmani: Self-Organization of Communication in Distributed Learning Classifier Systems, in R.F. Albrecht et al. (Eds.): Artificial Neural Nets and Genetic Algorithms: Proc. of the International Conference on Artificial Neural Nets and Genetic Algorithms, Springer-Verlag, Wien/New York, 1993.

  5. Ono, N., T. Ohira, and A.T. Rahmani: Emergent Organization of Interspecies Communication in Q-learning Artificial Organisms, in F. Morán et al. (Eds.): Advances in Artificial Life: Proc. of the 3rd European Conference on Artificial Life, Springer, 1995.

  6. Ono, N., and K. Fukumoto: Collective Behavior by Modular Reinforcement-Learning Animats, in P. Maes et al. (Eds.): From Animals to Animats 4: Proc. of the 4th International Conference on Simulation of Adaptive Behavior, The MIT Press, 1996.

  7. Ono, N., and K. Fukumoto: Multi-agent Reinforcement Learning: A Modular Approach, Proc. of the 2nd International Conference on Multi-agent Systems, AAAI Press, 1996.

  8. Sen, S., M. Sekaran, and J. Hale: Learning to Coordinate without Sharing Information, Proc. of AAAI-94, 1994.

  9. Sen, S., and M. Sekaran: Multiagent Coordination with Learning Classifier Systems, in G. Weiß and S. Sen (Eds.): Adaption and Learning in Multi-agent Systems, Springer, 1996.

  10. Tan, M.: Multi-agent Reinforcement Learning: Independent vs. Cooperative Agents, Proc. of the 10th International Conference on Machine Learning, 1993.

  11. Yanco, H., and L.A. Stein: An Adaptive Communication Protocol for Cooperating Mobile Robots, in J.-A. Meyer et al. (Eds.): From Animals to Animats 2: Proc. of the 2nd International Conference on Simulation of Adaptive Behavior, The MIT Press, 1992.

  12. Watkins, C.J.C.H.: Learning from Delayed Rewards, Ph.D. thesis, Cambridge University, 1989.


Author information

N. Ono and Y. Fukuta

Editor information

Gerhard Weiß


Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ono, N., Fukuta, Y. (1997). Learning coordinated behavior in a continuous environment. In: Weiß, G. (ed.) Distributed Artificial Intelligence Meets Machine Learning: Learning in Multi-Agent Environments. LDAIS/LIOME 1996. Lecture Notes in Computer Science, vol 1221. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-62934-3_42


  • DOI: https://doi.org/10.1007/3-540-62934-3_42


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-62934-4

  • Online ISBN: 978-3-540-69050-4

