Reinforcement Learning of a Six-Legged Robot to Walk and Avoid Obstacles

  • Anne Johannet
  • Isabelle Sarda
Part of the Studies in Fuzziness and Soft Computing book series (STUDFUZZ, volume 21)


The high potential of walking machines for off-road or hostile-environment mobility necessitates an adaptive and versatile control system, in order to avoid the difficulties of modelling complex and unpredictable behaviour.


Keywords: Reinforcement Learning, Soft Computing, Obstacle Avoidance, Reinforcement Signal, Autonomous Mobile Robot





Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Anne Johannet (1)
  • Isabelle Sarda (1)
  1. LGI2P, EMA-EERIE, Nîmes, France
