A Cat-Like Robot Real-Time Learning to Run

Wawrzyński, Paweł

doi:10.1007/978-3-642-04921-7_39

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5495)

Included in the following conference series:

International Conference on Adaptive and Natural Computing Algorithms

Abstract

Actor-Critics constitute an important class of reinforcement learning algorithms that can deal with continuous actions and states in an easy and natural way. In their original, sequential form, these algorithms are usually too slow to be applicable to real-life problems. However, they can be augmented by the technique of experience replay to obtain a satisfactory speed of learning without degrading their convergence properties. In this paper, experimental results are presented which show that combining experience replay with Actor-Critics yields very fast learning algorithms that achieve successful policies for nontrivial control tasks in a considerably short time. Namely, a policy for a model of a 6-degree-of-freedom walking robot is obtained after 4 hours of the robot’s time.
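
To make the idea in the abstract concrete, the following is a minimal sketch of an actor-critic update that reuses stored transitions through experience replay. It is not the algorithm evaluated in the paper. The scalar state, the linear actor and critic, the Gaussian policy, the cap on the importance weight, and every name (`ReplayBuffer`, `truncated_weight`, `replay_update`) are assumptions chosen for illustration.

```python
import math
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions (hypothetical helper)."""

    def __init__(self, capacity):
        # The oldest transitions are discarded once capacity is reached.
        self.data = deque(maxlen=capacity)

    def __len__(self):
        return len(self.data)

    def add(self, state, action, reward, next_state, behavior_density):
        # behavior_density: density of `action` under the policy that chose it.
        self.data.append((state, action, reward, next_state, behavior_density))

    def sample(self, rng):
        # rng is a random.Random instance; draws one stored transition.
        return rng.choice(self.data)


def gaussian_density(action, mean, sigma):
    """Density of a scalar action under a normal policy N(mean, sigma^2)."""
    z = (action - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def truncated_weight(current_density, behavior_density, cap):
    """Importance weight of an old action under the current policy, capped.

    Capping is one assumed way to keep replayed updates bounded.
    """
    return min(current_density / behavior_density, cap)


def replay_update(theta, w, transition, sigma=0.5, gamma=0.9,
                  actor_lr=0.01, critic_lr=0.05, cap=2.0):
    """One actor-critic step on a stored transition.

    Assumed model: actor mean = theta * state, critic V(state) = w * state.
    """
    state, action, reward, next_state, behavior_density = transition
    mean = theta * state
    rho = truncated_weight(gaussian_density(action, mean, sigma),
                           behavior_density, cap)
    # Temporal-difference error computed with the current critic.
    delta = reward + gamma * w * next_state - w * state
    # Gradient of log N(action; theta * state, sigma^2) with respect to theta.
    grad_log_pi = (action - mean) / sigma ** 2 * state
    theta = theta + actor_lr * rho * delta * grad_log_pi
    w = w + critic_lr * rho * delta * state
    return theta, w
```

In a control loop under these assumptions, each real interaction with the robot would be stored with `add`, followed by several calls to `replay_update` on transitions drawn with `sample`. This is how replay extracts more learning from each unit of the robot's time.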

Author information

Authors and Affiliations

Warsaw University of Technology, Poland
Paweł Wawrzyński

Editor information

Editors and Affiliations

Department of Environmental Sciences, University of Kuopio, PO Box 1627, FIN-70211, Kuopio, Finland
Mikko Kolehmainen
Department of Computer Science, University of Kuopio, P.O.Box 1627, 70211, Kuopio, Finland
Pekka Toivanen
Institute of Control and Industrial Electronics, Warsaw University of Technology, ul. Koszykowa 75, 00-662, Warszawa, Poland
Bartlomiej Beliczynski

About this paper

Cite this paper

Wawrzyński, P. (2009). A Cat-Like Robot Real-Time Learning to Run. In: Kolehmainen, M., Toivanen, P., Beliczynski, B. (eds) Adaptive and Natural Computing Algorithms. ICANNGA 2009. Lecture Notes in Computer Science, vol 5495. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04921-7_39

DOI: https://doi.org/10.1007/978-3-642-04921-7_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04920-0
Online ISBN: 978-3-642-04921-7
eBook Packages: Computer Science, Computer Science (R0)
