Evolution Strategies for Direct Policy Search

  • Verena Heidrich-Meisner
  • Christian Igel
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5199)

Abstract

The covariance matrix adaptation evolution strategy (CMA-ES) is suggested for solving problems described by Markov decision processes. The algorithm is compared with a state-of-the-art policy gradient method and stochastic search on the double cart-pole balancing task using linear policies. The CMA-ES proves to be much more robust than the gradient-based approach in this scenario.
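For illustration only (this is not the authors' implementation), the sketch below shows how the CMA-ES can drive direct policy search over the parameters of a linear policy. It assumes the pycma package (`cma`) and replaces the double cart-pole simulation with a hypothetical placeholder `episode_return`; since CMA-ES minimizes, the negative episode return is passed as fitness.

    # Minimal sketch of direct policy search with CMA-ES.
    # Assumes the `cma` package (pycma); the benchmark dynamics are replaced by a placeholder.
    import numpy as np
    import cma

    STATE_DIM = 6  # double cart-pole: cart position/velocity, two pole angles/velocities

    def linear_policy(theta, state):
        # Linear policy: applied force is the inner product of parameters and state.
        return float(np.dot(theta, state))

    def episode_return(theta, horizon=1000):
        # Hypothetical placeholder rollout standing in for the double cart-pole task.
        # The reward below is a toy surrogate, NOT the benchmark's reward function.
        rng = np.random.default_rng(0)
        total = 0.0
        for _ in range(horizon):
            state = rng.standard_normal(STATE_DIM)
            action = linear_policy(theta, state)
            total += 1.0 - min(abs(action), 1.0)
        return total

    # CMA-ES over the policy parameters: initial mean 0, initial step size 1.0.
    es = cma.CMAEvolutionStrategy(STATE_DIM * [0.0], 1.0, {'popsize': 10, 'maxiter': 50})
    while not es.stop():
        candidates = es.ask()                                   # sample candidate policies
        fitnesses = [-episode_return(c) for c in candidates]    # CMA-ES minimizes
        es.tell(candidates, fitnesses)                          # adapt mean, step size, covariance
    print('best policy parameters:', es.result.xbest)

The ask/tell loop is the generic CMA-ES interface: candidate policy parameter vectors are sampled from the current search distribution, each is evaluated by (noisy) episode returns, and the distribution is updated from the ranked candidates.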

Keywords

  • Evolution Strategy
  • Reinforcement Learning
  • Markov Decision Process
  • Random Weight
  • Policy Parameter

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Verena Heidrich-Meisner (1)
  • Christian Igel (1)

  1. Institut für Neuroinformatik, Ruhr-Universität Bochum, Germany
