Scalable Neuroevolution for Reinforcement Learning

Gomez, Faustino

doi:10.1007/978-3-642-33860-1_3

Faustino Gomez¹⁸

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 7505))

Included in the following conference series:

International Conference on Theory and Practice of Natural Computing

835 Accesses

Abstract

The idea of using evolutionary computation to train artificial neural networks, or neuroevolution (NE), has now been around for over 20 years. The main appeal of this approach is that, because it does not rely on gradient information (e.g. backpropagation), it can potentially harness the universal function approximation capability of neural networks to solve reinforcement learning (RL) tasks, where there is no “teacher” (i.e. no targets or examples of correct behavior). Instead of incrementally adjusting the synaptic weights of a single network, the space of network parameters is searched directly according to principles inspired by natural selection: (1) encode a population of networks as strings, or genomes, (2) transform them into networks, (3) evaluate them on the task, (4) generate new, hopefully better, nets by recombining those that are most “fit”, (5) goto step 2 until a solution is found. By evolving neural networks, NE can cope naturally with tasks that have continuous inputs and outputs, and, by evolving networks with feedback connections (recurrent networks), it can tackle more general tasks that require memory.

Early work in the field focused on evolving rather small networks (hundreds of weights) for RL benchmarks, and control problems with relatively few inputs/outputs. However, as RL tasks become more challenging, the networks required become larger, as do their genomes. The result is that scaling NE to large nets (i.e. tens of thousands of weights) is infeasible using a straightforward, direct encoding where genes map one-to-one to network components. Therefore, recent efforts have focused on indirect encodings [3,7,8,11,12,14] where the genome is translated into a network by a more complex mapping that allows small genomes to represent nets of potentially arbitrary size.

At IDSIA, in addition to state of the art direct methods [1,4,6,15,16], we have developed a novel indirect encoding scheme where networks are encoded by a set of Fourier coefficients which are converted into network weight matrices via an inverse Fourier-type transform [5, 9, 10, 13]. Since there often exist network solutions whose weight matrices contain regularity (i.e. adjacent weights are correlated), the number of coefficients required to represent these networks in the frequency domain is much smaller than the number of weights (in the same way that natural images can be compressed by ignore high-frequency components). This “compressed” encoding can reduce the search-space dimensionality by as much as two orders of magnitude, both accelerating convergence and yielding more general solutions.

We have also explored an entirely different approach where, instead of trying to efficiently search the space of compactly-encoded, large networks, NE is scaled by combining it with unsupervised learning (UL) [2]. Here, the standard NE architecture is augmented by a UL module that learns to transform high-dimensional inputs (e.g. video) generated by the controllers being evaluated, into a low-dimensional code that they use instead of the raw input. Because the number of inputs becomes small after the UL dimensionality reduction, the controllers can be small as well. Research is currently underway combining these two approaches (compressed nets fed by a small number UL features), to discover controllers for real robot manipulation using stereo vision input.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 54.99; Price excludes VAT (USA)

Softcover Book: USD 72.00; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Cuccu, G., Gomez, F.: Block Diagonal Natural Evolution Strategies. In: Coello, C.A.C., Cutello, V., Deb, K., Forrest, S., Nicosia, G., Pavone, M. (eds.) PPSN XII, Part II. LNCS, vol. 7492, pp. 488–497. Springer, Heidelberg (2012)
Chapter Google Scholar
Cuccu, G., Luciw, M., Schmidhuber, J., Gomez, F.: Intrinsically motivated evolutionary search for vision-based reinforcement learning. In: Proceedings of the IEEE Conference on Development and Learning, and Epigenetic Robotics (2011)
Google Scholar
Dürr, P., Mattiussi, C., Floreano, D.: Neuroevolution with Analog Genetic Encoding. In: Runarsson, T.P., Beyer, H.-G., Burke, E.K., Merelo-Guervós, J.J., Whitley, L.D., Yao, X. (eds.) PPSN IX. LNCS, vol. 4193, pp. 671–680. Springer, Heidelberg (2006)
Chapter Google Scholar
Glasmachers, T., Schaul, T., Sun, Y., Wierstra, D., Schmidhuber, J.: Exponential Natural Evolution Strategies. In: Genetic and Evolutionary Computation Conference (GECCO), Portland, OR (2010)
Google Scholar
Gomez, F., Koutník, J., Schmidhuber, J.: Compressed Network Complexity Search. In: Coello, C.A.C., Cutello, V., Deb, K., Forrest, S., Nicosia, G., Pavone, M. (eds.) PPSN XII, Part I. LNCS, vol. 7491, pp. 316–326. Springer, Heidelberg (2012)
Chapter Google Scholar
Gomez, F., Schmidhuber, J., Miikkulainen, R.: Accelerated neural evolution through cooperatively coevolved synapses. Journal of Machine Learning Research 9, 937–965 (2008)
MathSciNet MATH Google Scholar
Gruau, F.: Cellular encoding of genetic neural networks. Tech. Rep. RR-92-21, Ecole Normale Superieure de Lyon, Institut IMAG, Lyon, France (1992)
Google Scholar
Kitano, H.: Designing neural networks using genetic algorithms with graph generation system. Complex Systems 4, 461–476 (1990)
MATH Google Scholar
Koutník, J., Gomez, F., Schmidhuber, J.: Evolving neural networks in compressed weight space. In: Proceedings of the Conference on Genetic and Evolutionary Computation, GECCO 2010 (2010)
Google Scholar
Koutník, J., Gomez, F., Schmidhuber, J.: Searching for minimal neural networks in fourier space. In: Proceedings of the 4th Annual Conference on Artificial General Intelligence, pp. 61–66 (2010)
Google Scholar
Risi, S., Lehman, J., Stanley, K.O.: Evolving the placement and density of neurons in the hyperneat substrate. In: Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation, GECCO 2010, pp. 563–570 (2010)
Google Scholar
Schmidhuber, J.: Discovering neural nets with low Kolmogorov complexity and high generalization capability. Neural Networks 10(5), 857–873 (1997)
Article Google Scholar
Srivastava, R.K., Schmidhuber, J., Gomez, F.: Generalized Compressed Network Search. In: Coello, C.A.C., Cutello, V., Deb, K., Forrest, S., Nicosia, G., Pavone, M. (eds.) PPSN XII, Part I. LNCS, vol. 7491, pp. 337–346. Springer, Heidelberg (2012)
Chapter Google Scholar
Stanley, K.O., Miikkulainen, R.: A taxonomy for artificial embryogeny. Artificial Life 9(2), 93–130 (2003)
Article Google Scholar
Sun, Y., Wierstra, D., Schaul, T., Schmidhuber, J.: Efficient Natural Evolution Strategies. In: Genetic and Evolutionary Computation Conference, GECCO (2009)
Google Scholar
Wierstra, D., Schaul, T., Peters, J., Schmidhuber, J.: Natural Evolution Strategies. In: Proceedings of the Congress on Evolutionary Computation (CEC 2008), Hongkong. IEEE Press (2008)
Google Scholar

Download references

Author information

Authors and Affiliations

IDSIA, USI-SUPSI, Galleria 2, Manno-Lugano, 6928, Switzerland
Faustino Gomez

Authors

Faustino Gomez
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Research Group on Mathematical Linguistics, Universitat Rovira i Virgili, Avinguda Catalunya, 35, 43002, Tarragona, Spain
Adrian-Horia Dediu & Carlos Martín-Vide &
Fakultät für Informatik, Otto-von-Guericke-Universität Magdeburg, Universitätsplatz 2, 39106, Magdeburg, Germany
Bianca Truthe

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Gomez, F. (2012). Scalable Neuroevolution for Reinforcement Learning. In: Dediu, AH., Martín-Vide, C., Truthe, B. (eds) Theory and Practice of Natural Computing. TPNC 2012. Lecture Notes in Computer Science, vol 7505. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33860-1_3

Download citation

DOI: https://doi.org/10.1007/978-3-642-33860-1_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33859-5
Online ISBN: 978-3-642-33860-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics