Self-adapting Goals Allow Transfer of Predictive Models to New Tasks
Abstract
A long-standing challenge in reinforcement learning is enabling agents to learn a model of their environment that can be transferred to solve other problems in a world governed by the same underlying rules. One reason this is difficult is the challenge of learning accurate models of an environment: if the model is inaccurate, the agent's plans and actions will likely be sub-optimal and lead to the wrong outcomes. Recent progress in model-based reinforcement learning has improved the ability of agents to learn and use predictive models. In this paper, we extend a recent deep learning architecture that learns a predictive model of the environment, aiming to predict only the values of a few key measurements indicative of the agent's performance. Predicting only a few measurements, rather than the entire future state of the environment, makes learning a useful predictive model more feasible. We extend this predictive model with a small, evolving neural network that suggests the best goals to pursue in the current state. We demonstrate that this allows the predictive model to transfer to new scenarios where the goals are different, and that the adaptive goals can even adjust the agent's behavior on-line, changing its strategy to fit the current context.
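To make the described mechanism concrete, the following is a minimal sketch, assuming a measurement-prediction model in the style of Dosovitskiy and Koltun (Direct Future Prediction) that forecasts, for each candidate action, how a handful of measurements will change, combined with a small goal network whose weights are the genome optimized by neuroevolution. All names, shapes, and hyperparameters here (GoalNetwork, select_action, the hidden-layer size) are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

class GoalNetwork:
    """Tiny feed-forward net mapping the current measurements to a goal vector
    (one weight per measurement). Its weights would be evolved, e.g. with NEAT,
    rather than trained by gradient descent."""
    def __init__(self, n_measurements, hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.5, (n_measurements, hidden))
        self.w2 = rng.normal(0.0, 0.5, (hidden, n_measurements))

    def __call__(self, measurements):
        h = np.tanh(measurements @ self.w1)
        return np.tanh(h @ self.w2)  # goal vector: relative importance of each measurement

def select_action(predicted_changes, goal):
    """predicted_changes: array of shape (n_actions, n_measurements) holding the
    predictive model's forecast of how each action changes the measurements.
    The agent picks the action whose predicted changes best align with the goal."""
    scores = predicted_changes @ goal
    return int(np.argmax(scores))

# Usage with 3 measurements (e.g. health, ammo, frags) and 4 candidate actions;
# the predicted changes would normally come from the learned predictive model.
goal_net = GoalNetwork(n_measurements=3)
measurements = np.array([0.8, 0.2, 0.1])
predicted_changes = np.random.default_rng(1).normal(size=(4, 3))
action = select_action(predicted_changes, goal_net(measurements))
```

Because the goal vector is recomputed from the current measurements at decision time, the same fixed predictive model can be reused in new scenarios: only the small goal network needs to adapt.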
Keywords
Reinforcement Learning · Prediction · Neural networks · Neuroevolution
Acknowledgments
This work is supported by The Research Council of Norway as part of the Engineering Predictability with Embodied Cognition (EPEC) project, #240862, and the Centres of Excellence scheme, project #262762.