Self-adapting Goals Allow Transfer of Predictive Models to New Tasks

  • Kai Olav Ellefsen
  • Jim Torresen
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1056)


A long-standing challenge in Reinforcement Learning is enabling agents to learn a model of their environment which can be transferred to solve other problems in a world with the same underlying rules. One reason this is difficult is that accurate models of an environment are hard to learn. If such a model is inaccurate, the agent’s plans and actions will likely be sub-optimal and lead to the wrong outcomes. Recent progress in model-based reinforcement learning has improved the ability of agents to learn and use predictive models. In this paper, we extend a recent deep learning architecture which learns a predictive model of the environment that aims to predict only the value of a few key measurements, which are indicative of an agent’s performance. Predicting only a few measurements, rather than the entire future state of an environment, makes it more feasible to learn a valuable predictive model. We extend this predictive model with a small, evolving neural network that suggests the best goals to pursue in the current state. We demonstrate that this allows the predictive model to transfer to new scenarios where goals are different, and that the adaptive goals can even adjust agent behavior on-line, changing its strategy to fit the current context.
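The interplay between a fixed predictive model and a small goal-suggesting network can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each action is scored by the dot product between a goal vector and the predicted change in each measurement, that the goal network is a single tanh layer, and that evolution is reduced to Gaussian weight mutation. The measurement names, action names, weights, and prediction values are all hypothetical.

```python
import math
import random

def goal_network(state, weights):
    """Tiny one-layer network (assumed form): state features -> goal vector in [-1, 1]."""
    return [math.tanh(sum(w * s for w, s in zip(row, state))) for row in weights]

def choose_action(predicted, goal):
    """Pick the action whose predicted measurement changes best match the goal.

    predicted[a][m] is the predicted change in measurement m under action a.
    In this sketch it stands in for the output of a learned predictive model.
    """
    scores = [sum(g * p for g, p in zip(goal, preds)) for preds in predicted]
    return max(range(len(scores)), key=scores.__getitem__)

def mutate(weights, sigma, rng):
    """Gaussian mutation of the goal network weights (simplest evolutionary operator)."""
    return [[w + rng.gauss(0.0, sigma) for w in row] for row in weights]

# Hypothetical setup: two measurements (health, ammo) and two actions.
predicted = [
    [+0.5, -0.2],  # action 0: "seek health" raises health, costs ammo
    [-0.3, +0.6],  # action 1: "seek ammo" raises ammo, costs health
]

# Hypothetical goal-network weights over [health, ammo, bias]:
# each goal component grows when its own measurement is low.
weights = [
    [-2.0, 0.0, 1.0],
    [0.0, -2.0, 1.0],
]

low_health = [0.1, 0.9, 1.0]
low_ammo = [0.9, 0.1, 1.0]

action_when_hurt = choose_action(predicted, goal_network(low_health, weights))
action_when_empty = choose_action(predicted, goal_network(low_ammo, weights))

# One mutation step; an evolutionary loop would evaluate and select such children.
child = mutate(weights, sigma=0.1, rng=random.Random(0))
```

In this sketch the predictions stay the same in both states, and only the goal vector changes, which is enough to switch the chosen action. That mirrors the idea that adapting goals, rather than retraining the predictive model, can change the agent's strategy.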


Reinforcement Learning · Prediction · Neural networks · Neuroevolution



This work is supported by The Research Council of Norway as part of the Engineering Predictability with Embodied Cognition (EPEC) project #240862, and the Centres of Excellence scheme, project #262762.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Informatics, University of Oslo, Oslo, Norway
  2. Department of Informatics and RITMO, University of Oslo, Oslo, Norway
