Deep active inference

Ueltzhöffer, Kai

doi:10.1007/s00422-018-0785-7

Original Article
Published: 22 October 2018

Volume 112, pages 547–573, (2018)

Biological Cybernetics

Kai Ueltzhöffer, ORCID: orcid.org/0000-0002-0507-7598

Abstract

This work combines the free energy principle and the ensuing active inference dynamics with recent advances in variational inference in deep generative models, and evolution strategies to introduce the “deep active inference” agent. This agent minimises a variational free energy bound on the average surprise of its sensations, which is motivated by a homeostatic argument. It does so by optimising the parameters of a generative latent variable model of its sensory inputs, together with a variational density approximating the posterior distribution over the latent variables, given its observations, and by acting on its environment to actively sample input that is likely under this generative model. The internal dynamics of the agent are implemented using deep and recurrent neural networks, as used in machine learning, making the deep active inference agent a scalable and very flexible class of active inference agent. Using the mountain car problem, we show how goal-directed behaviour can be implemented by defining appropriate priors on the latent states in the agent’s model. Furthermore, we show that the deep active inference agent can learn a generative model of the environment, which can be sampled from to understand the agent’s beliefs about the environment and its interaction therewith.
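
To make the bound described in the abstract concrete, the following is a minimal sketch and not the agent from the article. It assumes a toy one-dimensional linear Gaussian model (prior s ~ N(0, 1), likelihood o | s ~ N(s, 1)) and a Gaussian variational density q(s) = N(m, v). For this model the free energy F = E_q[-ln p(o | s)] + KL(q(s) || p(s)) has a closed form and equals the surprise -ln p(o) plus KL(q(s) || p(s | o)), so it can never fall below the surprise. The parameters (m, ln v) are tuned with a simple antithetic evolution strategy in the spirit of Salimans et al. (2017). The model, the function names free_energy, surprise and es_minimise, and all hyperparameters are assumptions chosen for illustration; the article itself uses deep and recurrent neural networks on the mountain car problem.

```python
import math
import random


def free_energy(o, m, log_v):
    """Closed-form variational free energy for the assumed toy model:
    prior s ~ N(0, 1), likelihood o | s ~ N(s, 1), q(s) = N(m, exp(log_v))."""
    v = math.exp(log_v)
    # Expected negative log-likelihood E_q[-ln p(o | s)]
    expected_nll = 0.5 * math.log(2.0 * math.pi) + 0.5 * ((o - m) ** 2 + v)
    # KL divergence between q(s) = N(m, v) and the prior N(0, 1)
    kl = 0.5 * (v + m ** 2 - 1.0 - log_v)
    return expected_nll + kl


def surprise(o):
    """Exact surprise -ln p(o); the marginal of the toy model is N(0, 2)."""
    return 0.5 * math.log(2.0 * math.pi * 2.0) + o ** 2 / 4.0


def es_minimise(o, steps=500, pop=20, sigma=0.1, lr=0.05, seed=0):
    """Minimise the free energy over (m, log_v) with an antithetic
    evolution strategy: perturb the parameters with Gaussian noise and
    follow the resulting finite-difference gradient estimate."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]  # [m, log_v]
    for _ in range(steps):
        grad = [0.0, 0.0]
        for _ in range(pop):
            eps = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
            f_plus = free_energy(o, theta[0] + sigma * eps[0],
                                 theta[1] + sigma * eps[1])
            f_minus = free_energy(o, theta[0] - sigma * eps[0],
                                  theta[1] - sigma * eps[1])
            for i in range(2):
                grad[i] += (f_plus - f_minus) * eps[i] / (2.0 * sigma * pop)
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta[0], theta[1]


if __name__ == "__main__":
    o = 1.5
    m, log_v = es_minimise(o)
    print("free energy:", free_energy(o, m, log_v))
    print("surprise:   ", surprise(o))
```

In this toy model the exact posterior is N(o/2, 1/2), so the optimised q should settle close to it and the gap between free energy and surprise should shrink towards zero. The Gaussian smoothing used by the evolution strategy introduces a small bias of order sigma squared in ln v, which is why the gap does not vanish exactly.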

Notes

In more general formulations of active inference, the assumption that the mapping between hidden states and outcomes is constant can be relaxed (Friston et al. 2015).

References

Adams RA, Stephan KE, Brown H, Frith CD, Friston KJ (2013) The computational anatomy of psychosis. Front Psychiatry 4:47
Alais D, Burr D (2004) The ventriloquist effect results from near-optimal bimodal integration. Curr Biol 14(3):257–262
Baez JC, Pollard BS (2015) Relative entropy in biological systems. arXiv:1512.02742
Baltieri M, Buckley CL (2017) An active inference implementation of phototaxis. arXiv:1707.01806
Berkes P, Orbán G, Lengyel M, Fiser J (2011) Spontaneous cortical activity reveals hallmarks of an optimal internal model of the environment. Science 331:83–87
Brockman G, Cheung V, Pettersson L, Schneider J, Schulman J, Tang J, Zaremba W (2016) OpenAI Gym. arXiv:1606.01540
Brown H, Friston KJ (2012) Free-energy and illusions: the Cornsweet effect. Front Psychol 3:43
Campbell JO (2016) Universal Darwinism as a process of Bayesian inference. arXiv:1606.07937
Caticha A (2004) Relative entropy and inductive inference. In: AIP conference proceedings, 707
Chung J, Kastner K, Dinh L, Goel K, Courville A, Bengio Y (2015) A recurrent latent variable model for sequential data. arXiv:1506.02216
Conant R, Ashby W (1970) Every good regulator of a system must be a model of that system. Int J Syst Sci 1(2):89–97
Crapse TB, Sommer MA (2008) Corollary discharge across the animal kingdom. Nat Rev Neurosci 9:587–600
Dosovitskiy A, Koltun V (2017) Learning to act by predicting the future. ICLR
Erhan D, Bengio Y, Courville A, Manzagol PA, Vincent P (2010) Why does unsupervised pre-training help deep learning? JMLR 11:625–660
Ernst M, Banks M (2002) Humans integrate visual and haptic information in a statistically optimal fashion. Nature 415:429–433
Friston KJ (2005) A theory of cortical responses. Phil Trans R Soc B 360:815–836
Friston KJ (2008) Hierarchical models in the brain. PLoS Comput Biol 4(11):e1000211
Friston KJ (2010) The free-energy principle: a unified brain theory? Nat Rev Neurosci 11(2):127–138
Friston KJ (2012) A free energy principle for biological systems. Entropy 14:2100–2121
Friston KJ (2013) Life as we know it. J R Soc Interface 10:20130475
Friston KJ, Kiebel SJ (2009) Predictive coding under the free-energy principle. Philos Trans R Soc B 364:1211–1221
Friston KJ, Kilner J, Harrison L (2006) A free energy principle for the brain. J Physiol Paris 100:70–87
Friston KJ, Daunizeau J, Kilner J, Kiebel SJ (2010) Action and behavior: a free-energy formulation. Biol Cybern 102(3):227–260
Friston KJ, Mattout J, Kilner J (2011) Action understanding and active inference. Biol Cybern 104:137–160
Friston KJ, Rigoli F, Ognibene D, Mathys C, Fitzgerald T, Pezzulo G (2015) Active inference and epistemic value. Cogn Neurosci 6(4):187–214
Friston KJ, Frith CD, Pezzulo G, Hobson JA, Ondobaka S (2017a) Active inference, curiosity and insight. Neural Comput 29:1–51
Friston KJ, Rosch R, Parr T, Price C, Bowman H (2017b) Deep temporal models and active inference. Neurosci Biobehav Rev 77:388–402
Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial networks. arXiv:1406.2661
Goodfellow I, Bengio Y, Courville A (2016) Deep Learning. MIT Press, Cambridge. http://www.deeplearningbook.org
Graves A, Wayne G, Reynolds M, Harley T, Danihelka I, Grabska-Barwinska A, Gómez Colmenarejo S, Grefenstette E, Ramalho T, Agapiou J, Puigdomènech Badia A, Hermann KM, Zwols Y, Ostrovski G, Cain A, King H, Summerfield C, Blunsom P, Kavukcuoglu K, Hassabis D (2016) Hybrid computing using a neural network with dynamic external memory. Nature 538:471–476
Ha D, Schmidhuber J (2018) World models. arXiv:1803.10122
Haefner R, Berkes P, Fiser J (2016) Perceptual decision-making as probabilistic inference by neural sampling. Neuron 90(3):649–660
Hansen N (2016) The CMA evolution strategy: a tutorial. arXiv:1604.00772
Harper M (2009) The replicator equation as an inference dynamic. arXiv:0911.1763
Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural networks. Science 313:504–507
Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2:359–366
Huszár F (2017) Variational inference using implicit distributions. arXiv:1702.08235
Karpathy A, Johnson J, Fei-Fei L (2015) Visualizing and understanding recurrent networks. arXiv:1506.02078
Karras T, Aila T, Laine S, Lehtinen J (2018) Progressive growing of GANs for improved quality, stability, and variation. ICLR
Kempka M, Wydmuch M, Runc G, Toczek J, Jaśkowski W (2016) ViZDoom: a Doom-based AI research platform for visual reinforcement learning. arXiv:1605.02097
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv:1412.6980
Kingma DP, Welling M (2014) Auto-encoding variational bayes. ICLR
Kingma DP, Salimans T, Jozefowicz R, Chen X, Sutskever I, Welling M (2016) Improving variational inference with inverse autoregressive flow. arXiv:1606.04934
Knill D, Pouget A (2004) The Bayesian brain: the role of uncertainty in neural coding and computation. Trends Neurosci 27(12):712–719
Le QV, Jaitly N, Hinton GE (2015) A simple way to initialize recurrent networks of rectified linear units. arXiv:1504.00941
LeCun Y, Bengio Y, Hinton GE (2015) Deep learning. Nature 521:436–444
Ledig C, Theis L, Huszár F, Caballero J, Cunningham A, Acosta A, Aitken A, Tejani A, Totz J, Wang Z, Shi W (2016) Photo-realistic single image super-resolution using a generative adversarial network. arXiv:1609.04802
Liu MY, Breuel T, Kautz J (2017) Unsupervised image-to-image translation networks. Neural Information Processing Systems (NIPS). arXiv:1703.00848
Maaløe L, Sønderby CK, Sønderby SK, Winther O (2016) Auxiliary deep generative models. arXiv:1602.05473
Maheswaranathan N, Metz L, Tucker G, Sohl-Dickstein J (2018) Guided evolutionary strategies: escaping the curse of dimensionality in random search. arXiv:1806.10230
Mescheder L, Nowozin S, Geiger A (2017) Adversarial variational Bayes: unifying variational autoencoders and generative adversarial networks. arXiv:1701.04722
Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G, Petersen S, Beattie C, Sadik A, Antonoglou I, King H, Kumaran D, Wierstra D, Legg S, Hassabis D (2015) Human-level control through deep reinforcement learning. Nature 518:529–533
Moore A (1991) Variable resolution dynamic programming: efficiently learning action maps in multivariate real-valued state-spaces. In: Proceedings of the eighth international conference on machine learning. Morgan Kaufmann
Moreno-Bote R, Knill D, Pouget A (2011) Bayesian sampling in visual perception. Proc Natl Acad Sci USA 108(30):12491–12496
Pathak D, Agrawal P, Efros AA, Darrell T (2017) Curiosity-driven exploration by self-supervised prediction. arXiv:1705.05363
Platt JC, Barr AH (1988) Constrained differential optimization. In: Neural information processing systems. American Institute of Physics, New York, pp 612–621
Radford A, Narasimhan K, Salimans T, Sutskever I (2018) Improving language understanding by generative pre-training. Technical report, OpenAI
Ramstead MJD, Badcock PB, Friston KJ (2017) Answering Schrödinger’s question: a free-energy formulation. Phys Life Rev 24:1–16
Rezende DJ, Mohamed S (2015) Variational inference with normalizing flows. JMLR 37
Rezende DJ, Mohamed S, Wierstra D (2014) Stochastic backpropagation and approximate inference in deep generative models. ICML
Rezende DJ, Ali Eslami SM, Mohamed S, Battaglia P, Jaderberg M, Heess N (2016) Unsupervised learning of 3D structure from images. arXiv:1607.00662
Salimans T, Ho J, Chen X, Sutskever I (2017) Evolution strategies as a scalable alternative to reinforcement learning. arXiv:1703.03864
Schwartenbeck P, Fitzgerald T, Mathys C, Dolan R, Kronbichler M, Friston KJ (2015) Evidence for surprise minimization over value maximization in choice behavior. Sci Rep 5:16575
Siegelmann HT (1995) Computation beyond the Turing limit. Science 268:545–548
Theano Development Team (2016) Theano: a Python framework for fast computation of mathematical expressions. arXiv:1605.02688
Todorov E, Erez T, Tassa Y (2012) MuJoCo: a physics engine for model-based control. In: Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS)
Tomczak JM, Welling M (2016) Improving variational auto-encoders using Householder flow. arXiv:1611.09630
Tran D, Ranganath R, Blei D (2017) Hierarchical implicit models and likelihood-free variational inference. arXiv:1702.08896
Watson RA, Szathmáry E (2016) How can evolution learn? Trends Ecol Evol 31(2):147–157
Wong KF, Wang XJ (2006) A recurrent network mechanism of time integration in perceptual decisions. J Neurosci 26(4):1314–1328
Zhu JY, Park T, Isola P, Efros AA (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. arXiv:1703.10593

Acknowledgements

The author would like to thank Karl Friston, Thorben Kröger, Manuel Baltieri and Annina Luck for insightful comments on earlier versions of this manuscript and the participants and organisers of the Computational Psychiatry Course 2016 for stimulating lectures and discussions. Furthermore, he would like to thank the three anonymous reviewers for very constructive and tremendously helpful comments, which significantly improved the quality of the paper.

Author information

Authors and Affiliations

Heidelberg, Germany
Kai Ueltzhöffer

Corresponding author

Correspondence to Kai Ueltzhöffer.

Additional information

Communicated by Benjamin Lindner.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 542 KB)

About this article

Cite this article

Ueltzhöffer, K. Deep active inference. Biol Cybern 112, 547–573 (2018). https://doi.org/10.1007/s00422-018-0785-7

Received: 17 September 2017
Accepted: 09 October 2018
Published: 22 October 2018
Issue Date: December 2018
DOI: https://doi.org/10.1007/s00422-018-0785-7
