Intelligence as Inference or Forcing Occam on the World

Sunehag, Peter; Hutter, Marcus

doi:10.1007/978-3-319-09274-4_18

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8598)

Abstract

We propose to perform the optimization task of Universal Artificial Intelligence (UAI) by learning a reference machine on which good programs are short. We also acknowledge that the choice of reference machine on which the UAI objective is based is arbitrary, and we therefore learn a machine suited to the environment we are in. This is based on viewing Occam’s razor as an imperative instead of as a proposition about the world. Since this principle cannot be true for all reference machines, we need to find a machine that makes the principle true. We want both good policies and the environment to have short implementations on the machine. Such a machine is learnt iteratively through a procedure that generalizes the principle underlying the Expectation-Maximization algorithm.
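The abstract gives no algorithmic detail, so the following is only a toy illustration of the general idea, not the authors' procedure. It assumes a hypothetical setting with three named policies and fixed, non-negative expected rewards, and it stands in for a "reference machine" with a plain probability distribution over policies, reading the ideal code length of a policy as -log2 of its probability. Each step applies a reward-weighted, EM-style update; all names (`reweight`, `code_length`, `policy_a`, and so on) are invented for the sketch.

```python
import math


def reweight(prior, expected_reward):
    """One EM-style step: new weight is proportional to old weight times expected reward."""
    unnormalized = {name: prior[name] * expected_reward[name] for name in prior}
    total = sum(unnormalized.values())
    return {name: weight / total for name, weight in unnormalized.items()}


def code_length(probability):
    """Ideal code length in bits under a distribution: -log2 of the probability."""
    return -math.log2(probability)


# Hypothetical toy setup (an assumption, not from the paper):
# three candidate policies with fixed, non-negative expected rewards.
expected_reward = {"policy_a": 0.2, "policy_b": 0.5, "policy_c": 0.9}

# Start from a uniform distribution, i.e. every policy has the same code length.
prior = {name: 1.0 / len(expected_reward) for name in expected_reward}

# Repeatedly shift probability mass (and hence short codes) toward rewarding policies.
for _ in range(20):
    prior = reweight(prior, expected_reward)

lengths = {name: code_length(p) for name, p in prior.items()}
for name in sorted(lengths, key=lengths.get):
    print(f"{name}: {lengths[name]:.2f} bits")
```

In this toy, the policy with the highest expected reward ends up with the shortest code, which is one way to read "a machine on which good programs are short". The expected reward under the distribution also never decreases from one step to the next, mirroring the monotone improvement that EM-style updates provide.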

Author information

Authors and Affiliations

Research School of Computer Science, Australian National University, Canberra, Australia
Peter Sunehag & Marcus Hutter

Editor information

Editors and Affiliations

OpenCog Foundation, G/F, 51C Lung Mei Village, Tai Po, N.T., Hong Kong
Ben Goertzel
AgroParisTech, 16 rue Claude Bernard, 75005, Paris, France
Laurent Orseau
Google Inc., 1600 Amphitheatre Parkway, 94043, Mountain View, CA, USA
Javier Snaider

About this paper

Cite this paper

Sunehag, P., Hutter, M. (2014). Intelligence as Inference or Forcing Occam on the World. In: Goertzel, B., Orseau, L., Snaider, J. (eds) Artificial General Intelligence. AGI 2014. Lecture Notes in Computer Science, vol 8598. Springer, Cham. https://doi.org/10.1007/978-3-319-09274-4_18

DOI: https://doi.org/10.1007/978-3-319-09274-4_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09273-7
Online ISBN: 978-3-319-09274-4
eBook Packages: Computer Science, Computer Science (R0)