Intelligence as Inference or Forcing Occam on the World

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8598)

Abstract

We propose to perform the optimization task of Universal Artificial Intelligence (UAI) by learning a reference machine on which good programs are short. We also acknowledge that the choice of reference machine on which the UAI objective is based is arbitrary, and we therefore learn a machine suited to the environment we are in. This approach views Occam’s razor as an imperative rather than as a proposition about the world. Since the principle cannot be true for all reference machines, we need to find a machine that makes it true. We want both good policies and the environment to have short implementations on the machine. Such a machine is learnt iteratively through a procedure that generalizes the principle underlying the Expectation-Maximization algorithm.
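The abstract does not spell out the iterative procedure, so the following is only a toy sketch of the general idea under strong assumptions, not the authors' method. Here a "machine" is stood in for by a product of Bernoulli distributions over the bits of a fixed-length program, a program's length on that machine is its negative log2-probability, and the machine is refitted with a reward-weighted, EM-style update so that rewarded programs become short. The names `code_length`, `reward`, `learn_machine`, the hidden-target reward, and the parameters `beta` and `eps` are all hypothetical. The sketch covers only the policy side and does not model the environment also being short on the machine.

```python
import math
import random


def code_length(program, theta):
    """Bits needed to write `program` on the toy machine `theta`.

    theta[i] is the probability the machine assigns to bit i being 1,
    so a program's length is its negative log2-probability.
    """
    return -sum(math.log2(t if bit else 1.0 - t) for bit, t in zip(program, theta))


def reward(program, target):
    """Hypothetical reward: fraction of bits that agree with a hidden target."""
    return sum(bit == goal for bit, goal in zip(program, target)) / len(target)


def learn_machine(target, iterations=30, samples=200, beta=10.0, eps=0.01, seed=0):
    """Reward-weighted, EM-style refitting of the toy machine.

    This is an illustrative assumption, not the procedure from the paper.
    """
    rng = random.Random(seed)
    n = len(target)
    theta = [0.5] * n  # uniform machine: every n-bit program costs n bits
    for _ in range(iterations):
        # Sample candidate programs from the current machine.
        programs = [[rng.random() < t for t in theta] for _ in range(samples)]
        # E-step analogue: weight each program by its exponentiated reward.
        weights = [math.exp(beta * reward(p, target)) for p in programs]
        total = sum(weights)
        # M-step analogue: refit each bit probability to the weighted samples,
        # clipped away from 0 and 1 so that code lengths stay finite.
        theta = [
            min(1.0 - eps, max(eps, sum(w for w, p in zip(weights, programs) if p[i]) / total))
            for i in range(n)
        ]
    return theta


if __name__ == "__main__":
    target = [True, False, True, True, False, False, True, False]
    theta = learn_machine(target)
    print("bits on uniform machine:", code_length(target, [0.5] * len(target)))
    print("bits on learnt machine: ", code_length(target, theta))
```

In this toy setting the best program costs exactly 8 bits on the initial uniform machine and fewer bits on the learnt one, which is the sense in which the machine is adapted so that Occam's razor holds for the task at hand.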




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Sunehag, P., Hutter, M. (2014). Intelligence as Inference or Forcing Occam on the World. In: Goertzel, B., Orseau, L., Snaider, J. (eds) Artificial General Intelligence. AGI 2014. Lecture Notes in Computer Science, vol 8598. Springer, Cham. https://doi.org/10.1007/978-3-319-09274-4_18

  • DOI: https://doi.org/10.1007/978-3-319-09274-4_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09273-7

  • Online ISBN: 978-3-319-09274-4

  • eBook Packages: Computer Science, Computer Science (R0)
