On the Foundations of Universal Sequence Prediction

  • Marcus Hutter
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3959)


Solomonoff completed the Bayesian framework by providing a rigorous, unique, formal, and universal choice for the model class and the prior. We discuss in breadth how, and in which sense, universal (non-i.i.d.) sequence prediction solves various (philosophical) problems of traditional Bayesian sequence prediction. We show that Solomonoff's model possesses many desirable properties: it converges fast and satisfies strong bounds; in contrast to most classical continuous prior densities, it has no zero p(oste)rior problem, i.e. it can confirm universal hypotheses; it is reparametrization and regrouping invariant; and it avoids the old-evidence and updating problems. It even performs well (actually better) in non-computable environments.
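The Bayesian construction the abstract refers to can be illustrated on a finite model class. The sketch below (model class, weights, and function names are my own illustration, not from the paper) forms a Bayes mixture over a handful of Bernoulli environments and shows its predictive probability tracking the true parameter. Solomonoff's universal model is the same construction taken over the class of all lower semicomputable semimeasures, with prior weights proportional to 2^(-K(ν)), where K is Kolmogorov complexity.

```python
import random

def bayes_mixture_predict(thetas, weights, history):
    """Predictive probability xi(next bit = 1 | history) of the Bayes mixture.

    Posterior weight of each Bernoulli(theta) model is proportional to
    w_nu * nu(history); the mixture prediction is the posterior-weighted
    average of the models' own predictions (each model predicts theta).
    For long sequences a log-space computation would be needed to avoid
    underflow; direct products suffice at this scale.
    """
    ones = sum(history)
    zeros = len(history) - ones
    post = [w * theta**ones * (1 - theta)**zeros
            for theta, w in zip(thetas, weights)]
    z = sum(post)
    return sum(p * theta for p, theta in zip(post, thetas)) / z

random.seed(0)
thetas = [0.1, 0.3, 0.5, 0.7, 0.9]          # hypothetical finite model class
weights = [1 / len(thetas)] * len(thetas)   # uniform prior (stand-in for 2^-K)
true_theta = 0.7
history = [1 if random.random() < true_theta else 0 for _ in range(500)]

pred = bayes_mixture_predict(thetas, weights, history)
print(round(pred, 2))  # converges toward the true parameter 0.7
```

The convergence the abstract calls "fast" is, in this finite setting, the familiar exponential concentration of the posterior on the true environment; Solomonoff's result extends the guarantee to the universal class, with the total expected squared prediction error bounded by a multiple of K(μ) for any computable true environment μ.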


Keywords: Turing machine, Kolmogorov complexity, prior density, input tape, sequence prediction





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Marcus Hutter
    1. IDSIA, Manno-Lugano, Switzerland
