A Syntactic Approach to Prediction


Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 7070))

Abstract

A central question in the empirical sciences is: given a body of data, how do we best make predictions? There are subtle differences between current approaches, which include Minimum Message Length (MML) and Solomonoff’s theory of induction [24].

The nature of hypothesis spaces is explored, and we observe a correlation between the complexity of a function and the frequency with which it is represented. There is not a single best hypothesis, as suggested by Occam’s razor (which says to prefer the simplest), but a set of functionally equivalent hypotheses. One set of hypotheses is preferred over another because it is larger, giving the impression that simpler functions generalize better. The probabilistic weighting of a set of hypotheses is given by the relative size of its equivalence class. We justify Occam’s razor by a counting argument over the hypothesis space.
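The counting argument can be illustrated concretely. The following sketch is not from the chapter; it assumes a toy hypothesis space of boolean expression trees over two variables (with not/and/or), enumerates all trees up to a fixed depth, and counts how many syntactically distinct programs compute each of the 16 boolean functions of two inputs. Simple functions such as the projection f(x, y) = x form much larger equivalence classes than, say, XOR.

```python
from itertools import product

VARS = ["x", "y"]

def enumerate_exprs(depth):
    """All boolean expression trees up to the given depth (the hypothesis space)."""
    if depth == 0:
        return list(VARS)
    smaller = enumerate_exprs(depth - 1)
    exprs = list(smaller)
    exprs += [("not", a) for a in smaller]
    for a, b in product(smaller, repeat=2):
        exprs.append(("and", a, b))
        exprs.append(("or", a, b))
    return exprs

def evaluate(expr, x, y):
    if expr == "x":
        return x
    if expr == "y":
        return y
    op = expr[0]
    if op == "not":
        return not evaluate(expr[1], x, y)
    a, b = evaluate(expr[1], x, y), evaluate(expr[2], x, y)
    return (a and b) if op == "and" else (a or b)

def semantics(expr):
    """The function a program computes: its truth table on all four inputs."""
    return tuple(evaluate(expr, x, y) for x, y in product([False, True], repeat=2))

# Count the size of each function's equivalence class of programs.
counts = {}
for e in enumerate_exprs(3):
    tt = semantics(e)
    counts[tt] = counts.get(tt, 0) + 1

IDENTITY_X = (False, False, True, True)   # truth table of f(x, y) = x
XOR = (False, True, True, False)          # truth table of f(x, y) = x xor y
# The simple function "x" is represented far more often than XOR.
print(counts[IDENTITY_X], counts[XOR])
```

Under a uniform distribution over programs, the induced weight on each function is proportional to its count here, which is the sense in which larger equivalence classes make simpler functions appear to generalize better.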

Occam’s razor contrasts with the No Free Lunch theorems, which state that it is impossible for one machine learning algorithm to generalize better than any other. The No Free Lunch theorems assume a distribution over functions, whereas Occam’s razor assumes a distribution over programs.
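The No Free Lunch setting can also be sketched in a few lines. This is an assumed illustration, not code from the chapter: on a four-point domain with two observed inputs, averaging a learner's accuracy on the unseen inputs uniformly over every target function consistent with the training data gives exactly 0.5, no matter which learner is used.

```python
from itertools import product

TRAIN, TEST = [0, 1], [2, 3]   # observed inputs vs unseen inputs

def average_test_accuracy(learner, train_labels):
    """Mean accuracy on TEST, averaged uniformly over every target
    function that agrees with the training labels."""
    train = dict(zip(TRAIN, train_labels))
    preds = learner(train)          # the learner never sees the test labels
    accs = []
    for test_labels in product([0, 1], repeat=len(TEST)):
        target = {**train, **dict(zip(TEST, test_labels))}
        correct = sum(preds[x] == target[x] for x in TEST)
        accs.append(correct / len(TEST))
    return sum(accs) / len(accs)

# Two very different learners:
always_zero = lambda train: {x: 0 for x in TEST}
majority = lambda train: {x: int(sum(train.values()) > len(train) / 2) for x in TEST}

# Averaged over the uniform distribution on target functions, both score 0.5.
print(average_test_accuracy(always_zero, (0, 1)))   # 0.5
print(average_test_accuracy(majority, (0, 1)))      # 0.5
```

The uniform average over functions is what the No Free Lunch theorems assume; replacing it with a distribution induced by a uniform choice of program breaks the symmetry and is where the chapter's Occam argument takes hold.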


References

  1. Auger, A., Teytaud, O.: Continuous lunches are free plus the design of optimal optimization algorithms. Algorithmica 57(1), 121–146 (2010)
  2. Cover, T.M., Thomas, J.A.: Elements of Information Theory. Wiley-Interscience, New York (1991)
  3. Domingos, P.: The role of Occam’s razor in knowledge discovery. Data Mining and Knowledge Discovery 3(4), 409–425 (1999)
  4. Dowe, D.L.: MML, hybrid Bayesian network graphical models, statistical consistency, invariance and uniqueness. In: Handbook of the Philosophy of Science (HPS). Philosophy of Statistics, vol. 7, pp. 901–982 (2011)
  5. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification, 2nd edn. Wiley-Interscience (2000)
  6. Holte, R.C.: Very simple classification rules perform well on most commonly used datasets. Machine Learning, 63–91 (1993)
  7. Hutter, M.: A complete theory of everything (will be subjective). Algorithms 3(7), 360–374 (2010)
  8. Hutter, M.: Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability. Springer, Berlin (2004), http://www.idsia.ch/~marcus/ai/uaibook.htm
  9. Kearns, M.J., Vazirani, U.V.: An Introduction to Computational Learning Theory. MIT Press, Cambridge (1994)
  10. Koza, J.R.: Genetic Programming II: Automatic Discovery of Reusable Programs. MIT Press, Cambridge (1994)
  11. Langdon, W.B.: Scaling of program functionality. Genetic Programming and Evolvable Machines 10(1), 5–36 (2009)
  12. Langdon, W.B.: Scaling of program fitness spaces. Evolutionary Computation 7(4), 399–428 (1999)
  13. Li, M., Vitányi, P.: An Introduction to Kolmogorov Complexity and Its Applications, 2nd edn. Springer-Verlag, New York (1997)
  14. Mitchell, T.M.: Machine Learning. McGraw-Hill, New York (1997)
  15. Murphy, P.M., Pazzani, M.J.: Exploring the decision forest: An empirical investigation of Occam’s razor in decision tree induction. Journal of Artificial Intelligence Research, 257–275 (1994)
  16. Needham, S.L., Dowe, D.L.: Message length as an effective Ockham’s razor in decision tree induction. In: Proc. 8th International Workshop on Artificial Intelligence and Statistics (AI+STATS 2001), Key West, Florida, USA, pp. 253–260 (2001)
  17. Poli, R., Graff, M., McPhee, N.F.: Free lunches for function and program induction. In: Proceedings of the Tenth ACM SIGEVO Workshop on Foundations of Genetic Algorithms (FOGA 2009), Orlando, Florida, USA, pp. 183–194. ACM (2009)
  18. Rogers, H.: Theory of Recursive Functions and Effective Computability. MIT Press (1987)
  19. Russell, S.J., Norvig, P.: Artificial Intelligence: A Modern Approach. Pearson Education (2003)
  20. Schaffer, C.: A conservation law for generalization performance. In: Proceedings of the Eleventh International Conference on Machine Learning, pp. 259–265. Morgan Kaufmann (1994)
  21. Solomonoff, R.: Machine learning – past and future. In: The Dartmouth Artificial Intelligence Conference, AI@50, pp. 257–275. Dartmouth, N.H. (2006)
  22. Solomonoff, R.J.: A formal theory of inductive inference. Part I. Information and Control 7(1), 1–22 (1964)
  23. Solomonoff, R.J.: A formal theory of inductive inference. Part II. Information and Control 7(2), 224–254 (1964)
  24. Wallace, C.S., Dowe, D.L.: Minimum message length and Kolmogorov complexity. Computer Journal 42, 270–283 (1999)
  25. Webb: Generality is more significant than complexity: Toward an alternative to Occam’s razor. In: Australian Joint Conference on Artificial Intelligence (AJCAI) (1994)
  26. Wolpert, D.H., Macready, W.G.: No free lunch theorems for search. Technical Report SFI-TR-95-02-010, Santa Fe, NM (1995)
  27. Wolpert, D.H., Macready, W.G.: No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation 1(1), 67–82 (1997)
  28. Woodward, J.R.: Complexity and Cartesian genetic programming. In: Collet, P., Tomassini, M., Ebner, M., Gustafson, S., Ekárt, A. (eds.) EuroGP 2006. LNCS, vol. 3905, pp. 260–269. Springer, Heidelberg (2006)
  29. Woodward, J.R.: Invariance of function complexity under primitive recursive functions. In: Collet, P., Tomassini, M., Ebner, M., Gustafson, S., Ekárt, A. (eds.) EuroGP 2006. LNCS, vol. 3905, pp. 310–319. Springer, Heidelberg (2006)

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Woodward, J., Swan, J. (2013). A Syntactic Approach to Prediction. In: Dowe, D.L. (eds) Algorithmic Probability and Friends. Bayesian Prediction and Artificial Intelligence. Lecture Notes in Computer Science, vol 7070. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-44958-1_34

  • DOI: https://doi.org/10.1007/978-3-642-44958-1_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-44957-4

  • Online ISBN: 978-3-642-44958-1