A Quantitative Study of Learning and Generalization in Genetic Programming

  • Conference paper
Genetic Programming (EuroGP 2011)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6621)

Abstract

The relationship between generalization and the functional complexity of solutions in genetic programming (GP) has recently been investigated. This paper makes three main contributions. (1) A new measure of functional complexity for GP solutions, called Graph Based Complexity (GBC), is defined, and we show that it has a higher correlation with GP performance on out-of-sample data than another complexity measure introduced in a recent publication. (2) A new measure called Graph Based Learning Ability (GBLA) is presented. It is inspired by the GBC, and its goal is to quantify the ability of GP to learn “difficult” training points; we show that GBLA is negatively correlated with the performance of GP on out-of-sample data. (3) Finally, we use the ideas that inspired the definitions of GBC and GBLA to define a new fitness function, whose suitability is empirically demonstrated. The experimental results reported in this paper were obtained using three real-life multidimensional regression problems.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Castelli, M., Manzoni, L., Silva, S., Vanneschi, L. (2011). A Quantitative Study of Learning and Generalization in Genetic Programming. In: Silva, S., Foster, J.A., Nicolau, M., Machado, P., Giacobini, M. (eds) Genetic Programming. EuroGP 2011. Lecture Notes in Computer Science, vol 6621. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20407-4_3

  • DOI: https://doi.org/10.1007/978-3-642-20407-4_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20406-7

  • Online ISBN: 978-3-642-20407-4

  • eBook Packages: Computer Science, Computer Science (R0)
