A Quantitative Study of Learning and Generalization in Genetic Programming

Castelli, Mauro; Manzoni, Luca; Silva, Sara; Vanneschi, Leonardo

doi:10.1007/978-3-642-20407-4_3

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6621)

Included in the following conference series:

European Conference on Genetic Programming

Abstract

The relationship between generalization and the functional complexity of solutions in genetic programming (GP) has recently been investigated. This paper makes three main contributions. (1) We define a new measure of functional complexity for GP solutions, called Graph Based Complexity (GBC), and show that it has a higher correlation with GP performance on out-of-sample data than another complexity measure introduced in a recent publication. (2) We present a new measure called Graph Based Learning Ability (GBLA). It is inspired by GBC, and its goal is to quantify the ability of GP to learn “difficult” training points; we show that GBLA is negatively correlated with the performance of GP on out-of-sample data. (3) Finally, we use the ideas that inspired the definitions of GBC and GBLA to define a new fitness function, whose suitability is empirically demonstrated. The experimental results reported in this paper were obtained on three real-life multidimensional regression problems.

Author information

Authors and Affiliations

Dipartimento di Informatica, Sistemistica e Comunicazione (D.I.S.Co.), University of Milano-Bicocca, Milan, Italy
Mauro Castelli, Luca Manzoni & Leonardo Vanneschi
KDBIO group, INESC-ID Lisboa, Lisbon, Portugal
Sara Silva & Leonardo Vanneschi

Editor information

Editors and Affiliations

INESC-ID Lisboa, Rua Alves Redol 9, 1000-029, Lisboa, Portugal
Sara Silva
Department of Biological Sciences, University of Idaho, ID 83844-3051, Moscow, USA
James A. Foster
University College Dublin, UCD CASL, Belfield, Dublin 4, Ireland
Miguel Nicolau
Faculty of Sciences and Technology, Department of Informatics Engineering, University of Coimbra, Pólo II - Pinhal de Marrocos, 3030-290, Coimbra, Portugal
Penousal Machado
Department of Animal Production Epidemiology and Ecology, University of Torino, Via Leonardo da Vinci 44, 10095, Grugliasco (TO), Italy
Mario Giacobini

About this paper

Cite this paper

Castelli, M., Manzoni, L., Silva, S., Vanneschi, L. (2011). A Quantitative Study of Learning and Generalization in Genetic Programming. In: Silva, S., Foster, J.A., Nicolau, M., Machado, P., Giacobini, M. (eds) Genetic Programming. EuroGP 2011. Lecture Notes in Computer Science, vol 6621. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20407-4_3

DOI: https://doi.org/10.1007/978-3-642-20407-4_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20406-7
Online ISBN: 978-3-642-20407-4
eBook Packages: Computer Science, Computer Science (R0)
