Defining locality as a problem difficulty measure in genetic programming

Galván-López, Edgar; McDermott, James; O’Neill, Michael; Brabazon, Anthony

doi:10.1007/s10710-011-9136-3

Published: 02 April 2011

Genetic Programming and Evolvable Machines, volume 12, pages 365–401 (2011)

Abstract

A mapping is local if it preserves neighbourhood. In Evolutionary Computation, locality is generally described as the property that neighbouring genotypes correspond to neighbouring phenotypes. A representation has high locality if most genotypic neighbours are mapped to phenotypic neighbours. Locality is seen as a key element in performing effective evolutionary search: a representation with high locality is believed to perform better in evolutionary search than one with low locality. When locality was introduced, the mapping of interest was the genotype-phenotype mapping in bitstring-based Genetic Algorithms; more recently, it has also been used to study the same mapping in Grammatical Evolution. To our knowledge, there are few explicit studies of locality in Genetic Programming (GP). The goal of this paper is to shed some light on locality in GP and use it as an indicator of problem difficulty. Strictly speaking, in GP the genotype and the phenotype are not distinct. We therefore attempt to extend the standard quantitative definition of genotype-phenotype locality to the genotype-fitness mapping by considering three possible definitions. We consider the effects of these definitions in both continuous- and discrete-valued fitness functions. We compare three different GP representations (two of them induced by using different function sets and the other using a slightly different GP encoding) and six different mutation operators. Results indicate that one definition of locality is better at predicting performance.
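The informal notion that "most genotypic neighbours are mapped to phenotypic neighbours" can be illustrated on bitstrings, the setting in which locality was first studied. The following is only a minimal sketch under stated assumptions, not any of the three definitions examined in the paper: genotypic neighbours are taken to be bitstrings at Hamming distance 1, phenotypes are integers, phenotypes count as neighbours when their absolute difference is at most a threshold `d_min` (so identical phenotypes also count), and all function names are hypothetical.

```python
from itertools import product
from typing import Callable, Iterable, Iterator, Tuple

Genotype = Tuple[int, ...]


def hamming_neighbours(g: Genotype) -> Iterator[Genotype]:
    """Yield every bitstring at Hamming distance 1 from g (assumed neighbourhood)."""
    for i in range(len(g)):
        flipped = list(g)
        flipped[i] ^= 1
        yield tuple(flipped)


def binary_to_int(bits: Genotype) -> int:
    """Standard binary decoding, most significant bit first."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def gray_to_int(bits: Genotype) -> int:
    """Decode a reflected Gray code, most significant bit first."""
    n = binary_to_int(bits)
    shift = n >> 1
    while shift:
        n ^= shift
        shift >>= 1
    return n


def int_distance(a: int, b: int) -> int:
    """Phenotypic distance between two integer phenotypes (an assumption)."""
    return abs(a - b)


def all_bitstrings(length: int) -> Iterator[Genotype]:
    """Enumerate every bitstring of the given length."""
    return product((0, 1), repeat=length)


def locality_fraction(
    genotypes: Iterable[Genotype],
    mapping: Callable[[Genotype], int],
    phen_dist: Callable[[int, int], int],
    d_min: int = 1,
) -> float:
    """Fraction of genotypic neighbour pairs mapped to phenotypes within d_min."""
    total = 0
    local = 0
    for g in genotypes:
        for n in hamming_neighbours(g):
            total += 1
            if phen_dist(mapping(g), mapping(n)) <= d_min:
                local += 1
    return local / total if total else 0.0


binary_locality = locality_fraction(all_bitstrings(3), binary_to_int, int_distance)
gray_locality = locality_fraction(all_bitstrings(3), gray_to_int, int_distance)
print(binary_locality, gray_locality)
```

On 3-bit strings this toy measure gives 1/3 for standard binary decoding and 7/12 for Gray decoding, so under these assumptions the Gray mapping has the higher locality. Extending the idea to GP would require replacing the phenotypic distance with a distance between fitness values, which is the step the paper investigates.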

Notes

The term locality has also been used in an unrelated context, to refer to the quasi-geographical distribution of an EC population [8].
Notice that this type of encoding differs from standard GP only when the functions in the function set have different arities. If all arities are the same, Uniform GP reduces to standard GP.
Notice that Size-fair subtree mutation has two variants.
Notice that when using the permutation mutation operator with \(F_{E3}\) and \(F_{E4}\) on the Even-n-Parity problem, the fd is always 0 because all of the operators in these function sets are symmetric.
100 independent runs for each of 45 different settings (three combinations of population size and number of generations, five problems, and three function sets for each of the five problems: 3 × 5 × 3 = 45) and each of 6 different mutation operators.

References

L. Altenberg, Fitness Distance Correlation Analysis: An Instructive Counterexample. in Proceedings of the Seventh International Conference on Genetic Algorithms, ed. by T. Bäck (Morgan Kaufmann, San Francisco, 1997), pp. 57–64
H. Beyer, H. Schwefel, Evolution strategies—a comprehensive introduction. Nat. Comput. 1(1), 3–52 (2002)
M. Brameier, W. Banzhaf, Linear Genetic Programming. (Springer, New York, 2006)
R. Cilibrasi, P.M.B. Vitanyi, Clustering by compression. IEEE Trans. Inf. Theory 51(4), 1523–1545 (2005)
M. Clergue, P. Collard, GA-Hard Functions Built by Combination of Trap Functions. In: D.B. Fogel, M.A. El-Sharkawi, X. Yao, G. Greenwood, H. Iba, P. Marrow, M. Schackleton (eds) CEC 2002: Proceedings of the 2002 Congress on Evolutionary Computation, (IEEE Press, New York, 2002) pp. 249–254.
M. Clergue, P. Collard, M. Tomassini, L. Vanneschi, in Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2002, ed. by W.B. Langdon, E. Cantú-Paz, K.E. Mathias, R. Roy, D. Davis, R. Poli, K. Balakrishnan, V. Honavar, G. Rudolph, J. Wegener, L. Bull, M.A. Potter, A.C. Schultz, J.F. Miller, E.K. Burke, N. Jonoska. Fitness Distance Correlation and Problem Difficulty for Genetic Programming (Morgan Kaufmann Publishers, New York, 2002), pp. 724–732
I. De Falco, A. Iazzetta, E. Tarantino, A. Della Cioppa, G. Trautteur, in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2000). A Kolmogorov Complexity Based Genetic Programming Tool for String Compression (2000)
P. D’haeseleer, J. Bluming, Effects of Locality in Individual and Population Evolution. In: K.E. Kinnear (eds) Advances in Genetic Programming, (MIT Press, Cambridge, 1994) pp. 177–198.
A. Ekárt, S.Z. Németh, in EuroGP, number 1802 in Lecture Notes in Computer Science. A metric for genetic programs and fitness sharing (Springer, 2000), pp. 259–270
D.B. Fogel, A. Ghozeil, Using Fitness Distributions to Design More Efficient Evolutionary Computations (1996)
E. Galván-López, An Analysis of the Effects of Neutrality on Problem Hardness for Evolutionary Algorithms. PhD thesis, School of Computer Science and Electronic Engineering, University of Essex, United Kingdom (2009)
E. Galván-López, S. Dignum, R. Poli, The Effects of Constant Neutrality on Performance and Problem Hardness in GP. in EuroGP 2008 - 11th European Conference on Genetic Programming, vol. 4971 of LNCS, ed. by M. O’Neill, L. Vanneschi, S. Gustafson, A.I.E. Alcazar, I.D. Falco, A.D. Cioppa, E. Tarantino (Springer, 26–28 Mar. 2008), pp. 312–324, Napoli, Italy
E. Galván-López, J. McDermott, M. O’Neill, A. Brabazon, in CEC 2010: Proceedings of the 12th Annual Congress on Evolutionary Computation. Defining locality in genetic programming to predict performance, Barcelona, Spain (IEEE Press, July 2010)
E. Galván-López, J. McDermott, M. O’Neill, A. Brabazon, in GECCO 2010: Proceedings of the 12th Annual Conference on Genetic and Evolutionary Computation. Towards an understanding of locality in genetic programming. (ACM Press, Portland, July 2010)
E. Galván-López, M. O’Neill, A. Brabazon, in Artificial Intelligence, 2009. MICAI 2009. Eighth Mexican International Conference on. Towards Understanding the Effects of Locality in GP (2009), pp. 9–14
E. Galván-López, R. Poli, in Parallel Problem Solving from Nature (PPSN IX). 9th International Conference, vol. 4193 of LNCS, ed. by T.P. Runarsson, H.-G. Beyer, E. Burke, J.J. Merelo-Guervós, L.D. Whitley, X. Yao. Some Steps Towards Understanding How Neutrality Affects Evolutionary Search (Springer, 9–13 Sept. 2006), pp. 778–787 (Reykjavik, Iceland)
E. Galván-López, R. Poli, in MICAI, vol. 5845 of Lecture Notes in Computer Science, ed. by A.H. Aguirre, R.M. Borja, C.A.R. Garcia. An Empirical Investigation of How Degree Neutrality Affects GP Search (Springer, 2009), pp. 728–739
E. Galván-López, R. Poli, C.A. Coello Coello, in Genetic Programming 7th European Conference, EuroGP 2004, Proceedings, vol. 3003 of LNCS, ed. by M. Keijzer, U.-M. O’Reilly, S. Lucas, E. Costa, T. Soule. Reusing Code in Genetic Programming (Springer, 5–7 Apr. 2004), pp. 359–368 (Coimbra, Portugal)
E. Galván-López, R. Poli, A. Kattan, M. O’Neill, A. Brabazon, Neutrality in evolutionary algorithms … what do we know? Evol. Syst. (2011)
D.E. Goldberg, Construction of high-order deceptive functions using low-order Walsh coefficients. Ann. Math. Artif. Intell. 5(1), 35–47 (1992)
D.E. Goldberg, K. Deb, J. Horn, in PPSN II: Proceedings of the 2nd International Conference on Parallel Problem Solving from Nature, ed. by R. Männer, B. Manderick. Massive Multimodality, Deception, and Genetic Algorithms (Elsevier, Amsterdam, 1992), pp. 37–48
F.J. Gomez, in Proceedings of the 11th Annual conference on Genetic and evolutionary computation. Sustaining Diversity Using Behavioral Information Distance (ACM, Montréal, 2009), pp. 113–120
J. Gottlieb, B.A. Julstrom, G.R. Raidl, F. Rothlauf, in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2001), ed. by S. Spector, E. Wu, B. Voigt, Gen, Sen, Dorigo, Pezeshk, Garzon, Burke. Prufer numbers: A poor representation of spanning trees for evolutionary search (Morgan Kaufmann, 2001), pp. 343–350
J. Gottlieb, G.R. Raidl, in Proceedings of the Genetic and Evolutionary Computation Conference 2000. The Effects of Locality on the Dynamics of Decoder-Based Evolutionary Search
J. Gottlieb, G.R. Raidl, in AE ’99: Selected Papers from the 4th European Conference on Artificial Evolution. Characterizing Locality in Decoder-Based EAs for the Multidimensional Knapsack Problem (Springer, London, 2000), pp. 38–52
S. Gustafson, L. Vanneschi, Crossover-based tree distance in genetic programming. IEEE Trans. Evol. Comput. 12(4), 506–524 (2008)
J.H. Holland, Adaptation in Natural and Artificial Systems. (University of Michigan Press, Ann Arbor, 1975)
C. Igel, K. Chellapilla, Investigating the Influence of Depth and Degree of Genotypic Change on Fitness in Genetic Programming (1999)
T. Jiang, L. Wang, K. Zhang, Alignment of trees—an alternative to tree edit. Theor. Comput. Sci. 143(1), 137–148 (1995)
T. Jones, Evolutionary Algorithms, Fitness Landscapes and Search. PhD thesis, University of New Mexico, Albuquerque (1995)
T. Jones, S. Forrest, in Proceedings of the 6th International Conference on Genetic Algorithms, ed. by L.J. Eshelman. Fitness Distance Correlation as a Measure of Problem Difficulty for Genetic Algorithms (Morgan Kaufmann Publishers, San Francisco, 1995), pp. 184–192
K.E. Kinnear, Jr., in Proceedings of the 1994 IEEE World Conference on Computational Intelligence, vol. 1. Fitness Landscapes and Difficulty in Genetic Programming (IEEE Press, Orlando, 27–29 June 1994) pp. 142–147
J.R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection. (The MIT Press, Cambridge, 1992)
W. Langdon, R. Poli, Why Ants are Hard. In: J.R. Koza (eds) Proceedings of the Third Annual Conference on Genetic Programming, (Morgan Kaufmann, Madison, 1998) pp. 193–201.
W.B. Langdon, in 1998 IEEE International Conference on Evolutionary Computation. The Evolution of Size in Variable Length Representations (IEEE Press, 1998), pp. 633–638
W.B. Langdon, R. Poli, Foundations of Genetic Programming. (Springer, Berlin, 2002)
P.K. Lehre, P.C. Haddow, Phenotypic complexity and local variations in neutral degree. BioSystems 87(2-3), 233–242 (2006)
M. Li, P. Vitanyi, An Introduction to Kolmogorov Complexity and its Applications. (Springer, Berlin, 1997)
B. Manderick, M.K. de Weger, P. Spiessens, The Genetic Algorithm and the Structure of the Fitness Landscape. In: R.K. Belew, L.B. Booker (eds) ICGA, (Morgan Kaufmann, Los Altos, 1991) pp. 143–150.
J.F. Miller, P. Thomson, in EuroGP. Cartesian Genetic Programming (Springer, 2000), pp. 121–132
B. Naudts, L. Kallel, A comparison of predictive measures of problem difficulty in evolutionary algorithms. IEEE Trans. Evol. Comput. 4(1), 1–15 (2000)
P. Nordin, Evolutionary Program Induction of Binary Machine Code and its Applications. PhD thesis, der Universität Dortmund am Fachbereich Informatik (1997)
U.-M. O’Reilly, in IEEE International Conference on Systems, Man, and Cybernetics: Computational Cybernetics and Simulation, vol. 5. Using a Distance Metric on Genetic Programs to Understand Genetic Operators (1997)
R. Poli, E. Galván-López, in Foundations of Genetic Algorithms IX, Lecture Notes in Computer Science, ed. by C.R. Stephens, M. Toussaint, D. Whitley, P. Stadler. On The Effects of Bit-Wise Neutrality on Fitness Distance Correlation, Phenotypic Mutation Rates and Problem Hardness (Springer, Mexico city, 8–11 Jan. 2007), pp. 138–164
R. Poli, E. Galván-López, The Effects of Constant and Bit-Wise Neutrality on Hardness, Fitness Distance Correlation and Phenotypic Mutation Rates. IEEE Trans. Evol. Comput. (2011)
R. Poli, W.B. Langdon, N.F. McPhee, A field guide to genetic programming. Published via http://lulu.com and freely available at http://www.gp-field-guide.org.uk, 2008. (With contributions by J. R. Koza)
R. Poli, L. Vanneschi, in Proceedings of the 9th annual conference on Genetic and evolutionary computation, GECCO ’07. Fitness-Proportional Negative Slope Coefficient as a Hardness Measure for Genetic Algorithms (ACM, New York, 2007), pp. 1335–1342
B. Punch, D. Zongker, E. Godman, in Advances in Genetic Programming 2, ed. by P. Angeline, K. Kinnear. The Royal Tree Problem, A Benchmark for Single and Multi-population Genetic Programming (The MIT Press, Cambridge, 1996), pp. 299–316
R.J. Quick, V.J. Rayward-Smith, G.D. Smith, in Proceedings of the 5th International Conference on Parallel Problem Solving from Nature. Fitness Distance Correlation and Ridge Functions (Springer, London, 1998), pp. 77–86
I. Rechenberg, Evolutionsstrategie 94, volume 1 of Werkstatt Bionik und Evolutionstechnik. (Frommann-Holzboog, Stuttgart, 1994)
S. Ronald, Robust Encodings in Genetic Algorithms. In: Z. Michalewicz, K. Deb, M. Schmidt, T. Stidsen (eds) Evolutionary Algorithms in Engineering Applications, (Springer, Berlin, 1997) pp. 29–44.
F. Rothlauf, Representations for Genetic and Evolutionary Algorithms, 2nd edn. (Physica, Berlin, 2006)
F. Rothlauf, D. Goldberg, Redundant representations in evolutionary algorithms. Evol. Comput. 11(4), 381–415 (2003)
F. Rothlauf, D.E. Goldberg, Pruefer numbers and genetic algorithms: A lesson how the low locality of an encoding can harm the performance of GAs. Technical Report 3/2000, Bayreuth (2000)
F. Rothlauf, D.E. Goldberg, Tree network design with genetic algorithms: an investigation in the locality of the Pruefer number encoding. Technical Report 6/1999, Bayreuth (1999)
F. Rothlauf, M. Oetzel, in Proceedings of the 9th European Conference on Genetic Programming, vol. 3905 of Lecture Notes in Computer Science, ed. by P. Collet, M. Tomassini, M. Ebner, S. Gustafson, A. Ekárt. On the Locality of Grammatical Evolution (Springer, Budapest, 10–12 Apr. 2006), pp. 320–330
D. Shasha, K. Zhang, in SPAA ’89: Proceedings of the First Annual ACM Symposium on Parallel Algorithms and Architectures. Fast Parallel Algorithms for the Unit Cost Editing Distance Between Trees (ACM, New York, 1989), pp. 117–126
P.F. Stadler, C.R. Stephens, Landscapes and effective fitness. Comments Theor. Biol. 8, 389–431 (2002)
M. Tacker, P.F. Stadler, E.G. Bornberg-Bauer, I.L. Hofacker, P. Schuster, Algorithm independent properties of RNA secondary structure predictions. Eur. Biophys. J. 25(2), 115–130 (1996)
M. Tomassini, L. Vanneschi, P. Collard, M. Clergue, A study of fitness distance correlation as a difficulty measure in genetic programming. Evol. Comput. 13(2), 213–239 (2005)
M. Toussaint, C. Igel, in Proceedings of the IEEE Congress on Evolutionary Computation (CEC 2002). Neutrality: A Necessity for Self-Adaptation (2002), pp. 1354–1359
L. Vanneschi, Theory and Practice for Efficient Genetic Programming. PhD thesis, Faculty of Science, University of Lausanne, Switzerland (2004)
L. Vanneschi, in Genetic Programming Theory and Practice V, chap. 7, ed. by R. et al. Investigating Problem Hardness of Real Life Applications (Springer, US, 2007), pp. 107–124
L. Vanneschi, M. Clergue, P. Collard, M. Tomassini, S. Verel, in EuroGP, LNCS. Fitness Clouds and Problem Hardness in Genetic Programming (Springer, Berlin, 2004), pp. 690–701
L. Vanneschi, M. Tomassini, P. Collard, M. Clergue, in EuroGP, Lecture notes in computer science. Fitness Distance Correlation in Structural Mutation Genetic Programming (Springer, Berlin, 2003), pp. 455–464
L. Vanneschi, M. Tomassini, P. Collard, S. Verel, Y. Pirola, G. Mauri, in Proceedings of EuroGP 2007, vol. 4445 of LNCS. A comprehensive View of Fitness Landscapes with Neutrality and Fitness Clouds (Springer, Berlin, 2007), pp. 241–250
L. Vanneschi, A. Valsecchi, R. Poli, in GECCO ’09: Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation. Limitations of the Fitness-Proportional Negative Slope Coefficient as a Difficulty Measure (ACM, New York, 2009), pp. 1877–1878
E. Weinberger, Correlated and uncorrelated fitness landscapes and how to tell the difference. Biol. Cybern. 63(5), 325–336 (1990)
S. Wright, in Proceedings of the Sixth International Congress on Genetics, vol. 1, ed. by D.F. Jones. The Roles of Mutation, Inbreeding, Crossbreeding and Selection in Evolution (1932), pp. 356–366

Acknowledgments

This research is based upon works supported by Science Foundation Ireland under Grant No. 08/IN.1/I1868 and by the Irish Research Council for Science, Engineering and Technology under the Empower scheme. The authors would like to thank the anonymous reviewers for their valuable comments. Leonardo Vanneschi is particularly thanked for his helpful suggestions and for his encouragement to continue developing this research.

Author information

Authors and Affiliations

Natural Computing Research and Applications Group, University College Dublin, Dublin, Ireland
Edgar Galván-López, James McDermott, Michael O’Neill & Anthony Brabazon

Corresponding author

Correspondence to Edgar Galván-López.

Appendix

Tables 9, 11, 13, 15 and 17 show the results on locality and Tables 10, 12, 14, 16 and 18 show the performance (measured in terms of average of the best fitness values over all runs) for the Even-3, Even-4, Artificial Ant and two Symbolic Regression problems (F₁ and F₂), respectively.

Table 9 Locality on the Even-3-Parity problem using three function sets (\(F_{E3}=\{AND, OR, NOT\}\), \(F_{E4}=\{AND, OR, NAND, NOR\}\) and \(F_{E3^{\ast}}=\{AND, OR, NOT2\}\)), six mutations, and three locality definitions. Lower is better.

Table 10 Performance (measured in terms of average of the best fitness values over all runs) of a mutation-based GP on the Even-3-Parity problem

Table 11 Locality on the Even-4-Parity problem using three function sets (\(F_{E3}=\{AND, OR, NOT\}\), \(F_{E4}=\{AND, OR, NAND, NOR\}\) and \(F_{E3^{\ast}}=\{AND, OR, NOT2\}\)), six mutations, and three locality definitions. Lower is better.

Table 12 Performance (measured in terms of average of the best fitness values over all runs) of a mutation-based GP on the Even-4-Parity problem

Table 13 Locality on the Artificial Ant problem using three function sets (\(F_{A3}=\{IF, PROG2, PROG3\}\), \(F_{A4}=\{IF, PROG2, PROG3, PROG4\}\) and \(F_{A3^{\ast}}=\{IF3, PROG23, PROG3\}\)), six mutations, and three locality definitions. Lower is better.

Table 14 Performance (measured in terms of average of the best fitness values over all runs) of a mutation-based GP on the Artificial Ant problem

Table 15 Locality on the symbolic regression problem F₁ using three function sets (\(F_{S6}=\{+,-,{\ast},\%,Sin,Cos\}\), \(F_{S4}=\{+,-,{\ast},\%\}\) and \(F_{S6^{\ast}}=\{+,-,{\ast},\%,Sin2,Cos2\}\)), six mutations, and three locality definitions. Lower is better.

Table 16 Performance (measured in terms of average of the best fitness values over all runs) of a mutation-based GP on the symbolic regression problem F₁

Table 17 Locality on the symbolic regression problem F₂ using three function sets (\(F_{S6}=\{+,-,{\ast},\%,Sin,Cos\}\), \(F_{S4}=\{+,-,{\ast},\%\}\) and \(F_{S6^{\ast}}=\{+,-,{\ast},\%,Sin2,Cos2\}\)), six mutations, and three locality definitions. Lower is better.

Table 18 Performance (measured in terms of average of the best fitness values over all runs) of a mutation-based GP on the symbolic regression problem F₂

About this article

Cite this article

Galván-López, E., McDermott, J., O’Neill, M. et al. Defining locality as a problem difficulty measure in genetic programming. Genet Program Evolvable Mach 12, 365–401 (2011). https://doi.org/10.1007/s10710-011-9136-3

Received: 25 August 2010
Revised: 28 February 2011
Published: 02 April 2011
Issue Date: December 2011
DOI: https://doi.org/10.1007/s10710-011-9136-3
