Abstract
A mapping is local if it preserves neighbourhood. In Evolutionary Computation, locality is generally described as the property that neighbouring genotypes correspond to neighbouring phenotypes. A representation has high locality if most genotypic neighbours are mapped to phenotypic neighbours. Locality is seen as a key element in performing effective evolutionary search: it is believed that a representation with high locality will perform better in evolutionary search than one with low locality. When locality was introduced, the mapping of interest was the genotype-phenotype mapping in bitstring-based Genetic Algorithms; more recently, locality has also been used to study the same mapping in Grammatical Evolution. To our knowledge, there are few explicit studies of locality in Genetic Programming (GP). The goal of this paper is to shed some light on locality in GP and to use it as an indicator of problem difficulty. Strictly speaking, in GP the genotype and the phenotype are not distinct. We therefore attempt to extend the standard quantitative definition of genotype-phenotype locality to the genotype-fitness mapping by considering three possible definitions. We consider the effects of these definitions on both continuous- and discrete-valued fitness functions. We compare three different GP representations (two of them induced by using different function sets and the other using a slightly different GP encoding) and six different mutation operators. Results indicate that one definition of locality is better than the others at predicting performance.
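As a rough illustration of the idea that a representation has high locality when most genotypic neighbours map to phenotypic neighbours, the following sketch measures that fraction for two bitstring encodings of integers. It is not one of the three definitions studied in the paper. The choice of encodings (standard binary and reflected Gray code), the neighbourhood definitions (Hamming distance 1 for genotypes, integer difference 1 for phenotypes) and all function names are assumptions made only for this example.

```python
from itertools import product

def binary_decode(bits):
    """Standard binary decoding, most significant bit first."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value

def gray_decode(bits):
    """Reflected Gray code decoding, most significant bit first."""
    value = 0
    acc = 0
    for b in bits:
        acc ^= b  # the running XOR recovers each standard binary bit
        value = (value << 1) | acc
    return value

def neighbour_fraction(decode, n_bits):
    """Fraction of genotypic neighbours (Hamming distance 1) whose
    phenotypes are also neighbours (integer difference of exactly 1).

    Assumption: these neighbourhood definitions are illustrative only.
    """
    local = 0
    total = 0
    for genotype in product((0, 1), repeat=n_bits):
        for i in range(n_bits):
            neighbour = list(genotype)
            neighbour[i] ^= 1  # flip one bit to obtain a genotypic neighbour
            total += 1
            if abs(decode(genotype) - decode(neighbour)) == 1:
                local += 1
    return local / total

if __name__ == "__main__":
    for name, decode in (("binary", binary_decode), ("gray", gray_decode)):
        print(name, neighbour_fraction(decode, 3))
```

With 3 bits, standard binary maps 1/3 of genotypic neighbours to phenotypic neighbours and the Gray code maps 7/12, so under this simple measure the Gray encoding would be regarded as having the higher locality.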
Notes
The term locality has also been used in an unrelated context, to refer to the quasi-geographical distribution of an EC population [8].
Notice that this type of encoding is distinct only when the arities defined in the function set are of different values. So, if arities are all the same, Uniform GP reduces to standard GP.
Notice that Size-fair subtree mutation has two variants.
Notice that when using the permutation mutation operator with F_E3 and F_E4 on the Even-n-Parity problem, the fd is always 0 because all of the operators in these function sets are symmetric.
100 independent runs, 45 different settings (i.e., three different combinations of population sizes and number of generations, five different problems and three different function sets for each of the five problems—3 × 5 × 3), and 6 different mutation operators.
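The note on permutation mutation and symmetric operators can be illustrated with a small sketch. It takes fd to mean the absolute difference in fitness between a parent and its mutated offspring, which is an assumption here. The operator table, the tuple-based tree encoding, the restriction of the permutation to the root node, and all names are hypothetical; the contents of the paper's function sets are not reproduced.

```python
from itertools import product

# Hypothetical operator table: four symmetric operators plus one
# non-symmetric operator (IMPLIES) for contrast. These are illustrative
# assumptions, not the paper's function sets.
OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR": lambda a, b: 1 - (a | b),
    "IMPLIES": lambda a, b: (1 - a) | b,
}

def evaluate(tree, env):
    """Evaluate a tree given as a variable name or an (op, left, right) tuple."""
    if isinstance(tree, str):
        return env[tree]
    op, left, right = tree
    return OPS[op](evaluate(left, env), evaluate(right, env))

def swap_root_arguments(tree):
    """Simplified permutation mutation: swap the two arguments of the root."""
    op, left, right = tree
    return (op, right, left)

def parity_errors(tree, variables):
    """Number of input cases on which the tree does not output even parity."""
    errors = 0
    for values in product((0, 1), repeat=len(variables)):
        env = dict(zip(variables, values))
        target = 1 - (sum(values) % 2)  # 1 when the number of ones is even
        if evaluate(tree, env) != target:
            errors += 1
    return errors

def fd_after_swap(tree, variables):
    """Absolute fitness difference between a tree and its permuted offspring."""
    mutated = swap_root_arguments(tree)
    return abs(parity_errors(tree, variables) - parity_errors(mutated, variables))

VARIABLES = ("x0", "x1")
symmetric_tree = ("OR", ("AND", "x0", "x1"), ("NOR", "x0", "x1"))
asymmetric_tree = ("IMPLIES", ("AND", "x0", "x1"), "x0")

if __name__ == "__main__":
    print(fd_after_swap(symmetric_tree, VARIABLES))
    print(fd_after_swap(asymmetric_tree, VARIABLES))
```

Swapping the arguments of a symmetric operator leaves the computed function unchanged, so the first tree yields a difference of 0. The second tree uses the non-symmetric IMPLIES at its root and yields a non-zero difference.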
References
L. Altenberg, Fitness Distance Correlation Analysis: An Instructive Counterexample. in Proceedings of the Seventh International Conference on Genetic Algorithms, ed. by T. Bäck (Morgan Kaufmann, 1997), pp. 57–64, San Francisco, CA, USA
H. Beyer, H. Schwefel, Evolution strategies—a comprehensive introduction. Nat. Comput. 1(1), 3–52 (2002)
M. Brameier, W. Banzhaf, Linear Genetic Programming. (Springer, New York, 2006)
R. Cilibrasi, P.M.B. Vitanyi, Clustering by compression. IEEE Trans. Inf. Theory 51(4), 1523–1545 (2005)
M. Clergue, P. Collard, GA-Hard Functions Built by Combination of Trap Functions. In: D.B. Fogel, M.A. El-Sharkawi, X. Yao, G. Greenwood, H. Iba, P. Marrow, M. Schackleton (eds) CEC 2002: Proceedings of the 2002 Congress on Evolutionary Computation, (IEEE Press, New York, 2002) pp. 249–254.
M. Clergue, P. Collard, M. Tomassini, L. Vanneschi, in Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2002, eds. by W.B. Langdon, E. Cantú-Paz, K.E. Mathias, R. Roy, D. Davis, R. Poli, K. Balakrishnan, V. Honavar, G. Rudolph, J. Wegener, L. Bull, M.A. Potter, A.C. Schultz, J.F. Miller, E.K. Burke, N. Jonoska. Fitness Distance Correlation and Problem Difficulty for Genetic Programming (Morgan Kaufmann Publishers, New York, 2002), pp. 724–732
I. De Falco, A. Iazzetta, E. Tarantino, A. Della Cioppa, G. Trautteur, in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2000). A Kolmogorov Complexity Based Genetic Programming Tool for String Compression (2000)
P. D’haeseleer, J. Bluming, Effects of Locality in Individual and Population Evolution. In: K.E. Kinnear (eds) Advances in Genetic Programming, (MIT Press, Cambridge, 1994) pp. 177–198.
A. Ekárt, S.Z. Németh, in EuroGP, number 1802 in Lecture Notes in Computer Science. A metric for genetic programs and fitness sharing (Springer, 2000), pp. 259–270
D.B. Fogel, A. Ghozeil, Using Fitness Distributions to Design More Efficient Evolutionary Computations (1996)
E. Galván-López, An Analysis of the Effects of Neutrality on Problem Hardness for Evolutionary Algorithms. PhD thesis, School of Computer Science and Electronic Engineering, University of Essex, United Kingdom (2009)
E. Galván-López, S. Dignum, R. Poli, The Effects of Constant Neutrality on Performance and Problem Hardness in GP. in EuroGP 2008 - 11th European Conference on Genetic Programming, vol. 4971 of LNCS, ed. by M. O’Neill, L. Vanneschi, S. Gustafson, A.I.E. Alcazar, I.D. Falco, A.D. Cioppa, E. Tarantino (Springer, 26–28 Mar. 2008), pp. 312–324, Napoli, Italy
E. Galván-López, J. McDermott, M. O’Neill, A. Brabazon, in CEC 2010: Proceedings of the 12th Annual Congress on Evolutionary Computation. Defining locality in genetic programming to predict performance, Barcelona, Spain (IEEE Press, July 2010)
E. Galván-López, J. McDermott, M. O’Neill, A. Brabazon, in GECCO 2010: Proceedings of the 12th Annual Conference on Genetic and Evolutionary Computation. Towards an understanding of locality in genetic programming. (ACM Press, Portland, July 2010)
E. Galván-López, M. O’Neill, A. Brabazon, in Artificial Intelligence, 2009. MICAI 2009. Eighth Mexican International Conference on. Towards Understanding the Effects of Locality in GP (2009), pp. 9–14
E. Galván-López, R. Poli, in Parallel Problem Solving from Nature (PPSN IX). 9th International Conference, vol. 4193 of LNCS, ed. by T.P. Runarsson, H.-G. Beyer, E. Burke, J.J. Merelo-Guervós, L.D. Whitley, X. Yao. Some Steps Towards Understanding How Neutrality Affects Evolutionary Search (Springer, 9–13 Sept. 2006), pp. 778–787 (Reykjavik, Iceland)
E. Galván-López, R. Poli, in MICAI, vol. 5845 of Lecture Notes in Computer Science, ed. by A.H. Aguirre, R.M. Borja, C.A.R. Garcia. An Empirical Investigation of How Degree Neutrality Affects GP Search (Springer, 2009), pp. 728–739
E. Galván-López, R. Poli, C.A. Coello Coello, in Genetic Programming 7th European Conference, EuroGP 2004, Proceedings, vol. 3003 of LNCS, ed. by M. Keijzer, U.-M. O’Reilly, S. Lucas, E. Costa, T. Soule. Reusing Code in Genetic Programming (Springer, 5–7 Apr. 2004), pp. 359–368 (Coimbra, Portugal)
E. Galván-López, R. Poli, A. Kattan, M. O’Neill, A. Brabazon, Neutrality in evolutionary algorithms … what do we know? Evol. Syst. (2011)
D.E. Goldberg, Construction of high-order deceptive functions using low-order Walsh coefficients. Ann. Math. Artif. Intell. 5(1), 35–47 (1992)
D.E. Goldberg, K. Deb, J. Horn, in PPSN II: Proceedings of the 2nd International Conference on Parallel Problem Solving from Nature, ed. by R. Männer, B. Manderick. Massive Multimodality, Deception, and Genetic Algorithms (Elsevier, Amsterdam, 1992), pp. 37–48
F.J. Gomez, in Proceedings of the 11th Annual conference on Genetic and evolutionary computation. Sustaining Diversity Using Behavioral Information Distance (ACM, Montréal, 2009), pp. 113–120
J. Gottlieb, B.A. Julstrom, G.R. Raidl, F. Rothlauf, in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2001), ed. by S. Spector, E. Wu, B. Voigt, Gen, Sen, Dorigo, Pezeshk, Garzon, Burke. Prufer numbers: A poor representation of spanning trees for evolutionary search (Morgan Kaufmann, 2001), pp. 343–350
J. Gottlieb, G.R. Raidl, in Proceedings of the Genetic and Evolutionary Computation Conference 2000. The Effects of Locality on the Dynamics of Decoder-Based Evolutionary Search
J. Gottlieb, G.R. Raidl, in AE ’99: Selected Papers from the 4th European Conference on Artificial Evolution. Characterizing Locality in Decoder-Based EAs for the Multidimensional Knapsack Problem (Springer, London, 2000), pp. 38–52
S. Gustafson, L. Vanneschi, Crossover-based tree distance in genetic programming. IEEE Trans. Evol. Comput. 12(4), 506–524 (2008)
J.H. Holland, Adaptation in Natural and Artificial Systems. (University of Michigan Press, Ann Arbor, 1975)
C. Igel, K. Chellapilla, Investigating the Influence of Depth and Degree of Genotypic Change on Fitness in Genetic Programming (1999)
T. Jiang, L. Wang, K. Zhang, Alignment of trees—an alternative to tree edit. Theor. Comput. Sci. 143(1), 137–148 (1995)
T. Jones, Evolutionary Algorithms, Fitness Landscapes and Search. PhD thesis, University of New Mexico, Albuquerque (1995)
T. Jones, S. Forrest, in Proceedings of the 6th International Conference on Genetic Algorithms, ed. by L.J. Eshelman. Fitness Distance Correlation as a Measure of Problem Difficulty for Genetic Algorithms (Morgan Kaufmann Publishers, San Francisco, 1995), pp. 184–192
K.E. Kinnear, Jr., in Proceedings of the 1994 IEEE World Conference on Computational Intelligence, vol. 1. Fitness Landscapes and Difficulty in Genetic Programming (IEEE Press, Orlando, 27–29 June 1994) pp. 142–147
J.R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection. (The MIT Press, Cambridge, 1992)
W. Langdon, R. Poli, Why Ants are Hard. In: J.R. Koza (eds) Proceedings of the Third Annual Conference on Genetic Programming, (Morgan Kaufmann, Madison, 1998) pp. 193–201.
W.B. Langdon, in 1998 IEEE International Conference on Evolutionary Computation. The Evolution of Size in Variable Length Representations (IEEE Press, 1998), pp. 633–638
W.B. Langdon, R. Poli, Foundations of Genetic Programming. (Springer, Berlin, 2002)
P.K. Lehre, P.C. Haddow, Phenotypic complexity and local variations in neutral degree. BioSystems 87(2-3), 233–242 (2006)
M. Li, P. Vitanyi, An Introduction to Kolmogorov Complexity and its Applications. (Springer, Berlin, 1997)
B. Manderick, M.K. de Weger, P. Spiessens, The Genetic Algorithm and the Structure of the Fitness Landscape. In: R.K. Belew, L.B. Booker (eds) ICGA, (Morgan Kaufmann, Los Altos, 1991) pp. 143–150.
J.F. Miller, P. Thomson, in EuroGP. Cartesian Genetic Programming (Springer, 2000), pp. 121–132
B. Naudts, L. Kallel, A comparison of predictive measures of problem difficulty in evolutionary algorithms. IEEE Trans. Evol. Comput. 4(1), 1–15 (2000)
P. Nordin, Evolutionary Program Induction of Binary Machine Code and its Applications. PhD thesis, Universität Dortmund, Fachbereich Informatik (1997)
U.-M. O’Reilly, in IEEE International Conference on Systems, Man, and Cybernetics: Computational Cybernetics and Simulation, vol. 5. Using a Distance Metric on Genetic Programs to Understand Genetic Operators (1997)
R. Poli, E. Galván-López, in Foundations of Genetic Algorithms IX, Lecture Notes in Computer Science, ed. by C.R. Stephens, M. Toussaint, D. Whitley, P. Stadler. On The Effects of Bit-Wise Neutrality on Fitness Distance Correlation, Phenotypic Mutation Rates and Problem Hardness (Springer, Mexico city, 8–11 Jan. 2007), pp. 138–164
R. Poli, E. Galván-López, The Effects of Constant and Bit-Wise Neutrality on Hardness, Fitness Distance Correlation and Phenotypic Mutation Rates. IEEE Trans. Evol. Comput. (2011)
R. Poli, W.B. Langdon, N.F. McPhee, A field guide to genetic programming. Published via http://lulu.com and freely available at http://www.gp-field-guide.org.uk, 2008. (With contributions by J. R. Koza)
R. Poli, L. Vanneschi, in Proceedings of the 9th annual conference on Genetic and evolutionary computation, GECCO ’07. Fitness-Proportional Negative Slope Coefficient as a Hardness Measure for Genetic Algorithms (ACM, New York, 2007), pp. 1335–1342
B. Punch, D. Zongker, E. Goodman, in Advances in Genetic Programming 2, ed. by P. Angeline, K. Kinnear. The Royal Tree Problem, A Benchmark for Single and Multi-population Genetic Programming (The MIT Press, Cambridge, 1996), pp. 299–316
R.J. Quick, V.J. Rayward-Smith, G.D. Smith, in Proceedings of the 5th International Conference on Parallel Problem Solving from Nature. Fitness Distance Correlation and Ridge Functions (Springer, London, 1998), pp. 77–86
I. Rechenberg, Evolutionsstrategie 94, volume 1 of Werkstatt Bionik und Evolutionstechnik. (Frommann-Holzboog, Stuttgart, 1994)
S. Ronald, Robust Encodings in Genetic Algorithms. In: Z. Michalewicz, K. Deb, M. Schmidt, T. Stidsen (eds) Evolutionary Algorithms in Engineering Applications, (Springer, Berlin, 1997) pp. 29–44.
F. Rothlauf, Representations for Genetic and Evolutionary Algorithms, 2nd edn. (Physica, Berlin, 2006)
F. Rothlauf, D. Goldberg, Redundant representations in evolutionary algorithms. Evol. Comput. 11(4), 381–415 (2003)
F. Rothlauf, D.E. Goldberg, Pruefer numbers and genetic algorithms: A lesson how the low locality of an encoding can harm the performance of GAs. Technical Report 3/2000, Bayreuth (2000)
F. Rothlauf, D.E. Goldberg, Tree network design with genetic algorithms—an investigation in the locality of the pruefernumber encoding. Technical Report 6/1999, Bayreuth (1999)
F. Rothlauf, M. Oetzel, in Proceedings of the 9th European Conference on Genetic Programming, vol. 3905 of Lecture Notes in Computer Science, ed. by P. Collet, M. Tomassini, M. Ebner, S. Gustafson, A. Ekárt. On the Locality of Grammatical Evolution (Springer, Budapest, 10–12 Apr. 2006), pp. 320–330
D. Shasha, K. Zhang, in SPAA ’89: Proceedings of the First Annual ACM Symposium on Parallel Algorithms and Architectures. Fast Parallel Algorithms for the Unit Cost Editing Distance Between Trees (ACM, New York, 1989), pp. 117–126
P.F. Stadler, C.R. Stephens, Landscapes and effective fitness. Comments Theor. Biol. 8, 389–431 (2002)
M. Tacker, P.F. Stadler, E.G. Bornberg-Bauer, I.L. Hofacker, P. Schuster, Algorithm independent properties of RNA secondary structure predictions. Eur. Biophys. J. 25(2), 115–130 (1996)
M. Tomassini, L. Vanneschi, P. Collard, M. Clergue, A study of fitness distance correlation as a difficulty measure in genetic programming. Evol. Comput. 13(2), 213–239 (2005)
M. Toussaint, C. Igel, in Proceedings of the IEEE Congress on Evolutionary Computation (CEC 2002). Neutrality: A Necessity for Self-Adaptation (2002), pp. 1354–1359
L. Vanneschi, Theory and Practice for Efficient Genetic Programming. PhD thesis, Faculty of Science, University of Lausanne, Switzerland (2004)
L. Vanneschi, in Genetic Programming Theory and Practice V, chap. 7, ed. by R. et al. Investigating Problem Hardness of Real Life Applications (Springer, US, 2007), pp. 107–124
L. Vanneschi, M. Clergue, P. Collard, M. Tomassini, S. Verel, in EuroGP, LNCS. Fitness Clouds and Problem Hardness in Genetic Programming (Springer, Berlin, 2004), pp. 690–701
L. Vanneschi, M. Tomassini, P. Collard, M. Clergue, in EuroGP, Lecture notes in computer science. Fitness Distance Correlation in Structural Mutation Genetic Programming (Springer, Berlin, 2003), pp. 455–464
L. Vanneschi, M. Tomassini, P. Collard, S. Verel, Y. Pirola, G. Mauri, in Proceedings of EuroGP 2007, vol. 4445 of LNCS. A comprehensive View of Fitness Landscapes with Neutrality and Fitness Clouds (Springer, Berlin, 2007), pp. 241–250
L. Vanneschi, A. Valsecchi, R. Poli, in GECCO ’09: Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation. Limitations of the Fitness-Proportional Negative Slope Coefficient as a Difficulty Measure (ACM, New York, 2009), pp. 1877–1878
E. Weinberger, Correlated and uncorrelated fitness landscapes and how to tell the difference. Biol. Cybern. 63(5), 325–336 (1990)
S. Wright, in Proceedings of the Sixth International Congress on Genetics, vol. 1, ed. by D.F. Jones. The Roles of Mutation, Inbreeding, Crossbreeding and Selection in Evolution (1932), pp. 356–366
Acknowledgments
This research is based upon works supported by Science Foundation Ireland under Grant No. 08/IN.1/I1868 and by the Irish Research Council for Science, Engineering and Technology under the Empower scheme. The authors would like to thank the anonymous reviewers for their valuable comments. Leonardo Vanneschi is particularly thanked for his helpful suggestions and for his encouragement to continue developing this research.
Cite this article
Galván-López, E., McDermott, J., O’Neill, M. et al. Defining locality as a problem difficulty measure in genetic programming. Genet Program Evolvable Mach 12, 365–401 (2011). https://doi.org/10.1007/s10710-011-9136-3
DOI: https://doi.org/10.1007/s10710-011-9136-3