Genetic Programming and Evolvable Machines, Volume 12, Issue 4, pp 365–401

Defining locality as a problem difficulty measure in genetic programming

  • Edgar Galván-López
  • James McDermott
  • Michael O’Neill
  • Anthony Brabazon


A mapping is local if it preserves neighbourhood. In Evolutionary Computation, locality is generally described as the property that neighbouring genotypes correspond to neighbouring phenotypes. A representation has high locality if most genotypic neighbours are mapped to phenotypic neighbours. Locality is seen as a key element of effective evolutionary search: a representation with high locality is believed to perform better in evolutionary search than one with low locality. When locality was introduced, the mapping of interest was the genotype-phenotype mapping in bitstring-based Genetic Algorithms; more recently, locality has also been used to study the same mapping in Grammatical Evolution. To our knowledge, there are few explicit studies of locality in Genetic Programming (GP). The goal of this paper is to shed some light on locality in GP and to use it as an indicator of problem difficulty. Strictly speaking, in GP the genotype and the phenotype are not distinct. We therefore attempt to extend the standard quantitative definition of genotype-phenotype locality to the genotype-fitness mapping by considering three possible definitions. We consider the effects of these definitions on both continuous- and discrete-valued fitness functions. We compare three different GP representations (two induced by different function sets and the third using a slightly different GP encoding) and six different mutation operators. Results indicate that one of the three definitions of locality is better than the others at predicting performance.
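
As a rough illustration of what a quantitative locality measure can look like, the sketch below computes the mean absolute deviation between the phenotypic distance of genotypic neighbours and the minimum phenotypic distance, a commonly cited form of genotype-phenotype locality for bitstring representations. It is not the authors' code and it does not reproduce the three genotype-fitness definitions studied in the paper. The bitstring genotype, the two decoders, and all function names are assumptions made for the example. A value of 0 means every genotypic neighbour is also a phenotypic neighbour (high locality); larger values mean lower locality.

```python
from itertools import product


def hamming_neighbours(bits):
    """Yield every genotype that differs from `bits` in exactly one position."""
    for i in range(len(bits)):
        yield bits[:i] + (1 - bits[i],) + bits[i + 1:]


def binary_decode(bits):
    """Hypothetical genotype-phenotype map: standard binary encoding to an integer."""
    value = 0
    for b in bits:
        value = value * 2 + b
    return value


def unary_decode(bits):
    """Hypothetical genotype-phenotype map: the phenotype is the number of ones."""
    return sum(bits)


def locality(n_bits, decode, d_min=1):
    """Mean |d_p(x, y) - d_min| over all ordered pairs of genotypic neighbours.

    d_p is the absolute difference between the decoded integers and d_min is
    the assumed minimum phenotypic distance between distinct phenotypes.
    """
    total = 0
    pairs = 0
    for x in product((0, 1), repeat=n_bits):
        for y in hamming_neighbours(x):
            total += abs(abs(decode(x) - decode(y)) - d_min)
            pairs += 1
    return total / pairs


if __name__ == "__main__":
    print("binary encoding:", locality(3, binary_decode))
    print("unary encoding: ", locality(3, unary_decode))
```

Under these assumptions, the unary decoder scores 0 because a one-bit flip always changes the phenotype by exactly one, whereas the binary decoder scores higher because flipping a high-order bit moves the phenotype far away.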


Keywords: Locality; Genotype-phenotype mapping; Genotype-fitness mapping; Problem hardness; Genetic programming



This research is based upon works supported by Science Foundation Ireland under Grant No. 08/IN.1/I1868 and by the Irish Research Council for Science, Engineering and Technology under the Empower scheme. The authors would like to thank the anonymous reviewers for their valuable comments. Leonardo Vanneschi is particularly thanked for his helpful suggestions and for his encouragement to further continue developing this research.


Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Edgar Galván-López (1)
  • James McDermott (1)
  • Michael O’Neill (1)
  • Anthony Brabazon (1)

  1. Natural Computing Research and Applications Group, University College Dublin, Dublin, Ireland
