Defining locality as a problem difficulty measure in genetic programming

Genetic Programming and Evolvable Machines

Abstract

A mapping is local if it preserves neighbourhood. In Evolutionary Computation, locality is generally described as the property that neighbouring genotypes correspond to neighbouring phenotypes. A representation has high locality if most genotypic neighbours are mapped to phenotypic neighbours. Locality is seen as a key element in performing effective evolutionary search: a representation with high locality is believed to perform better in evolutionary search, and a representation with low locality to perform worse. When locality was introduced, the mapping of interest was the genotype-phenotype mapping in bitstring-based Genetic Algorithms; more recently, locality has also been used to study the same mapping in Grammatical Evolution. To our knowledge, there are few explicit studies of locality in Genetic Programming (GP). The goal of this paper is to shed some light on locality in GP and to use it as an indicator of problem difficulty. Strictly speaking, in GP the genotype and the phenotype are not distinct. We therefore attempt to extend the standard quantitative definition of genotype-phenotype locality to the genotype-fitness mapping by considering three possible definitions. We consider the effects of these definitions for both continuous- and discrete-valued fitness functions. We compare three different GP representations (two induced by using different function sets and the third by using a slightly different GP encoding) and six different mutation operators. Results indicate that one of the three definitions of locality is better than the others at predicting performance.
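To make the idea of a quantitative locality measure concrete, the following is a minimal sketch and not the paper's own method. It assumes a measure in the spirit of the standard genotype-phenotype definition: the mean of |d_p - d_min| over all pairs of genotypic neighbours, where d_p is the phenotypic distance and d_min the smallest possible phenotypic distance. A value of 0 means every genotypic neighbour maps to a phenotypic neighbour, so lower values mean higher locality. The 3-bit genotypes, the two toy encodings, the function names and the choice d_min = 1 are all assumptions made for illustration.

```python
from itertools import product


def one_bit_neighbours(genotype):
    """Yield every genotype at Hamming distance 1 from `genotype`."""
    for i in range(len(genotype)):
        flipped = list(genotype)
        flipped[i] = 1 - flipped[i]
        yield tuple(flipped)


def binary_to_int(genotype):
    """Toy encoding 1: read the bits as a standard binary integer."""
    value = 0
    for bit in genotype:
        value = value * 2 + bit
    return value


def unary_to_int(genotype):
    """Toy encoding 2: the phenotype is the number of 1-bits."""
    return sum(genotype)


def locality(mapping, n_bits, d_min=1):
    """Mean |d_p - d_min| over all ordered pairs of genotypic neighbours.

    Phenotypes are integers and d_p is their absolute difference
    (an assumption of this sketch). Lower means higher locality.
    """
    total = 0
    pairs = 0
    for genotype in product((0, 1), repeat=n_bits):
        for neighbour in one_bit_neighbours(genotype):
            d_p = abs(mapping(genotype) - mapping(neighbour))
            total += abs(d_p - d_min)
            pairs += 1
    return total / pairs


binary_score = locality(binary_to_int, n_bits=3)
unary_score = locality(unary_to_int, n_bits=3)
print(f"binary encoding: {binary_score:.3f}")
print(f"unary encoding:  {unary_score:.3f}")
```

In this toy setting the unary encoding scores 0 because flipping any bit always changes the phenotype by exactly one, whereas under the binary encoding a single flip of a high-order bit produces a large phenotypic jump. Extending such a measure to the genotype-fitness mapping would replace the phenotypic distance with a distance between fitness values, which is the direction the abstract describes.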


Notes

  1. The term locality has also been used in an unrelated context, to refer to the quasi-geographical distribution of an EC population [8].

  2. Notice that this type of encoding is distinct only when the arities defined in the function set are of different values. So, if arities are all the same, Uniform GP reduces to standard GP.

  3. Notice that Size-fair subtree mutation has two variants.

  4. Notice that when using the permutation mutation operator with \(F_{E3}\) and \(F_{E4}\) on the Even-n-Parity problem, the fitness distance (fd) is always 0 because all of the operators in these function sets are symmetric.

  5. 100 independent runs, 45 different settings (i.e., three different combinations of population sizes and number of generations, five different problems and three different function sets for each of the five problems—3 × 5 × 3), and 6 different mutation operators.
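The symmetry argument in note 4 can be checked with a short sketch. It assumes that permutation mutation reorders the arguments of a function node, which is our reading rather than something this excerpt specifies, and it models the binary Boolean functions named in the appendix captions as plain Python lambdas. The helper name `is_symmetric` and the contrasting implication operator are hypothetical additions for illustration.

```python
from itertools import product

# Binary Boolean functions from the Even-n-Parity function sets,
# modelled as plain Python callables (an assumption of this sketch).
FUNCTIONS = {
    "AND": lambda a, b: a and b,
    "OR": lambda a, b: a or b,
    "NAND": lambda a, b: not (a and b),
    "NOR": lambda a, b: not (a or b),
}


def is_symmetric(fn):
    """True if swapping the two arguments never changes the output."""
    return all(
        fn(a, b) == fn(b, a)
        for a, b in product((False, True), repeat=2)
    )


symmetric = {name: is_symmetric(fn) for name, fn in FUNCTIONS.items()}

# For contrast, implication is NOT symmetric, so reordering its
# arguments could change a program's behaviour and hence its fitness.
implies_symmetric = is_symmetric(lambda a, b: (not a) or b)

print(symmetric)
print("IMPLIES symmetric:", implies_symmetric)
```

Because every binary function listed is symmetric, reordering arguments leaves the program's output on every input unchanged, so the fitness before and after the mutation is identical. NOT takes a single argument, so there is nothing to reorder.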

References

  1. L. Altenberg, Fitness Distance Correlation Analysis: An Instructive Counterexample. in Proceedings of the Seventh International Conference on Genetic Algorithms, ed. by T. Bäck (Morgan Kaufmann, San Francisco, 1997), pp. 57–64

  2. H. Beyer, H. Schwefel, Evolution strategies—a comprehensive introduction. Nat. Comput. 1(1), 3–52 (2002)


  3. M. Brameier, W. Banzhaf, Linear Genetic Programming. (Springer, New York, 2006)


  4. R. Cilibrasi, P.M.B. Vitanyi, Clustering by compression. IEEE Trans. Inf. Theory 51(4), 1523–1545 (2005)


  5. M. Clergue, P. Collard, GA-Hard Functions Built by Combination of Trap Functions. In: D.B. Fogel, M.A. El-Sharkawi, X. Yao, G. Greenwood, H. Iba, P. Marrow, M. Schackleton (eds) CEC 2002: Proceedings of the 2002 Congress on Evolutionary Computation, (IEEE Press, New York, 2002) pp. 249–254.


  6. M. Clergue, P. Collard, M. Tomassini , L. Vanneschi, in Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2002, eds. by W.B. Langdon, E. Cantú-Paz, K.E. Mathias, R. Roy, D. Davis, R. Poli, K. Balakrishnan, V. Honavar, G. Rudolph, J. Wegener, L. Bull, M.A. Potter, A.C. Schultz, J.F. Miller, E.K. Burke, N. Jonoska. Fitness Distance Correlation and Problem Difficulty for Genetic Programming (Morgan Kaufmann Publishers, New York, 2002), pp. 724–732

  7. I. De Falco, A. Iazzetta, E. Tarantino, A. Della Cioppa, G. Trautteur, in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2000). A Kolmogorov Complexity Based Genetic Programming Tool for String Compression (2000)

  8. P. D’haeseleer, J. Bluming, Effects of Locality in Individual and Population Evolution. In: K.E. Kinnear (eds) Advances in Genetic Programming, (MIT Press, Cambridge, 1994) pp. 177–198.


  9. A. Ekárt, S.Z. Németh, in EuroGP, number 1802 in Lecture Notes in Computer Science. A metric for genetic programs and fitness sharing (Springer, 2000), pp. 259–270

  10. D.B. Fogel, A. Ghozeil, Using Fitness Distributions to Design More Efficient Evolutionary Computations (1996)

  11. E. Galván-López, An Analysis of the Effects of Neutrality on Problem Hardness for Evolutionary Algorithms. PhD thesis, School of Computer Science and Electronic Engineering, University of Essex, United Kingdom (2009)

  12. E. Galván-López, S. Dignum, R. Poli, The Effects of Constant Neutrality on Performance and Problem Hardness in GP. in EuroGP 2008 - 11th European Conference on Genetic Programming, vol. 4971 of LNCS, ed. by M. O’Neill, L. Vanneschi, S. Gustafson, A.I.E. Alcazar, I.D. Falco, A.D. Cioppa, E. Tarantino (Springer, 26–28 Mar. 2008), pp. 312–324, Napoli, Italy

  13. E. Galván-López, J. McDermott, M. O’Neill, A. Brabazon, in CEC 2010: Proceedings of the 12th Annual Congress on Evolutionary Computation. Defining locality in genetic programming to predict performance, Barcelona, Spain (IEEE Press, July 2010)

  14. E. Galván-López, J. McDermott, M. O’Neill, A. Brabazon, in GECCO 2010: Proceedings of the 12th Annual Conference on Genetic and Evolutionary Computation. Towards an understanding of locality in genetic programming. (ACM Press, Portland, July 2010)

  15. E. Galván-López, M. O’Neill, A. Brabazon, in Artificial Intelligence, 2009. MICAI 2009. Eighth Mexican International Conference on. Towards Understanding the Effects of Locality in GP (2009), pp. 9–14

  16. E. Galván-López, R. Poli, in Parallel Problem Solving from Nature (PPSN IX). 9th International Conference, vol. 4193 of LNCS, ed. by T.P. Runarsson, H.-G. Beyer, E. Burke, J.J. Merelo-Guervós, L.D. Whitley, X. Yao. Some Steps Towards Understanding How Neutrality Affects Evolutionary Search (Springer, 9–13 Sept. 2006), pp. 778–787 (Reykjavik, Iceland)

  17. E. Galván-López, R. Poli, in MICAI, vol. 5845 of Lecture Notes in Computer Science, ed. by A.H. Aguirre, R.M. Borja, C.A.R. Garcia. An Empirical Investigation of How Degree Neutrality Affects GP Search (Springer, 2009), pp. 728–739

  18. E. Galván-López, R. Poli, C.A. Coello Coello, in Genetic Programming 7th European Conference, EuroGP 2004, Proceedings, vol. 3003 of LNCS, ed. by M. Keijzer, U.-M. O’Reilly, S. Lucas, E. Costa, T. Soule. Reusing Code in Genetic Programming (Springer, 5–7 Apr. 2004), pp. 359–368 (Coimbra, Portugal)

  19. E. Galván-López, R. Poli, A. Kattan, M. O’Neill, A. Brabazon, Neutrality in evolutionary algorithms … what do we know? Evol. Syst. (2011)

  20. D.E. Goldberg, Construction of high-order deceptive functions using low-order walsh coefficients. Ann. Math. Artif. Intell. 5(1), 35–47 (1992)


  21. D.E. Goldberg, K. Deb, J. Horn, in PPSN II: Proceedings of the 2nd International Conference on Parallel Problem Solving from Nature, ed. by R. Männer, B. Manderick. Massive Multimodality, Deception, and Genetic Algorithms (Elsevier, Amsterdam, 1992), pp. 37–48

  22. F.J. Gomez, in Proceedings of the 11th Annual conference on Genetic and evolutionary computation. Sustaining Diversity Using Behavioral Information Distance (ACM, Montréal, 2009), pp. 113–120

  23. J. Gottlieb, B.A. Julstrom, G.R. Raidl, F. Rothlauf, in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2001), ed. by S. Spector, E. Wu, B. Voigt, Gen, Sen, Dorigo, Pezeshk, Garzon, Burke. Prufer numbers: A poor representation of spanning trees for evolutionary search (Morgan Kaufmann, 2001), pp. 343–350

  24. J. Gottlieb, G.R. Raidl, in Proceedings of the Genetic and Evolutionary Computation Conference 2000. The Effects of Locality on the Dynamics of Decoder-Based Evolutionary Search

  25. J. Gottlieb, G.R. Raidl, in AE ’99: Selected Papers from the 4th European Conference on Artificial Evolution. Characterizing Locality in Decoder-Based EAs for the Multidimensional Knapsack Problem (Springer, London, 2000), pp. 38–52

  26. S. Gustafson, L. Vanneschi, Crossover-based tree distance in genetic programming. IEEE Trans. Evol. Comput. 12(4), 506–524 (2008)


  27. J.H. Holland, Adaptation in Natural and Artificial Systems. (University of Michigan Press, Ann Arbor, 1975)


  28. C. Igel, K. Chellapilla, Investigating the Influence of Depth and Degree of Genotypic Change on Fitness in Genetic Programming (1999)

  29. T. Jiang, L. Wang, K. Zhang, Alignment of trees—an alternative to tree edit. Theor. Comput. Sci. 143(1), 137–148 (1995)


  30. T. Jones. Evolutionary Algorithms, Fitness Landscapes and Search. PhD thesis, University of New Mexico, Albuquerque (1995)

  31. T. Jones, S. Forrest, in Proceedings of the 6th International Conference on Genetic Algorithms, ed. by L.J. Eshelman. Fitness Distance Correlation as a Measure of Problem Difficulty for Genetic Algorithms (Morgan Kaufmann Publishers, San Francisco, 1995), pp. 184–192

  32. K.E. Kinnear, Jr., in Proceedings of the 1994 IEEE World Conference on Computational Intelligence, vol. 1. Fitness Landscapes and Difficulty in Genetic Programming (IEEE Press, Orlando, 27–29 June 1994) pp. 142–147

  33. J.R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection. (The MIT Press, Cambridge, 1992)


  34. W. Langdon, R. Poli, Why Ants are Hard. In: J.R. Koza (eds) Proceedings of the Third Annual Conference on Genetic Programming, (Morgan Kaufmann, Madison, 1998) pp. 193–201.


  35. W.B. Langdon, in 1998 IEEE International Conference on Evolutionary Computation. The Evolution of Size in Variable Length Representations (IEEE Press, 1998), pp. 633–638

  36. W.B. Langdon, R. Poli, Foundations of Genetic Programming. (Springer, Berlin, 2002)


  37. P.K. Lehre, P.C. Haddow, Phenotypic complexity and local variations in neutral degree. BioSystems 87(2-3), 233–242 (2006)


  38. M. Li, P. Vitanyi, An Introduction to Kolmogorov Complexity and its Applications. (Springer, Berlin, 1997)


  39. B. Manderick, M.K. de Weger, P. Spiessens, The Genetic Algorithm and the Structure of the Fitness Landscape. In: R.K. Belew, L.B. Booker (eds) ICGA, (Morgan Kaufmann, Los Altos, 1991) pp. 143–150.


  40. J.F. Miller, P. Thomson, in EuroGP. Cartesian Genetic Programming (Springer, 2000), pp. 121–132

  41. B. Naudts, L. Kallel, A comparison of predictive measures of problem difficulty in evolutionary algorithms. IEEE Trans. Evol. Comput. 4(1), 1–15 (2000)


  42. P. Nordin, Evolutionary Program Induction of Binary Machine Code and its Applications. PhD thesis, Universität Dortmund, Fachbereich Informatik (1997)

  43. U.-M. O’Reilly, in IEEE International Conference on Systems, Man, and Cybernetics: Computational Cybernetics and Simulation, vol. 5. Using a Distance Metric on Genetic Programs to Understand Genetic Operators (1997)

  44. R. Poli, E. Galván-López, in Foundations of Genetic Algorithms IX, Lecture Notes in Computer Science, ed. by C.R. Stephens, M. Toussaint, D. Whitley, P. Stadler. On The Effects of Bit-Wise Neutrality on Fitness Distance Correlation, Phenotypic Mutation Rates and Problem Hardness (Springer, Mexico city, 8–11 Jan. 2007), pp. 138–164

  45. R. Poli, E. Galván-López, The Effects of Constant and Bit-Wise Neutrality on Hardness, Fitness Distance Correlation and Phenotypic Mutation Rates. IEEE Trans. Evol. Comput. (2011)

  46. R. Poli, W.B. Langdon, N.F. McPhee, A field guide to genetic programming. Published via http://lulu.com and freely available at http://www.gp-field-guide.org.uk, 2008. (With contributions by J. R. Koza)

  47. R. Poli, L. Vanneschi, in Proceedings of the 9th annual conference on Genetic and evolutionary computation, GECCO ’07. Fitness-Proportional Negative Slope Coefficient as a Hardness Measure for Genetic Algorithms (ACM, New York, 2007), pp. 1335–1342

  48. B. Punch, D. Zongker, E. Goodman, in Advances in Genetic Programming 2, ed. by P. Angeline, K. Kinnear. The Royal Tree Problem, A Benchmark for Single and Multi-population Genetic Programming (The MIT Press, Cambridge, 1996), pp. 299–316

  49. R.J. Quick, V.J. Rayward-Smith, G.D. Smith, in Proceedings of the 5th International Conference on Parallel Problem Solving from Nature. Fitness Distance Correlation and Ridge Functions (Springer, London, 1998), pp. 77–86

  50. I. Rechenberg, Evolutionsstrategie 94, volume 1 of Werkstatt Bionik und Evolutionstechnik. (Frommann-Holzboog, Stuttgart, 1994)


  51. S. Ronald, Robust Encodings in Genetic Algorithms. In: Z. Michalewicz, K. Deb, M. Schmidt, T. Stidsen (eds) Evolutionary Algorithms in Engineering Applications, (Springer, Berlin, 1997) pp. 29–44.


  52. F. Rothlauf, Representations for Genetic and Evolutionary Algorithms, 2nd edn. (Physica, Berlin, 2006)


  53. F. Rothlauf, D. Goldberg, Redundant representations in evolutionary algorithms. Evol. Comput. 11(4), 381–415 (2003)


  54. F. Rothlauf, D.E. Goldberg, Pruefer numbers and genetic algorithms: A lesson how the low locality of an encoding can harm the performance of GAs. Technical Report 3/2000, Bayreuth (2000)

  55. F. Rothlauf, D.E. Goldberg, Tree network design with genetic algorithms - an investigation in the locality of the pruefernumber encoding. Technical Report 6/1999, Bayreuth (1999)

  56. F. Rothlauf, M. Oetzel, in Proceedings of the 9th European Conference on Genetic Programming, vol. 3905 of Lecture Notes in Computer Science, ed. by P. Collet, M. Tomassini, M. Ebner, S. Gustafson, A. Ekárt. On the Locality of Grammatical Evolution (Springer, Budapest, 10–12 Apr. 2006), pp. 320–330

  57. D. Shasha, K. Zhang, in SPAA ’89: Proceedings of the First Annual ACM Symposium on Parallel Algorithms and Architectures. Fast Parallel Algorithms for the Unit Cost Editing Distance Between Trees (ACM, New York, 1989), pp. 117–126

  58. P.F. Stadler, C.R. Stephens, Landscapes and effective fitness. Comments Theor. Biol. 8, 389–431 (2002)


  59. M. Tacker, P.F. Stadler, E.G. Bornberg-Bauer, I.L. Hofacker, P. Schuster, Algorithm independent properties of RNA secondary structure predictions. Eur. Biophys. J. 25(2), 115–130 (1996)


  60. M. Tomassini, L. Vanneschi, P. Collard, M. Clergue, A study of fitness distance correlation as a difficulty measure in genetic programming. Evol. Comput. 13(2), 213–239 (2005)


  61. M. Toussaint, C. Igel, in Proceedings of the IEEE Congress on Evolutionary Computation (CEC 2002). Neutrality: A Necessity for Self-Adaptation (2002), pp. 1354–1359

  62. L. Vanneschi, Theory and Practice for Efficient Genetic Programming. PhD thesis, Faculty of Science, University of Lausanne, Switzerland (2004)

  63. L. Vanneschi, in Genetic Programming Theory and Practice V, chap. 7, ed. by R. et al. Investigating Problem Hardness of Real Life Applications (Springer, US, 2007), pp. 107–124

  64. L. Vanneschi, M. Clergue, P. Collard, M. Tomassini, S. Verel, in EuroGP, LNCS. Fitness Clouds and Problem Hardness in Genetic Programming (Springer, Berlin, 2004), pp. 690–701

  65. L. Vanneschi, M. Tomassini, P. Collard, M. Clergue, in EuroGP, Lecture notes in computer science. Fitness Distance Correlation in Structural Mutation Genetic Programming (Springer, Berlin, 2003), pp. 455–464

  66. L. Vanneschi, M. Tomassini, P. Collard, S. Verel, Y. Pirola, G. Mauri, in Proceedings of EuroGP 2007, vol. 4445 of LNCS. A comprehensive View of Fitness Landscapes with Neutrality and Fitness Clouds (Springer, Berlin, 2007), pp. 241–250

  67. L. Vanneschi, A. Valsecchi, R. Poli, in GECCO ’09: Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation. Limitations of the Fitness-Proportional Negative Slope Coefficient as a Difficulty Measure (ACM, New York, 2009), pp. 1877–1878

  68. E. Weinberger, Correlated and uncorrelated fitness landscapes and how to tell the difference. Biol. Cybern. 63(5), 325–336 (1990)


  69. S. Wright, in Proceedings of the Sixth International Congress on Genetics, vol. 1, ed. by D.F. Jones. The Roles of Mutation, Inbreeding, Crossbreeding and Selection in Evolution (1932), pp. 356–366


Acknowledgments

This research is based upon work supported by Science Foundation Ireland under Grant No. 08/IN.1/I1868 and by the Irish Research Council for Science, Engineering and Technology under the Empower scheme. The authors would like to thank the anonymous reviewers for their valuable comments. Leonardo Vanneschi is particularly thanked for his helpful suggestions and for his encouragement to continue developing this research.

Author information

Corresponding author

Correspondence to Edgar Galván-López.

Appendix


Tables 9, 11, 13, 15 and 17 show the results on locality, and Tables 10, 12, 14, 16 and 18 show the performance (measured as the average of the best fitness values over all runs), for the Even-3-Parity, Even-4-Parity, Artificial Ant and two Symbolic Regression problems (\(F_1\) and \(F_2\)), respectively.

Table 9 Locality on the Even-3-Parity problem using three function sets (\(F_{E3}=\{AND, OR, NOT\}\), \(F_{E4}=\{AND, OR, NAND, NOR\}\) and \(F_{E3^{\ast}}=\{AND, OR, NOT2\}\)), six mutations, and three locality definitions. Lower is better.
Table 10 Performance (measured in terms of average of the best fitness values over all runs) of a mutation-based GP on the Even-3-Parity problem
Table 11 Locality on the Even-4-Parity problem using three function sets (\(F_{E3}=\{AND, OR, NOT\}\), \(F_{E4}=\{AND, OR, NAND, NOR\}\) and \(F_{E3^{\ast}}=\{AND, OR, NOT2\}\)), six mutations, and three locality definitions. Lower is better.
Table 12 Performance (measured in terms of average of the best fitness values over all runs) of a mutation-based GP on the Even-4-Parity problem
Table 13 Locality on the Artificial Ant problem using three function sets (\(F_{A3}=\{IF, PROG2, PROG3\}\), \(F_{A4}=\{IF, PROG2, PROG3, PROG4\}\) and \(F_{A3^{\ast}}=\{IF3,PROG23,PROG3\}\)), six mutations, and three locality definitions. Lower is better.
Table 14 Performance (measured in terms of average of the best fitness values over all runs) of a mutation-based GP on the Artificial Ant problem
Table 15 Locality on the symbolic regression problem \(F_1\) using three function sets (\(F_{S6}=\{+,-,{\ast},\%,Sin,Cos\}\), \(F_{S4}=\{+,-,{\ast},\%\}\) and \(F_{S6^{\ast}}=\{+,-,{\ast},\%,Sin2,Cos2\}\)), six mutations, and three locality definitions. Lower is better.
Table 16 Performance (measured in terms of average of the best fitness values over all runs) of a mutation-based GP on the symbolic regression problem \(F_1\)
Table 17 Locality on the symbolic regression problem \(F_2\) using three function sets (\(F_{S6}=\{+,-,{\ast},\%,Sin,Cos\}\), \(F_{S4}=\{+,-,{\ast},\%\}\) and \(F_{S6^{\ast}}=\{+,-,{\ast},\%,Sin2,Cos2\}\)), six mutations, and three locality definitions. Lower is better.
Table 18 Performance (measured in terms of average of the best fitness values over all runs) of a mutation-based GP on the symbolic regression problem \(F_2\)

About this article

Cite this article

Galván-López, E., McDermott, J., O’Neill, M. et al. Defining locality as a problem difficulty measure in genetic programming. Genet Program Evolvable Mach 12, 365–401 (2011). https://doi.org/10.1007/s10710-011-9136-3


  • DOI: https://doi.org/10.1007/s10710-011-9136-3
