How Far Is It from Here to There? A Distance That Is Coherent with GP Operators

  • James McDermott
  • Una-May O’Reilly
  • Leonardo Vanneschi
  • Kalyan Veeramachaneni
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6621)


The distance between pairs of individuals is a useful concept in the study of evolutionary algorithms. It is particularly useful to define a distance which is coherent with, i.e. related to, the action of a particular operator. We present the first formal, general definition of this operator-distance coherence. We also propose a new distance function, based on the multi-step transition probability (MSTP), that is coherent with any GP operator for which the one-step transition probability (1STP) between individuals can be defined. We give an algorithm for 1STP in the case of subtree mutation. Because MSTP is useful in GP investigations, but impractical to compute, we evaluate a variety of means to approximate it. We show that some syntactic distance measures give good approximations, and attempt to combine them to improve the approximation using a GP symbolic regression method. We conclude that 1STP itself is a sufficient indicator of MSTP for subtree mutation.
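The abstract does not spell out how MSTP is computed, so the following is only a minimal sketch of the general idea of deriving a multi-step transition probability from one-step probabilities. It assumes a toy search space of three individuals whose one-step transition probabilities are known exactly, takes the probability of reaching a target in exactly n operator applications (a matrix power), and turns a probability into a distance by taking its negative logarithm. The matrix `P`, the function names, and the negative-log transform are all illustrative assumptions, not the paper's definitions.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def n_step_transition(p, steps):
    """Return p raised to the power `steps`.

    Entry [i][j] is the probability of being at individual j after
    exactly `steps` operator applications, starting from individual i.
    """
    n = len(p)
    # Start from the identity matrix (zero steps: stay where you are).
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        result = mat_mul(result, p)
    return result


def neg_log_distance(prob):
    """Map a transition probability to a distance (assumed transform)."""
    return math.inf if prob == 0.0 else -math.log(prob)


# Hypothetical one-step transition probabilities between three individuals.
# Each row sums to 1; individual 0 cannot reach individual 2 in one step.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

P2 = n_step_transition(P, 2)
one_step = neg_log_distance(P[0][2])   # unreachable in one step
two_step = neg_log_distance(P2[0][2])  # reachable via individual 1
```

In this toy case the one-step distance from individual 0 to individual 2 is infinite, while the two-step distance is finite because a path through individual 1 exists. For real GP search spaces the state space is far too large to enumerate this way, which is consistent with the abstract's remark that MSTP is impractical to compute.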


Keywords (added by machine, not by the authors): Distance Function; Genetic Program; Pareto Front; Operator Application; Linear Genetic Programming





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • James McDermott (1)
  • Una-May O’Reilly (1)
  • Leonardo Vanneschi (2)
  • Kalyan Veeramachaneni (1)
  1. EvoDesignOpt, CSAIL, MIT, USA
  2. D.I.S.Co., University of Milano-Bicocca, Milan, Italy
