Genetic Programming Symbolic Classification: A Study

  • Michael F. Korns
Conference paper
Part of the Genetic and Evolutionary Computation book series (GEVO)


While Symbolic Regression (SR) is a well-known offshoot of Genetic Programming, Symbolic Classification (SC) has, by comparison, received only meager attention. Regression is only half of the solution: classification also plays an important role in any well-rounded predictive analysis toolkit. Several recent papers develop SR algorithms that move SR into the ranks of extreme accuracy. A further set of papers develops algorithms designed to bring SC to a level of basic classification accuracy competitive with existing commercially available classification tools. This paper is a simple study of four proposed SC algorithms and five well-known commercially available classification algorithms, to determine where SC now ranks in competitive comparison. The four SC algorithms are: simple genetic programming using argmax, referred to herein as AMAXSC; the M2GP algorithm; the MDC algorithm; and Linear Discriminant Analysis (LDA). The five commercially available classification algorithms are available in the KNIME system and are as follows: Decision Tree Learner (DTL); Gradient Boosted Trees Learner (GBTL); Multiple Layer Perceptron Learner (MLP); Random Forest Learner (RFL); and Tree Ensemble Learner (TEL). A set of ten artificial classification problems is constructed with no noise. The simple formulas for these ten artificial problems are listed herein. The problems vary from linear to nonlinear multimodal and from 25 to 1000 columns. All problems have 5000 training points and a separate 5000 testing points. The scores on the out-of-sample testing data for each of the nine classification algorithms are published herein.
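The argmax approach mentioned in the abstract can be illustrated with a minimal sketch, assuming one evolved discriminant expression per class and a prediction equal to the index of the highest-scoring expression. The three expressions below, along with the names `DISCRIMINANTS`, `argmax_classify`, and `classification_error`, are hypothetical and chosen only for illustration; they are not formulas or code from the paper, and no evolutionary search is shown.

```python
from typing import Callable, List, Sequence

# A discriminant maps one feature vector to a real-valued score.
Discriminant = Callable[[Sequence[float]], float]

# Hypothetical expressions, one per class (illustrative only; in a
# genetic programming run these would be evolved, not written by hand).
DISCRIMINANTS: List[Discriminant] = [
    lambda x: 1.5 * x[0] - x[1],        # score for class 0
    lambda x: x[0] * x[1],              # score for class 1
    lambda x: -x[0] + 0.5 * x[1] ** 2,  # score for class 2
]


def argmax_classify(discriminants: Sequence[Discriminant],
                    x: Sequence[float]) -> int:
    """Return the index of the discriminant with the highest score.

    Ties are resolved in favor of the lowest class index.
    """
    scores = [f(x) for f in discriminants]
    return max(range(len(scores)), key=lambda k: scores[k])


def classification_error(discriminants: Sequence[Discriminant],
                         rows: Sequence[Sequence[float]],
                         labels: Sequence[int]) -> float:
    """Fraction of rows whose argmax prediction differs from the label."""
    if not rows or len(rows) != len(labels):
        raise ValueError("rows and labels must be non-empty and equal in length")
    wrong = sum(
        1 for x, y in zip(rows, labels)
        if argmax_classify(discriminants, x) != y
    )
    return wrong / len(rows)
```

For example, `argmax_classify(DISCRIMINANTS, [3.0, 2.0])` scores the three expressions as 2.5, 6.0, and -1.0 and therefore returns class 1. A fitness measure for an evolutionary search could be built on `classification_error` evaluated over the training points, with the out-of-sample score computed the same way on the separate testing points.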



Our thanks to Thomas May of Lantern Credit for assisting with the KNIME Learner training and scoring on all ten artificial classification problems.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Michael F. Korns (1)
  1. Lantern Credit LLC, Henderson, USA
