Abstract
While Symbolic Regression (SR) is a well-known offshoot of Genetic Programming, Symbolic Classification (SC) has by comparison received only meager attention. Regression, however, is only half of the solution: classification also plays an important role in any well-rounded predictive analysis toolkit. Several recent papers develop SR algorithms that move SR into the ranks of extreme accuracy. An additional set of papers develops algorithms designed to push SC to a level of basic classification accuracy competitive with existing commercially available classification tools. This paper is a simple study of four proposed SC algorithms and five well-known commercially available classification algorithms, to determine where SC now ranks in competitive comparison. The four SC algorithms are: simple genetic programming using argmax, referred to herein as AMAXSC; the M2GP algorithm; the MDC algorithm; and Linear Discriminant Analysis (LDA). The five commercially available classification algorithms are available in the KNIME system and are as follows: Decision Tree Learner (DTL); Gradient Boosted Trees Learner (GBTL); Multiple Layer Perceptron Learner (MLP); Random Forest Learner (RFL); and Tree Ensemble Learner (TEL). A set of ten artificial classification problems is constructed with no noise. The simple formulas for these ten artificial problems are listed herein. The problems vary from linear to nonlinear multimodal and from 25 to 1000 columns. All problems have 5000 training points and a separate 5000 testing points. The scores on the out-of-sample testing data for each of the nine classification algorithms are published herein.
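As a rough illustration of the argmax idea and of a noise-free artificial problem, the following minimal sketch assigns each row to the class whose discriminant function scores highest, then measures the error on a separate test set. The three discriminant formulas, the helper names (`argmax_classify`, `make_noise_free_data`, `classification_error`), and the small data sizes are assumptions made for illustration only. They are not the paper's test problems and not its AMAXSC implementation, which evolves the discriminant functions by genetic programming.

```python
import random

# Hypothetical discriminant functions, one per class.
# These formulas are illustrative assumptions, not taken from the paper.
def d0(x):
    return 1.5 * x[0] - 0.5 * x[1]

def d1(x):
    return x[0] * x[1]

def d2(x):
    return 2.0 - x[0] ** 2

DISCRIMINANTS = [d0, d1, d2]

def argmax_classify(x, discriminants=DISCRIMINANTS):
    """Return the index of the discriminant with the largest value at x."""
    scores = [d(x) for d in discriminants]
    # Ties resolve to the lowest class index, since max keeps the first maximum.
    return max(range(len(scores)), key=scores.__getitem__)

def make_noise_free_data(n_rows, n_cols, seed=0):
    """Build random feature rows and label them with the argmax rule (no noise)."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1.0, 1.0) for _ in range(n_cols)] for _ in range(n_rows)]
    y = [argmax_classify(row) for row in X]
    return X, y

def classification_error(y_true, y_pred):
    """Fraction of rows whose predicted class differs from the true class."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)

# Separate training and testing sets, mirroring the study's out-of-sample setup
# at a much smaller (assumed) scale.
X_train, y_train = make_noise_free_data(200, 5, seed=1)
X_test, y_test = make_noise_free_data(200, 5, seed=2)

# Scoring the generating functions themselves gives zero error by construction.
y_pred = [argmax_classify(row) for row in X_test]
test_error = classification_error(y_test, y_pred)
```

In a symbolic classification setting, the fixed functions above would be replaced by candidate expressions proposed by the search, and `classification_error` on the training set would serve as one possible fitness measure.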
Acknowledgements
Our thanks to: Thomas May from Lantern Credit for assisting with the KNIME Learner training/scoring on all ten artificial classification problems.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Korns, M.F. (2018). Genetic Programming Symbolic Classification: A Study. In: Banzhaf, W., Olson, R., Tozier, W., Riolo, R. (eds) Genetic Programming Theory and Practice XV. Genetic and Evolutionary Computation. Springer, Cham. https://doi.org/10.1007/978-3-319-90512-9_3
DOI: https://doi.org/10.1007/978-3-319-90512-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-90511-2
Online ISBN: 978-3-319-90512-9
eBook Packages: Computer Science, Computer Science (R0)