Estimating Classifier Performance with Genetic Programming

Trujillo, Leonardo; Martínez, Yuliana; Melin, Patricia

doi:10.1007/978-3-642-20407-4_24


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6621)

Included in the following conference series:

European Conference on Genetic Programming


Abstract

A fundamental task that must be addressed before classifying a set of data is choosing the proper classification method. In other words, to make a reasoned choice, a researcher must infer which classifier will achieve the best performance on the classification problem. This task is not trivial, and it is mostly resolved based on personal experience and individual preferences. This paper presents a methodological approach to produce estimators of classifier performance, based on descriptive measures of the problem data. The proposal is to use Genetic Programming (GP) to evolve mathematical operators that take descriptors of the problem data as input and output the expected error that a particular classifier might achieve if it is used to classify the data. The approach is evaluated on a large set of 500 two-class problems of multimodal data, using a neural network for classification, and the experimental tests show that GP can produce accurate estimators of classifier performance. The results suggest that the GP approach could provide a tool that helps researchers make a reasoned decision regarding the applicability of a classifier to a particular problem.
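
To make the idea concrete, the following is a minimal Python sketch of how one candidate estimator could be represented and scored. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which descriptors, function set, or fitness measure were used, so the single descriptor (Fisher's discriminant ratio), the four arithmetic operators, the mean-absolute-error fitness, and all names (fisher_ratio, evaluate, fitness, f1) are hypothetical choices. The evolutionary search itself is omitted.

```python
import operator


def pdiv(a, b):
    """Protected division, a common GP convention: return 1.0 if b is near zero."""
    return a / b if abs(b) > 1e-9 else 1.0


# Assumed function set for the expression trees (not taken from the paper).
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": pdiv}


def fisher_ratio(class_a, class_b):
    """Fisher's discriminant ratio for one feature of a two-class problem:
    squared gap between class means divided by the sum of class variances.
    Used here as an example descriptor of the problem data (an assumption)."""
    def mean(xs):
        return sum(xs) / len(xs)

    def var(xs):
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    return pdiv((mean(class_a) - mean(class_b)) ** 2, var(class_a) + var(class_b))


def evaluate(tree, descriptors):
    """Evaluate an expression tree on a dict of descriptor values.
    A tree is a number, a descriptor name, or a tuple (op, left, right)."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, descriptors), evaluate(right, descriptors))
    if isinstance(tree, str):
        return descriptors[tree]
    return float(tree)


def fitness(tree, problems):
    """Mean absolute difference between the predicted error and the error
    actually measured for the classifier on each problem (lower is better).
    `problems` is a list of (descriptors, measured_error) pairs."""
    return sum(abs(evaluate(tree, d) - err) for d, err in problems) / len(problems)


# Hand-written candidate: predicted_error = 1 / (2 + f1).
# It predicts chance-level error (0.5) when the classes overlap completely
# (f1 = 0) and a smaller error as the classes separate.
candidate = ("/", 1.0, ("+", 2.0, "f1"))
```

In a full GP run, a population of such trees would be varied and selected so that the fitness value over the training problems decreases; the best tree is then the estimator of the classifier's expected error.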



Author information

Authors and Affiliations

Instituto Tecnológico de Tijuana, Av. Tecnológico S/N, Tijuana, BC, México
Leonardo Trujillo, Yuliana Martínez & Patricia Melin


Editor information

Editors and Affiliations

INESC-ID Lisboa, Rua Alves Redol 9, 1000-029, Lisboa, Portugal
Sara Silva
Department of Biological Sciences, University of Idaho, ID 83844-3051, Moscow, USA
James A. Foster
University College Dublin, UCD CASL, Belfield, Dublin 4, Ireland
Miguel Nicolau
Faculty of Sciences and Technology, Department of Informatics Engineering, University of Coimbra, Pólo II - Pinhal de Marrocos, 3030-290, Coimbra, Portugal
Penousal Machado
Department of Animal Production Epidemiology and Ecology, University of Torino, Via Leonardo da Vinci 44, 10095, Grugliasco (TO), Italy
Mario Giacobini


About this paper

Cite this paper

Trujillo, L., Martínez, Y., Melin, P. (2011). Estimating Classifier Performance with Genetic Programming. In: Silva, S., Foster, J.A., Nicolau, M., Machado, P., Giacobini, M. (eds) Genetic Programming. EuroGP 2011. Lecture Notes in Computer Science, vol 6621. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20407-4_24


DOI: https://doi.org/10.1007/978-3-642-20407-4_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20406-7
Online ISBN: 978-3-642-20407-4
eBook Packages: Computer Science, Computer Science (R0)
