Genetic Programming for Feature Subset Ranking in Binary Classification Problems
We propose a genetic programming (GP) system for measuring the relevance of feature subsets in binary classification tasks. A virtual program structure and an evaluation function are defined so that the constructed GP programs can measure the goodness of feature subsets. The proposed system can detect relevant feature subsets in situations where other ranking methods have difficulty, including multimodal class distributions and mutually correlated features. Our empirical results indicate that the system ranks subsets well and gives insight into the actual classification performance. The proposed ranking system is also efficient when used for feature selection.
Keywords: Feature Selection, Genetic Programming, Information Gain, Feature Subset, Relevance Measure
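
The abstract does not spell out the evaluation function, so the following is only a minimal sketch of how a candidate program over a feature subset might be scored, assuming information gain (listed among the keywords) on a thresholded program output. The names `entropy`, `information_gain`, and `score_subset`, the zero threshold, and the toy XOR-style data are all illustrative assumptions, not the authors' method.

```python
import math
from typing import Callable, Sequence


def entropy(labels: Sequence[int]) -> float:
    """Shannon entropy (in bits) of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def information_gain(outputs: Sequence[float], labels: Sequence[int],
                     threshold: float = 0.0) -> float:
    """Information gain of splitting the instances at `threshold`
    on the program output (the threshold is an assumption)."""
    left = [y for o, y in zip(outputs, labels) if o <= threshold]
    right = [y for o, y in zip(outputs, labels) if o > threshold]
    n = len(labels)
    conditional = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - conditional


def score_subset(program: Callable[[Sequence[float]], float],
                 rows: Sequence[Sequence[float]], labels: Sequence[int],
                 subset: Sequence[int]) -> float:
    """Score a feature subset by the information gain of a program
    that sees only the features whose indices are in `subset`."""
    outputs = [program([row[i] for i in subset]) for row in rows]
    return information_gain(outputs, labels)


# Toy XOR-style data: neither feature is informative alone,
# but the pair separates the classes.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]

single = score_subset(lambda v: v[0] - 0.5, rows, labels, subset=[0])
pair = score_subset(lambda v: (v[0] - 0.5) * (v[1] - 0.5), rows, labels, subset=[0, 1])
print(single, pair)
```

On this toy data the single-feature program scores zero while the two-feature program scores one bit, which illustrates why ranking subsets rather than individual features can matter when features are only jointly relevant.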