Ensemble Feature Selection Based on the Contextual Merit
Recent research has demonstrated the benefits of using ensembles of classifiers for classification problems. Machine learning methods that manipulate the training set are used to construct ensembles of diverse, accurate classifiers. Feature selection techniques that apply different heuristics to generate base classifiers can be adjusted to specific domain characteristics. In this paper we consider and experiment with the contextual feature merit measure as a feature selection heuristic. Our new algorithm, which includes a refinement cycle, uses the diversity of the ensemble as its evaluation function. We have evaluated the algorithm on seven data sets from the UCI repository. The experimental results show that, on all seven data sets, ensemble feature selection based on the contextual merit with a suitable initial number of features produces an ensemble that, with weighted voting, is never less accurate than C4.5 alone using all the features.
Keywords: Feature Selection, Feature Subset, Base Classifier, Heuristic Rule, Weighted Vote
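The abstract refers to two mechanisms without defining them: weighted voting over base classifiers and ensemble diversity used as an evaluation function. The sketch below illustrates one common reading of each, and it is not the paper's algorithm. The function names, the use of per-classifier weights (for example, validation accuracies), and the choice of average pairwise disagreement as the diversity measure are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def weighted_vote(predictions, weights):
    """Return the class label with the largest summed weight.

    predictions[i] is the label predicted by base classifier i;
    weights[i] is that classifier's weight (an assumed quantity,
    e.g. its validation accuracy).
    """
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def pairwise_disagreement(outputs):
    """Average fraction of instances on which two classifiers disagree.

    outputs[i][j] is the label predicted by classifier i on instance j.
    This is one possible diversity measure, assumed here for illustration.
    """
    pairs = list(combinations(outputs, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for first, second in pairs:
        differing = sum(a != b for a, b in zip(first, second))
        total += differing / len(first)
    return total / len(pairs)


# Hypothetical predictions of three base classifiers on four instances.
outputs = [
    ["a", "b", "a", "a"],
    ["a", "b", "b", "a"],
    ["b", "b", "a", "a"],
]
weights = [0.9, 0.6, 0.7]  # hypothetical classifier weights

# Combine the per-instance predictions column by column.
ensemble = [weighted_vote(column, weights) for column in zip(*outputs)]
diversity = pairwise_disagreement(outputs)
```

In this toy setting each base classifier would be trained on its own feature subset; a search procedure could then keep a candidate subset only if it raises the diversity score, which is one way to read "diversity as evaluation function".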