Detecting Subset of Classifiers for Multi-attribute Response Prediction
An algorithm for detecting a classification model in the presence of a multi-class response is introduced. It is called Sequential Automatic Search of a Subset of Classifiers (SASSC) because it adaptively and sequentially aggregates subsets of instances, each related to a proper aggregation of a subset of the response classes, that is, to a super-class. In each step of the algorithm, the aggregation chosen is the one whose subset of instances and response classes yields the classifier with the lowest generalization error among the alternative aggregations. Cross-validation is used to estimate these generalization errors. The user can choose the final number of subsets of the response classes (super-classes), obtaining a final tree-based classification model that achieves a high level of accuracy without neglecting parsimony. Results obtained by analyzing a real dataset highlight the effectiveness of the proposed method.
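The sequential aggregation described above can be illustrated with a minimal sketch. It is one possible reading of the abstract, not the authors' implementation: at each step every pair of current class groups is tentatively merged, the cross-validated error of a classifier trained on the relabelled data is estimated, and the merge with the lowest error is kept until the requested number of super-classes remains. The function names (`nearest_centroid_predict`, `cv_error`, `greedy_superclasses`), the toy data, and the use of a nearest-centroid classifier in place of the tree-based classifiers mentioned in the abstract are all assumptions made to keep the example self-contained.

```python
from itertools import combinations


def nearest_centroid_predict(train_X, train_y, test_X):
    """Stand-in classifier (assumption): assign each point to the closest class mean."""
    sums, counts = {}, {}
    for x, y in zip(train_X, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [min(centroids, key=lambda y: dist2(x, centroids[y])) for x in test_X]


def cv_error(X, labels, k=3):
    """k-fold cross-validated misclassification rate (fold = index modulo k)."""
    n = len(X)
    errors = 0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        pred = nearest_centroid_predict(
            [X[i] for i in train], [labels[i] for i in train], [X[i] for i in test]
        )
        errors += sum(p != labels[i] for p, i in zip(pred, test))
    return errors / n


def greedy_superclasses(X, y, n_super, k=3):
    """Merge class groups pairwise, keeping the merge with the lowest CV error."""
    groups = [frozenset([c]) for c in sorted(set(y))]
    while len(groups) > n_super:
        best = None
        for a, b in combinations(groups, 2):
            candidate = [g for g in groups if g not in (a, b)] + [a | b]
            # Relabel every instance with the index of its candidate super-class.
            lookup = {c: idx for idx, g in enumerate(candidate) for c in g}
            err = cv_error(X, [lookup[c] for c in y], k)
            if best is None or err < best[0]:
                best = (err, candidate)
        groups = best[1]
    return groups


# Hypothetical toy data: classes A and B overlap, C and D are well separated.
A = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.15, 0.15), (0.25, 0.05)]
B = [(0.1, 0.1), (0.3, 0.1), (0.2, 0.3), (0.05, 0.2), (0.2, 0.2), (0.1, 0.05)]
C = [(5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (5.1, 5.2), (4.9, 4.8), (5.0, 5.1)]
D = [(-x, v) for x, v in C]
X = A + B + C + D
y = ["A"] * 6 + ["B"] * 6 + ["C"] * 6 + ["D"] * 6

groups = greedy_superclasses(X, y, n_super=3)
print(groups)
```

In this toy setting the two overlapping classes are the ones merged into a super-class, since separating them is what drives the error. A closer rendering of the method would swap the stand-in classifier for a classification tree and would presumably grow a separate classifier within each super-class, details the abstract does not specify.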