Abstract
Many studies in supervised learning have focused on solving multiclass problems. A standard technique for handling these problems is to decompose the original multiclass problem into multiple binary problems. In this paper, we propose a new learning model for multiclass domains in which the examples are described by a large number of features. The proposed model is an Artificial Neural Network ensemble in which each base learner is formed by combining a binary classifier with a multiclass classifier. To assess the viability and quality of this system, it is validated on two real domains: traffic sign recognition and handwritten digit recognition. Experimental results show that our model is at least as accurate as other methods reported in the literature, while offering a considerable advantage with respect to size, computational complexity, and running time.
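As an illustration of the general idea, the following is a minimal sketch, not the paper's exact architecture: each base learner pairs a binary head (scoring "does this example belong to class i?") with a multiclass head, and the ensemble sums the evidence from all base learners before taking the arg-max. All function names and the combination rule are illustrative assumptions.

```python
# Hypothetical sketch of an ensemble whose i-th base learner combines a
# binary classifier (confidence that x belongs to class i) with a
# multiclass classifier (a score vector over all classes). The exact
# combination rule in the paper may differ; this is only illustrative.

def ensemble_predict(x, base_learners, n_classes):
    """base_learners: list of (binary_fn, multi_fn) pairs, one per class.
    binary_fn(x) -> probability in [0, 1] that x belongs to class i;
    multi_fn(x)  -> list of n_classes scores summing to ~1."""
    scores = [0.0] * n_classes
    for i, (binary_fn, multi_fn) in enumerate(base_learners):
        p = binary_fn(x)           # direct evidence for class i
        m = multi_fn(x)            # fallback evidence over all classes
        scores[i] += p
        # weight the multiclass head by the binary head's "not class i" mass
        for c in range(n_classes):
            scores[c] += (1.0 - p) * m[c] / len(base_learners)
    return max(range(n_classes), key=lambda c: scores[c])
```

In this toy rule, a confident binary head dominates the vote for its own class, while uncertain base learners defer to their multiclass heads.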
Acknowledgments
The research reported here has been supported by the Spanish MCyT under project TRA 2007-67374-C02-02.
Cite this article
Sesmero, M.P., Alonso-Weber, J.M., Gutiérrez, G. et al. A new artificial neural network ensemble based on feature selection and class recoding. Neural Comput & Applic 21, 771–783 (2012). https://doi.org/10.1007/s00521-010-0458-5