
A new artificial neural network ensemble based on feature selection and class recoding

  • Original Article
  • Published:
Neural Computing and Applications

Abstract

Many studies of supervised learning have focused on solving multiclass problems. A standard technique for solving such problems is to decompose the original multiclass problem into multiple binary problems. In this paper, we propose a new learning model for multiclass domains in which the examples are described by a large number of features. The proposed model is an Artificial Neural Network ensemble in which each base learner combines a binary classifier with a multiclass classifier. To analyze the viability and quality of this system, we validate it on two real domains: traffic sign recognition and handwritten digit recognition. Experimental results show that our model is at least as accurate as other methods reported in the literature, while offering a considerable advantage in size, computational complexity, and running time.
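The one-vs-rest decomposition mentioned above can be sketched as follows. This is a minimal illustration, not the authors' method: the binary learners in the paper are neural networks, whereas here each class is represented by a hypothetical centroid-based binary scorer, and the class whose binary model reports the largest margin wins.

```python
import numpy as np

def train_one_vs_rest(X, y, classes):
    # One binary model per class: here, a toy stand-in that stores the
    # centroid of the positive class and the centroid of the "rest".
    models = {}
    for c in classes:
        pos = X[y == c].mean(axis=0)
        neg = X[y != c].mean(axis=0)
        models[c] = (pos, neg)
    return models

def predict(models, x):
    # Each binary model votes with a margin (how much closer x is to the
    # positive centroid than to the rest); the largest margin wins.
    def margin(c):
        pos, neg = models[c]
        return np.linalg.norm(x - neg) - np.linalg.norm(x - pos)
    return max(models, key=margin)
```

A k-class problem thus reduces to k binary subproblems plus an argmax combination step, which is the decomposition scheme the paper builds on.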

[Figures 1–8 appear in the full text.]




Acknowledgments

The research reported here has been supported by the Spanish MCyT under project TRA 2007-67374-C02-02.

Author information

Correspondence to M. P. Sesmero.


About this article

Cite this article

Sesmero, M.P., Alonso-Weber, J.M., Gutiérrez, G. et al. A new artificial neural network ensemble based on feature selection and class recoding. Neural Comput & Applic 21, 771–783 (2012). https://doi.org/10.1007/s00521-010-0458-5
