
Feature Selection for Ensembles Using the Multi-Objective Optimization Approach

  • Chapter
Multi-Objective Machine Learning

Part of the book series: Studies in Computational Intelligence ((SCI,volume 16))

Abstract

Feature selection for ensembles has been shown to be an effective strategy for ensemble creation because it produces subsets of features that make the classifiers of the ensemble disagree on difficult cases. In this chapter we present an ensemble feature selection approach based on a hierarchical multi-objective genetic algorithm, following the "overproduce and choose" paradigm. The algorithm operates at two levels: first, it performs feature selection to generate a set of classifiers; then it chooses the best team of classifiers. To demonstrate its robustness, the method is evaluated in two different contexts: supervised and unsupervised feature selection. In the former, we consider the problem of handwritten digit recognition, using three different feature sets and multi-layer perceptron neural networks as classifiers. In the latter, we address the problem of handwritten month-word recognition, using three different feature sets and hidden Markov models as classifiers. Experiments and comparisons with classical methods such as Bagging and Boosting demonstrate that the proposed methodology brings compelling improvements when classifiers must operate at very low error rates.
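The two-level "overproduce and choose" procedure described in the abstract can be sketched in miniature. The sketch below is not the chapter's hierarchical genetic algorithm: it replaces the first-level GA with an exhaustive Pareto filter over a handful of invented candidates (all names, scores, predictions, and labels are illustrative), and the second-level search with a brute-force team comparison. The flow, however, mirrors the approach: produce candidate classifiers scored on competing objectives, keep the non-dominated ones, then choose the team whose combined vote performs best.

```python
from itertools import combinations

# First level ("overproduce"): candidate classifiers, each built from a
# feature subset and scored on two objectives to be minimized:
# (validation error, number of features). Names and scores are invented.
candidates = {"A": (0.10, 30), "B": (0.12, 12), "C": (0.15, 40), "D": (0.09, 45)}

def dominates(a, b):
    """a dominates b if it is no worse on every objective and better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(scored):
    """Keep only the non-dominated candidates (exhaustive stand-in for the GA)."""
    return {name for name, score in scored.items()
            if not any(dominates(other, score)
                       for other in scored.values() if other != score)}

# Second level ("choose"): pick the team whose majority vote has the
# lowest validation error. Binary toy predictions and labels.
preds = {"A": [1, 1, 0, 0, 1, 0],
         "B": [1, 0, 0, 1, 1, 0],
         "D": [0, 1, 0, 0, 1, 1]}
labels = [1, 1, 0, 0, 1, 0]

def majority_error(team, preds, labels):
    """Error rate of the majority vote of the classifiers in `team`."""
    wrong = 0
    for i, y in enumerate(labels):
        votes = sum(preds[m][i] for m in team)
        wrong += (1 if 2 * votes > len(team) else 0) != y
    return wrong / len(labels)

def choose_team(members, preds, labels, size):
    """Brute-force stand-in for the second-level search over teams."""
    return min(combinations(sorted(members), size),
               key=lambda t: majority_error(t, preds, labels))

front = pareto_front(candidates)          # "C" is dominated by "A" and drops out
team = choose_team(front, preds, labels, 3)
```

Keeping the whole Pareto front at the first level, rather than a single best subset, is what supplies the second level with accurate but mutually disagreeing classifiers to team up.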





Copyright information

© 2006 Springer

About this chapter

Cite this chapter

Oliveira, L.S., Morita, M., Sabourin, R. (2006). Feature Selection for Ensembles Using the Multi-Objective Optimization Approach. In: Jin, Y. (eds) Multi-Objective Machine Learning. Studies in Computational Intelligence, vol 16. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-33019-4_3


  • DOI: https://doi.org/10.1007/3-540-33019-4_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30676-4

  • Online ISBN: 978-3-540-33019-6

  • eBook Packages: Engineering (R0)
