A Holistic Classification Optimization Framework with Feature Selection, Preprocessing, Manifold Learning and Classifiers

  • Fabian Bürger
  • Josef Pauli
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9493)


All real-world classification problems require a carefully designed system to achieve the desired generalization performance. Developers need to select a useful feature subset and a classifier with suitable hyperparameters. Furthermore, a feature preprocessing method (e.g. scaling or pre-whitening) and a dimension reduction method (e.g. Principal Component Analysis (PCA), Autoencoders or other manifold learning algorithms) may improve the performance. The interplay of all these components is complex and a manual selection is time-consuming. This paper presents an automatic optimization framework that incorporates feature selection, several feature preprocessing methods, multiple feature transforms learned by manifold learning and multiple classifiers including all hyperparameters. The highly combinatorial optimization problem is solved with an evolutionary algorithm. Additionally, a multi-classifier based on the optimization trajectory is presented which improves the generalization. The evaluation on several datasets shows the effectiveness of the proposed framework.
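The joint search described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the toy data, the configuration encoding, the (mu + lambda) selection scheme, the two stand-in classifiers (nearest centroid and k-nearest neighbours), and all function names are assumptions made for illustration, and the manifold learning stage is omitted to keep the example short. The final function shows one plausible reading of a multi-classifier built from the optimization trajectory: a majority vote over the best configurations evaluated during the search.

```python
import random
from collections import Counter

# Hypothetical search space (assumption, not taken from the paper).
PREPS, CLFS, KS = ("none", "rescale", "standardize"), ("centroid", "knn"), (1, 3, 5)

def make_data(rng, n=80):
    """Toy two-class data: feature 0 is informative, features 1-2 are large-scale noise."""
    X = [[i % 2 + rng.gauss(0, 0.3), rng.gauss(0, 5), rng.gauss(0, 5)] for i in range(n)]
    return X, [i % 2 for i in range(n)]

def fit_prep(kind, X):
    """Learn the preprocessing on X and return a function that applies it to one row."""
    cols = list(zip(*X))
    if kind == "rescale":
        shift = [min(c) for c in cols]
        scale = [(max(c) - min(c)) or 1.0 for c in cols]
    elif kind == "standardize":
        shift = [sum(c) / len(c) for c in cols]
        scale = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
                 for c, m in zip(cols, shift)]
    else:
        shift, scale = [0.0] * len(cols), [1.0] * len(cols)
    return lambda row: [(v - s) / d for v, s, d in zip(row, shift, scale)]

def predict(cfg, Xtr, ytr, Xte):
    """Train the pipeline described by cfg on (Xtr, ytr) and label the rows of Xte."""
    select = lambda row: [v for v, keep in zip(row, cfg["mask"]) if keep]
    Xtr = [select(r) for r in Xtr]
    prep = fit_prep(cfg["prep"], Xtr)
    Xtr = [prep(r) for r in Xtr]
    dist = lambda a, b: sum((p - q) ** 2 for p, q in zip(a, b))
    if cfg["clf"] == "centroid":
        cents = {}
        for c in set(ytr):
            rows = [r for r, l in zip(Xtr, ytr) if l == c]
            cents[c] = [sum(col) / len(rows) for col in zip(*rows)]
        return [min(cents, key=lambda c: dist(cents[c], prep(select(r)))) for r in Xte]
    out = []
    for r in Xte:
        q = prep(select(r))
        near = sorted(zip(Xtr, ytr), key=lambda t: dist(t[0], q))[:cfg["k"]]
        out.append(Counter(l for _, l in near).most_common(1)[0][0])
    return out

def cv_accuracy(cfg, X, y, folds=5):
    """Fitness: cross-validated accuracy; an empty feature subset scores zero."""
    if not any(cfg["mask"]):
        return 0.0
    hits = 0
    for f in range(folds):
        tr = [i for i in range(len(X)) if i % folds != f]
        te = [i for i in range(len(X)) if i % folds == f]
        pred = predict(cfg, [X[i] for i in tr], [y[i] for i in tr], [X[i] for i in te])
        hits += sum(p == y[i] for p, i in zip(pred, te))
    return hits / len(X)

def random_cfg(rng, n_feat):
    return {"mask": tuple(rng.random() < 0.5 for _ in range(n_feat)),
            "prep": rng.choice(PREPS), "clf": rng.choice(CLFS), "k": rng.choice(KS)}

def mutate(cfg, rng):
    """Change one gene: flip one feature bit or resample one categorical choice."""
    child = dict(cfg)
    gene = rng.choice(["mask", "prep", "clf", "k"])
    if gene == "mask":
        i = rng.randrange(len(cfg["mask"]))
        child["mask"] = tuple(b != (j == i) for j, b in enumerate(cfg["mask"]))
    else:
        child[gene] = rng.choice({"prep": PREPS, "clf": CLFS, "k": KS}[gene])
    return child

def evolve(X, y, rng, mu=4, lam=8, generations=10):
    """(mu + lambda) evolution; returns every evaluated (fitness, cfg) pair."""
    scored = [(cv_accuracy(c, X, y), c) for c in (random_cfg(rng, len(X[0])) for _ in range(mu))]
    trajectory = list(scored)
    for _ in range(generations):
        kids = [mutate(rng.choice(scored)[1], rng) for _ in range(lam)]
        kid_scores = [(cv_accuracy(c, X, y), c) for c in kids]
        trajectory += kid_scores
        scored = sorted(scored + kid_scores, key=lambda t: t[0], reverse=True)[:mu]
    return trajectory

def ensemble_predict(trajectory, Xtr, ytr, Xte, top=5):
    """Majority vote over the best `top` configurations seen during the search."""
    best = sorted(trajectory, key=lambda t: t[0], reverse=True)[:top]
    votes = [predict(c, Xtr, ytr, Xte) for _, c in best]
    return [Counter(col).most_common(1)[0][0] for col in zip(*votes)]

rng = random.Random(0)
X, y = make_data(rng)
Xtr, ytr, Xte, yte = X[:60], y[:60], X[60:], y[60:]
trajectory = evolve(Xtr, ytr, rng)
best_fit, best_cfg = max(trajectory, key=lambda t: t[0])
pred = ensemble_predict(trajectory, Xtr, ytr, Xte)
accuracy = sum(p == t for p, t in zip(pred, yte)) / len(yte)
print(best_cfg, round(best_fit, 3), round(accuracy, 3))
```

Encoding the feature mask, the preprocessing choice, the classifier and its hyperparameter in one configuration lets a single mutation operator explore all components jointly, which is the point of a holistic search. The vote in this sketch does not remove duplicate configurations from the trajectory, so a real system would likely need a diversity criterion.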


Keywords: Feature selection · Model selection · Evolutionary optimization · Representation learning



This work was funded by the European Commission within the Ziel2.NRW programme “NanoMikro+Werkstoffe.NRW”.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Lehrstuhl für Intelligente Systeme, Universität Duisburg-Essen, Duisburg, Germany
