Metaheuristics and Classifier Ensembles

  • Ringolf Thomschke
  • Stefan Voß
  • Stefan Lessmann
Chapter

Abstract

A classifier ensemble combines several base models into a composite model to increase predictive accuracy. Given a set of candidate base models, the question of which of these to incorporate into an ensemble, and whether to weight base models differently, has received much interest in the machine learning literature. Using heuristic search for ensemble member selection has proven a viable approach. However, research has so far considered only a small set of (meta-)heuristics for this type of problem. More generally, whether the choice of metaheuristic matters has not been addressed at all. This paper aims to fill this gap. To that end, a comprehensive set of metaheuristics is employed to create alternative ensemble classifiers, and these are compared in an empirical study. The results of several experiments provide original insights into the relative effectiveness of different metaheuristics and fitness functions for ensemble modelling. Having identified a particularly promising modelling approach, the paper proceeds with comparisons to other ensemble regimes and, more generally, to other prediction models, to assess the degree to which a metaheuristic-based ensemble improves upon the state of the art. As part of this analysis, the paper also proposes an approach to approximate an optimality gap for predictive classification models.
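
To make the idea of heuristic ensemble member selection concrete, the following is a minimal sketch, not the chapter's own method. It assumes a binary classification task, base models that output class-1 probabilities on a validation set, an unweighted average as the combination rule, and validation accuracy as the fitness function. The search is a plain bit-flip hill climber; all names (`ensemble_accuracy`, `hill_climb_selection`, the toy data) are hypothetical and chosen only for illustration.

```python
import random


def ensemble_accuracy(mask, probs, labels):
    """Accuracy of the averaged-probability ensemble of the selected models.

    mask   : list of 0/1 flags, one per base model (1 = include)
    probs  : probs[m][i] is model m's predicted probability of class 1 for case i
    labels : true 0/1 labels of the validation cases
    """
    members = [p for use, p in zip(mask, probs) if use]
    if not members:  # an empty ensemble cannot predict anything
        return 0.0
    correct = 0
    for i, y in enumerate(labels):
        avg = sum(m[i] for m in members) / len(members)
        correct += int((avg >= 0.5) == bool(y))
    return correct / len(labels)


def hill_climb_selection(probs, labels, iterations=200, seed=0):
    """Bit-flip hill climbing over a binary inclusion mask (assumed fitness: accuracy)."""
    rng = random.Random(seed)
    mask = [1] * len(probs)  # start from the full ensemble
    best = ensemble_accuracy(mask, probs, labels)
    for _ in range(iterations):
        j = rng.randrange(len(mask))
        mask[j] ^= 1  # include or exclude one base model
        score = ensemble_accuracy(mask, probs, labels)
        if score >= best:  # accept improvements and sideways moves
            best = score
        else:
            mask[j] ^= 1  # revert a worsening move
    return mask, best


# Toy validation data (made up for illustration): three base models, six cases.
labels = [1, 0, 1, 0, 1, 0]
probs = [
    [0.9, 0.1, 0.8, 0.2, 0.7, 0.3],  # a strong model
    [0.1, 0.9, 0.2, 0.8, 0.3, 0.7],  # a model that is systematically wrong
    [0.6, 0.4, 0.4, 0.6, 0.6, 0.4],  # a mediocre model
]
full_accuracy = ensemble_accuracy([1, 1, 1], probs, labels)
mask, best = hill_climb_selection(probs, labels)
print("selected members:", mask, "validation accuracy:", best)
```

Because worsening moves are always reverted, the returned ensemble is never less accurate on the validation data than the full ensemble it started from. A metaheuristic would replace the simple accept/revert rule with a more elaborate search strategy, and a real-valued vector could replace the binary mask if base models are to be weighted rather than merely selected.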

Keywords

Classifier ensembles · Machine learning · Metaheuristics · Predictive analytics

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Ringolf Thomschke (1)
  • Stefan Voß (2, corresponding author)
  • Stefan Lessmann (1)

  1. Humboldt-Universität zu Berlin, School of Business and Economics, Berlin, Germany
  2. University of Hamburg, Institute of Information Systems, Hamburg, Germany
