Statistical Instance-Based Ensemble Pruning for Multi-class Problems

  • Gonzalo Martínez-Muñoz
  • Daniel Hernández-Lobato
  • Alberto Suárez
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5768)

Abstract

Recent research has shown that the provisional count of votes of an ensemble of classifiers can be used to estimate the probability that the final ensemble prediction will coincide with the current majority class. For a given instance, querying of ensemble members can be stopped when this probability exceeds a specified threshold. This instance-based ensemble pruning procedure can be implemented efficiently if the probabilities are pre-computed and stored in a lookup table. However, the size of the table and the cost of computing the probabilities grow very rapidly with the number of classes in the problem. In this article we introduce a number of computational optimizations that make the construction of the lookup table feasible. As a result, the application of instance-based ensemble pruning is extended to multi-class problems. Experiments on several UCI multi-class problems show that instance-based pruning speeds up classification by a factor between 2 and 10 without any significant variation in the prediction accuracy of the ensemble.
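The stopping rule described above can be sketched in a few lines. The paper pre-computes the exact stopping probabilities and stores them in a lookup table; the sketch below instead approximates the probability that the current majority class will remain the majority by Monte Carlo simulation, modelling the unseen votes with a Polya urn over the observed counts. All function names and the threshold value are illustrative, not taken from the paper.

```python
import random
from collections import Counter

def prob_majority_unchanged(counts, T, n_sim=2000, rng=None):
    """Monte Carlo estimate of the probability that, after all T ensemble
    members have voted, the majority class equals the current leader.
    The unseen votes are simulated with a Polya urn on the observed counts
    (an assumption standing in for the paper's exact pre-computed table)."""
    rng = rng or random.Random(0)
    t = sum(counts.values())
    leader = max(counts, key=counts.get)
    remaining = T - t
    hits = 0
    for _ in range(n_sim):
        urn = dict(counts)
        for _ in range(remaining):
            # Draw the next vote with probability proportional to the
            # current urn contents, then reinforce the drawn class.
            r = rng.randrange(sum(urn.values()))
            for cls, n in urn.items():
                if r < n:
                    urn[cls] += 1
                    break
                r -= n
        if max(urn, key=urn.get) == leader:
            hits += 1
    return hits / n_sim

def classify_with_pruning(classifiers, x, threshold=0.99):
    """Query ensemble members one at a time and stop early once the
    estimated probability that the current majority is final exceeds
    the threshold. Returns the prediction and the number of queries."""
    counts = Counter()
    T = len(classifiers)
    for t, clf in enumerate(classifiers, 1):
        counts[clf(x)] += 1
        if t < T and prob_majority_unchanged(counts, T) >= threshold:
            break
    return max(counts, key=counts.get), t
```

For a unanimous ensemble the rule fires almost immediately, which is where the reported 2x to 10x classification speed-up comes from: most instances need only a small fraction of the ensemble to reach a confident decision.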

Keywords

Instance-based pruning, ensemble learning, neural networks


Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Gonzalo Martínez-Muñoz (1, 2)
  • Daniel Hernández-Lobato (1)
  • Alberto Suárez (1)
  1. Universidad Autónoma de Madrid, EPS, Madrid, Spain
  2. Oregon State University, Corvallis, USA