Effect of Subsampling Rate on Subbagging and Related Ensembles of Stable Classifiers

  • Faisal Zaman
  • Hideo Hirose
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5909)

Abstract

Ensemble methods most often use bootstrap sampling to create multiple classifiers. Subsampling is an alternative: it produces diverse ensemble members and induces instability in otherwise stable classifiers. Subsampling has a single parameter, the subsample rate (SSR), which specifies what fraction of the training sample is drawn into each subsample. In this paper we study the effect of different subsample rates on bagging-type ensembles of stable classifiers, namely Subbagging and Double Subbagging. We use three stable classifiers: the Linear Support Vector Machine (LSVM), Stable Linear Discriminant Analysis (SLDA), and the Logistic Linear Classifier (LOGLC). We also experiment with a decision tree to check whether the performance of a tree classifier is influenced by the SSR. The experiments show that, on most datasets, subbagging stable classifiers at a low SSR outperforms both bagging and the single stable classifiers, and in some cases double subbagging as well. We also find an inverse relation between the performance of double subbagging and subbagging.
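
As a minimal sketch of the idea (not the paper's experimental code), subbagging can be reproduced with scikit-learn's BaggingClassifier: setting bootstrap=False draws each training subsample without replacement, and max_samples plays the role of the SSR. The dataset, SSR grid, and ensemble size below are illustrative assumptions, not the paper's setup.

    # Sketch: subbagging a stable classifier (linear SVM) at several
    # subsample rates (SSR). bootstrap=False means each ensemble member
    # trains on a subsample drawn without replacement; max_samples is
    # the SSR. Dataset, SSR grid, and ensemble size are illustrative.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import BaggingClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import LinearSVC

    X, y = load_breast_cancer(return_X_y=True)

    # Stable base classifier: a linear SVM (feature scaling aids convergence).
    base = make_pipeline(StandardScaler(), LinearSVC())

    for ssr in (0.2, 0.4, 0.6, 0.8):
        subbag = BaggingClassifier(
            base,
            n_estimators=50,
            max_samples=ssr,   # SSR: fraction of the training set per subsample
            bootstrap=False,   # draw without replacement -> subbagging
            random_state=0,
        )
        acc = cross_val_score(subbag, X, y, cv=5).mean()
        print(f"SSR={ssr:.1f}: mean CV accuracy = {acc:.3f}")

At an SSR well below 1, the ensemble members see substantially different subsamples, which injects the diversity (instability) that a stable base classifier otherwise lacks.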

Keywords

Subsample rate · Stable Classifiers · Subbagging · Double Subbagging

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Faisal Zaman (1)
  • Hideo Hirose (1)
  1. Kyushu Institute of Technology, Fukuoka, Japan
