Abstract
Various modifications of bagging for class-imbalanced data are discussed. An experimental comparison of known bagging modifications shows that integrating bagging with undersampling is more powerful than integrating it with oversampling. We introduce Local-and-Over-All Balanced bagging, in which the probability of sampling an example is tuned according to the class distribution inside its neighbourhood. Experiments indicate that this proposal is competitive with the best undersampling bagging extensions.
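The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea of neighbourhood-dependent bootstrap sampling, not the authors' method. The function names `sampling_weights` and `weighted_bootstrap`, the choice of `k`, the global term (class ratio) and the local term (one plus the fraction of majority-class neighbours) are all assumptions made for illustration.

```python
import math
import random


def sampling_weights(X, y, minority_label, k=3):
    """Hypothetical per-example weights: a global class-balancing term
    for majority examples and a local neighbourhood term for minority
    examples. Assumes both classes are present and len(X) > 1."""
    n = len(X)
    n_min = sum(1 for label in y if label == minority_label)
    n_maj = n - n_min
    weights = []
    for i in range(n):
        if y[i] != minority_label:
            # Assumed global term: scale majority examples by the class ratio.
            weights.append(n_min / n_maj)
            continue
        # Brute-force k nearest neighbours of example i (Euclidean distance).
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )
        neighbours = others[:k]
        # Assumed local term: up-weight minority examples that are
        # surrounded by majority-class neighbours.
        maj_fraction = sum(
            1 for j in neighbours if y[j] != minority_label
        ) / len(neighbours)
        weights.append(1.0 + maj_fraction)
    return weights


def weighted_bootstrap(weights, rng):
    """Draw one bootstrap sample of indices, with replacement,
    with probability proportional to the given weights."""
    n = len(weights)
    return rng.choices(range(n), weights=weights, k=n)
```

In a bagging ensemble built this way, each component classifier would be trained on the examples selected by one call to `weighted_bootstrap`, so that minority examples in majority-dominated regions appear more often across bootstrap samples.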
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Błaszczyński, J., Stefanowski, J., Idkowiak, Ł. (2013). Extending Bagging for Imbalanced Data. In: Burduk, R., Jackowski, K., Kurzynski, M., Wozniak, M., Zolnierek, A. (eds) Proceedings of the 8th International Conference on Computer Recognition Systems CORES 2013. Advances in Intelligent Systems and Computing, vol 226. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-00969-8_26
DOI: https://doi.org/10.1007/978-3-319-00969-8_26
Publisher Name: Springer, Heidelberg
Print ISBN: 978-3-319-00968-1
Online ISBN: 978-3-319-00969-8
eBook Packages: Engineering, Engineering (R0)