Local Characteristics of Minority Examples in Pre-processing of Imbalanced Data
We consider informed pre-processing methods for improving classifiers learned from class-imbalanced data. We discuss different ways of analyzing the characteristics of local distributions of examples in such data. We then experimentally compare the main informed pre-processing methods and show that identifying types of minority examples from their k-nearest neighbourhood may help explain differences in the performance of these methods. Finally, we exploit information about the local neighbourhood to modify the oversampling ratio in a SMOTE-related method.
Keywords: Local Characteristic; Majority Class; Local Neighbourhood; Minority Class; Class Imbalance
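The idea of typing minority examples by their k-nearest neighbourhood can be illustrated with a minimal sketch. The abstract does not give the labelling rule, so everything specific here is an assumption: the function name `label_minority_examples`, the Euclidean distance, the default k = 5, the four type names, and the ratio thresholds (0.7, 0.3, 0.1) are illustrative choices, not necessarily the authors' procedure.

```python
import math


def label_minority_examples(X, y, minority_label, k=5):
    """Assign a hypothetical type to each minority example.

    The type depends on the share of the k nearest neighbours that also
    belong to the minority class. Thresholds are assumed for illustration.
    Returns a dict mapping example index -> type name.
    """
    labels = {}
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != minority_label:
            continue
        # Euclidean distances to every other example, sorted ascending.
        others = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )
        neighbours = [j for _, j in others[:k]]
        same = sum(1 for j in neighbours if y[j] == minority_label)
        ratio = same / k
        if ratio > 0.7:
            labels[i] = "safe"        # neighbourhood dominated by the minority class
        elif ratio > 0.3:
            labels[i] = "borderline"  # mixed neighbourhood
        elif ratio > 0.1:
            labels[i] = "rare"        # a single minority neighbour when k = 5
        else:
            labels[i] = "outlier"     # no minority neighbours
    return labels


# Toy data: a compact minority cluster, a majority cluster, and one
# minority example placed inside the majority cluster.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0.5, 0),
     (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5), (10.5, 10),
     (10.4, 10.4)]
y = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
types = label_minority_examples(X, y, minority_label=1, k=5)
```

In this toy set the six clustered minority examples have only minority neighbours, while the isolated one has only majority neighbours. Labels of this kind could, under further assumptions, be used to vary how many synthetic examples are generated around each minority example.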