Abstract
We consider the problem of multi-class classification with imbalanced data-sets. To this end, we introduce a cost-sensitive multi-class Boosting algorithm (BAdaCost) based on a generalization of the Boosting margin, termed the multi-class cost-sensitive margin. To address the class imbalance, we introduce a cost matrix that weighs more heavily the costs of confused classes, together with a procedure to estimate these costs from the confusion matrix of a standard \(0|1\)-loss classifier. Finally, we evaluate the performance of the approach on synthetic and real data-sets and compare our results with the AdaC2.M1 algorithm.
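As a rough illustration of the idea of deriving costs from a confusion matrix, the sketch below row-normalises the confusion matrix of a baseline classifier and uses the off-diagonal confusion rates as costs. This is only one plausible reading of the abstract, not the estimation procedure defined in the paper; the function name `cost_matrix_from_confusion` and the example counts are hypothetical.

```python
import numpy as np


def cost_matrix_from_confusion(confusion):
    """Hypothetical helper: derive a cost matrix from a confusion matrix.

    confusion[i, j] counts samples of true class i predicted as class j.
    Assumption (not taken from the paper): the cost of predicting j for
    true class i is the fraction of class-i samples confused with j.
    """
    confusion = np.asarray(confusion, dtype=float)
    row_totals = confusion.sum(axis=1, keepdims=True)
    # Row-normalise so each row holds rates for one true class;
    # a class with no samples gets an all-zero row.
    rates = np.divide(confusion, row_totals,
                      out=np.zeros_like(confusion), where=row_totals > 0)
    costs = rates.copy()
    np.fill_diagonal(costs, 0.0)  # correct decisions cost nothing
    return costs


# Made-up counts: class 2 is rare (10 samples) and mostly misclassified.
confusion = np.array([[90, 10, 0],
                      [5, 90, 5],
                      [6, 2, 2]])
costs = cost_matrix_from_confusion(confusion)
```

Because each row is normalised by its own class size, errors on the rare class receive large costs even though their raw counts are small, which is the kind of reweighting a cost-sensitive learner can exploit.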
Acknowledgments
This research was funded by the Spanish Ministerio de Economía y Competitividad, project number TIN2013-47630-C2-2-R.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Fernández-Baldera, A., Buenaposada, J.M., Baumela, L. (2015). Multi-class Boosting for Imbalanced Data. In: Paredes, R., Cardoso, J., Pardo, X. (eds) Pattern Recognition and Image Analysis. IbPRIA 2015. Lecture Notes in Computer Science, vol 9117. Springer, Cham. https://doi.org/10.1007/978-3-319-19390-8_7
DOI: https://doi.org/10.1007/978-3-319-19390-8_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19389-2
Online ISBN: 978-3-319-19390-8
eBook Packages: Computer Science, Computer Science (R0)