Racing for Unbalanced Methods Selection

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8206)

Abstract

State-of-the-art classification algorithms perform poorly when the data are skewed towards one class. This has led to the development of a number of techniques for coping with unbalanced data. However, as our experimental comparison confirms, no technique works consistently better than the others in all conditions. We propose using a racing method to adaptively select the most appropriate strategy for a given unbalanced task. The results show that racing adapts the choice of strategy to the specific nature of the unbalanced problem and rapidly selects the most appropriate strategy without compromising accuracy.
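
To illustrate the general idea of racing, the following is a minimal sketch and not the procedure used in the paper, whose details the abstract does not give. It assumes that each candidate strategy can be scored on successive rounds (for example, cross-validation folds) with a value in [0, 1], and it uses a Hoeffding-bound elimination rule as one possible statistical test. The function name `hoeffding_race`, the strategy names, and the synthetic score generators are all hypothetical; the scores are placeholders, not experimental results.

```python
import math
import random
from typing import Callable, Dict, List, Tuple


def hoeffding_race(
    candidates: Dict[str, Callable[[int], float]],
    max_rounds: int = 300,
    delta: float = 0.05,
) -> Tuple[List[str], int]:
    """Race candidates whose scores lie in [0, 1]; higher is better.

    Returns the surviving candidate names and the number of evaluations spent.
    """
    scores: Dict[str, List[float]] = {name: [] for name in candidates}
    alive = list(candidates)
    evaluations = 0

    for round_id in range(max_rounds):
        # Evaluate every surviving candidate on the same round (e.g. the same fold).
        for name in alive:
            scores[name].append(candidates[name](round_id))
            evaluations += 1

        n = round_id + 1
        # Hoeffding half-width for the mean of n scores bounded in [0, 1].
        eps = math.sqrt(math.log(2.0 / delta) / (2.0 * n))
        means = {name: sum(scores[name]) / n for name in alive}
        best_lower = max(means.values()) - eps

        # Drop candidates whose upper bound falls below the leader's lower bound.
        alive = [name for name in alive if means[name] + eps >= best_lower]
        if len(alive) == 1:
            break

    return alive, evaluations


# Synthetic stand-ins for "train with strategy X and score it on round i".
rng = random.Random(0)


def noisy(mean: float) -> Callable[[int], float]:
    return lambda round_id: min(1.0, max(0.0, mean + rng.uniform(-0.05, 0.05)))


strategies = {
    "strategy_a": noisy(0.80),
    "strategy_b": noisy(0.60),
    "strategy_c": noisy(0.30),
}
survivors, evaluations = hoeffding_race(strategies)
print(survivors, evaluations)
```

In this sketch the saving comes from no longer evaluating a strategy once it has been eliminated, so the total number of evaluations is lower than scoring every strategy on every round. A real race would replace the synthetic generators with actual training and testing of each unbalanced-data strategy.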


Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dal Pozzolo, A., Caelen, O., Waterschoot, S., Bontempi, G. (2013). Racing for Unbalanced Methods Selection. In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2013. IDEAL 2013. Lecture Notes in Computer Science, vol 8206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41278-3_4

  • DOI: https://doi.org/10.1007/978-3-642-41278-3_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41277-6

  • Online ISBN: 978-3-642-41278-3

  • eBook Packages: Computer Science, Computer Science (R0)
