Boosting Inspired Process for Improving AUC

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6871)

Abstract

Boosting is a general method for combining a set of classifiers to make a final prediction. It has been shown to be an effective way to improve the predictive accuracy of a learning algorithm, but its impact on ranking performance is unknown. This paper introduces AUCBoost, a generic boosting algorithm for improving the ranking performance of learning algorithms. Unlike AdaBoost, AUCBoost uses the AUC of a classifier, rather than its accuracy, to calculate the weight of each training example used to build the next classifier. To simplify the computation of the AUC over weighted instances in AUCBoost, we extend the standard formula for calculating AUC to a weighted AUC formula (WAUC for short). This extension frees boosting from the resampling process and saves considerable computation time during training. Our experimental results show that AUCBoost improves on the ranking performance of AdaBoost when the base learning algorithm is the improved, ranking-oriented decision tree C4.4 or naïve Bayes.
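
The abstract does not reproduce the WAUC formula, so the following is only a minimal sketch of one natural way to weight the AUC, not necessarily the formula used in the paper. It assumes binary labels (1 for positive), a pairwise definition in which each positive-negative pair contributes the product of its two instance weights, and half credit for tied scores. The function name `weighted_auc` and its parameters are hypothetical.

```python
from typing import Sequence


def weighted_auc(scores: Sequence[float],
                 labels: Sequence[int],
                 weights: Sequence[float]) -> float:
    """Weighted fraction of (positive, negative) pairs ranked correctly.

    Assumption: this pairwise form is an illustration only; the paper's
    WAUC formula is not given in the abstract and may differ.
    """
    pos = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 1]
    neg = [(s, w) for s, y, w in zip(scores, labels, weights) if y != 1]

    # Total pair weight = (sum of positive weights) * (sum of negative weights).
    total = sum(w for _, w in pos) * sum(w for _, w in neg)
    if total == 0:
        raise ValueError("both classes need positive total weight")

    correct = 0.0
    for s_pos, w_pos in pos:
        for s_neg, w_neg in neg:
            if s_pos > s_neg:
                correct += w_pos * w_neg          # correctly ordered pair
            elif s_pos == s_neg:
                correct += 0.5 * w_pos * w_neg    # tie: half credit
    return correct / total


scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]

# Uniform weights reduce to the ordinary AUC.
auc_uniform = weighted_auc(scores, labels, [1, 1, 1, 1])

# Giving the third instance weight 3 ...
auc_weighted = weighted_auc(scores, labels, [1, 1, 3, 1])

# ... matches replicating that instance three times with unit weights.
auc_replicated = weighted_auc([0.9, 0.8, 0.3, 0.3, 0.3, 0.1],
                              [1, 0, 1, 1, 1, 0],
                              [1, 1, 1, 1, 1, 1])
```

Under these assumptions, an integer weight has the same effect as replicating the instance that many times, which illustrates why a weighted formula can stand in for resampling the training set.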

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sheng, V.S., Tada, R. (2011). Boosting Inspired Process for Improving AUC. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2011. Lecture Notes in Computer Science, vol 6871. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23199-5_15

  • DOI: https://doi.org/10.1007/978-3-642-23199-5_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23198-8

  • Online ISBN: 978-3-642-23199-5

  • eBook Packages: Computer Science, Computer Science (R0)
