
Training Classifiers for Unbalanced Distribution and Cost-Sensitive Domains with ROC Analysis

  • Conference paper
Advances in Knowledge Acquisition and Management (PKAW 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4303)

Abstract

ROC (Receiver Operating Characteristic) analysis has been used as a tool for the analysis and evaluation of two-class classifiers, even when the training data exhibit an unbalanced class distribution and cost sensitivity. However, ROC analysis has not been effectively extended to evaluate multi-class classifiers. In this paper, we propose an effective way to deal with multi-class learning using ROC analysis. An EMAUC algorithm is implemented to transform a multi-class training set into several two-class training sets, and classification is carried out with these two-class training sets. Empirical results demonstrate that classifiers trained with the proposed algorithm perform competitively in unbalanced-distribution and cost-sensitive domains.
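
The abstract does not spell out how EMAUC builds its two-class training sets, so the following is only a minimal sketch of one common way to reduce a multi-class problem to two-class problems for ROC analysis: a one-vs-rest relabelling, with the AUC of each two-class problem computed by pairwise ranking. It is not the authors' algorithm. The function names (`one_vs_rest_splits`, `binary_auc`, `macro_auc`) and the unweighted averaging of per-class AUCs are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def one_vs_rest_splits(labels: Sequence[str]) -> Dict[str, List[int]]:
    """Relabel a multi-class label list as one two-class list per class.

    For each class c, 1 marks examples of c and 0 marks every other example.
    (Hypothetical helper; assumes labels are sortable, e.g. strings.)
    """
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def binary_auc(scores: Sequence[float], binary_labels: Sequence[int]) -> float:
    """Two-class AUC: the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, binary_labels) if y == 1]
    neg = [s for s, y in zip(scores, binary_labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(
    score_table: Dict[str, Sequence[float]], labels: Sequence[str]
) -> Tuple[float, Dict[str, float]]:
    """Average the one-vs-rest AUCs with equal weight per class.

    score_table[c][i] is the score a classifier assigns to example i for
    class c. Equal weighting is an assumption of this sketch.
    """
    splits = one_vs_rest_splits(labels)
    per_class = {c: binary_auc(score_table[c], splits[c]) for c in splits}
    return sum(per_class.values()) / len(per_class), per_class


if __name__ == "__main__":
    # Toy, hand-made data with an unbalanced class distribution.
    labels = ["a", "b", "b", "b", "c", "b"]
    score_table = {
        "a": [0.9, 0.2, 0.1, 0.3, 0.4, 0.2],
        "b": [0.1, 0.7, 0.8, 0.6, 0.3, 0.2],
        "c": [0.2, 0.1, 0.3, 0.1, 0.6, 0.5],
    }
    mean_auc, per_class = macro_auc(score_table, labels)
    print(per_class, mean_auc)
```

Because AUC depends only on how positives rank against negatives, each two-class score is unaffected by how rare the positive class is, which is one reason ROC-based measures are attractive for unbalanced data.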




Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, X., Jiang, C., Luo, Mj. (2006). Training Classifiers for Unbalanced Distribution and Cost-Sensitive Domains with ROC Analysis. In: Hoffmann, A., Kang, Bh., Richards, D., Tsumoto, S. (eds) Advances in Knowledge Acquisition and Management. PKAW 2006. Lecture Notes in Computer Science, vol 4303. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11961239_8

  • DOI: https://doi.org/10.1007/11961239_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68955-3

  • Online ISBN: 978-3-540-68957-7

  • eBook Packages: Computer Science, Computer Science (R0)
