
Cost-Sensitive Classifier Evaluation Using Cost Curves

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5012)


Abstract

The evaluation of classifier performance in a cost-sensitive setting is straightforward if the operating conditions (misclassification costs and class distributions) are fixed and known. When this is not the case, evaluation requires a method of visualizing classifier performance across the full range of possible operating conditions. This talk outlines the most important requirements for cost-sensitive classifier evaluation for machine learning and KDD researchers and practitioners, and introduces a recently developed technique for classifier performance visualization – the cost curve – that meets all these requirements.
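The construction the abstract alludes to can be sketched in a few lines. In a cost curve (as introduced by Drummond and Holte), the x-axis is the probability-cost value pc in [0, 1] (a normalized combination of class distribution and misclassification costs), and each classifier's performance at one operating point is the line FNR·pc + FPR·(1 − pc); the curve for a classifier with several operating points is the lower envelope of those lines. The sketch below assumes each operating point is summarized only by its false-negative and false-positive rates, and the function and variable names are illustrative:

```python
# Sketch of cost-curve construction (Drummond & Holte style).
# Each operating point is a (FNR, FPR) pair; pc is the probability-cost
# value on the x-axis. Names here are illustrative, not from the paper.

def expected_cost(fnr, fpr, pc):
    """Normalized expected cost of one operating point at a given pc."""
    return fnr * pc + fpr * (1.0 - pc)

def lower_envelope(operating_points, pcs):
    """At each pc, the minimum cost achievable over all operating points."""
    return [min(expected_cost(fnr, fpr, pc) for fnr, fpr in operating_points)
            for pc in pcs]

# Three hypothetical operating points of a classifier (FNR, FPR):
points = [(0.05, 0.40), (0.20, 0.15), (0.45, 0.02)]
pcs = [i / 10 for i in range(11)]      # sample the full range of pc
envelope = lower_envelope(points, pcs)
```

At pc = 0 the cost of each line reduces to its FPR, and at pc = 1 to its FNR, so the envelope's endpoints are the best FPR and best FNR among the operating points; plotting `envelope` against `pcs` gives the cost curve across all operating conditions.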






Editor information

Takashi Washio, Einoshin Suzuki, Kai Ming Ting, Akihiro Inokuchi


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Holte, R.C., Drummond, C. (2008). Cost-Sensitive Classifier Evaluation Using Cost Curves. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_4

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-68125-0_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68124-3

  • Online ISBN: 978-3-540-68125-0

  • eBook Packages: Computer Science (R0)
