
Cost-Sensitive Classifier Evaluation Using Cost Curves

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5012)


Abstract

The evaluation of classifier performance in a cost-sensitive setting is straightforward if the operating conditions (misclassification costs and class distributions) are fixed and known. When this is not the case, evaluation requires a method of visualizing classifier performance across the full range of possible operating conditions. This talk outlines the most important requirements for cost-sensitive classifier evaluation for machine learning and KDD researchers and practitioners, and introduces a recently developed technique for classifier performance visualization – the cost curve – that meets all these requirements.
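The construction the abstract alludes to can be sketched in a few lines. In a cost curve (as introduced by Drummond and Holte), the x-axis is the probability-cost value pc in [0, 1] (a normalized combination of class distribution and misclassification costs), and each classifier's performance at one operating point is the line FNR·pc + FPR·(1 − pc); the curve for a classifier with several operating points is the lower envelope of those lines. The sketch below assumes each operating point is summarized only by its false-negative and false-positive rates, and the function and variable names are illustrative:

```python
# Sketch of cost-curve construction (Drummond & Holte style).
# Each operating point is a (FNR, FPR) pair; pc is the probability-cost
# value on the x-axis. Names here are illustrative, not from the paper.

def expected_cost(fnr, fpr, pc):
    """Normalized expected cost of one operating point at a given pc."""
    return fnr * pc + fpr * (1.0 - pc)

def lower_envelope(operating_points, pcs):
    """At each pc, the minimum cost achievable over all operating points."""
    return [min(expected_cost(fnr, fpr, pc) for fnr, fpr in operating_points)
            for pc in pcs]

# Three hypothetical operating points of a classifier (FNR, FPR):
points = [(0.05, 0.40), (0.20, 0.15), (0.45, 0.02)]
pcs = [i / 10 for i in range(11)]      # sample the full range of pc
envelope = lower_envelope(points, pcs)
```

At pc = 0 the cost of each line reduces to its FPR, and at pc = 1 to its FNR, so the envelope's endpoints are the best FPR and best FNR among the operating points; plotting `envelope` against `pcs` gives the cost curve across all operating conditions.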






Editor information

Takashi Washio, Einoshin Suzuki, Kai Ming Ting, Akihiro Inokuchi


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Holte, R.C., Drummond, C. (2008). Cost-Sensitive Classifier Evaluation Using Cost Curves. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_4

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-68125-0_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68124-3

  • Online ISBN: 978-3-540-68125-0

  • eBook Packages: Computer Science (R0)
