Similarity-Binning Averaging: A Generalisation of Binning Calibration

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 5788))

Abstract

In this paper we revisit the problem of classifier calibration, motivated by the fact that existing calibration methods ignore the problem attributes (i.e., they are univariate). We propose a new calibration method, inspired by binning-based methods, in which the calibrated probabilities are obtained from k instances from a dataset. Bins are constructed by including the k most similar instances, considering not only the estimated probabilities but also the original attributes. The method has been tested with respect to two calibration measures and compared with other traditional calibration methods. The results show that the new method outperforms the most commonly used calibration methods.
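The bin construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `similarity_bin_calibrate`, the use of Euclidean distance as the similarity measure, and the choice to report the class frequencies of the k neighbours as the calibrated probabilities are all assumptions made for illustration. The abstract only states that bins contain the k most similar instances, with similarity computed over both the original attributes and the estimated probabilities.

```python
import math


def similarity_bin_calibrate(attrs, probs, val_attrs, val_probs, val_labels, k=3):
    """Hypothetical sketch of similarity-based binning for one instance.

    attrs, probs: original attributes and estimated class probabilities
        of the instance to calibrate.
    val_attrs, val_probs, val_labels: the same information for a labelled
        validation set (labels are class indices 0..n_classes-1).
    """
    n_classes = len(probs)
    # Assumption: similarity is Euclidean distance over the attributes
    # extended with the estimated probabilities.
    query = list(attrs) + list(probs)
    scored = []
    for a, p, y in zip(val_attrs, val_probs, val_labels):
        candidate = list(a) + list(p)
        scored.append((math.dist(query, candidate), y))
    # The bin is the k validation instances closest to the query.
    scored.sort(key=lambda pair: pair[0])
    bin_labels = [y for _, y in scored[:k]]
    # Assumption: the calibrated probability of each class is its
    # relative frequency inside the bin.
    return [bin_labels.count(c) / len(bin_labels) for c in range(n_classes)]


# Toy validation set: one numeric attribute, two classes.
val_attrs = [[0.0], [0.1], [0.2], [0.9], [1.0], [1.1]]
val_probs = [[0.8, 0.2], [0.7, 0.3], [0.9, 0.1],
             [0.3, 0.7], [0.2, 0.8], [0.4, 0.6]]
val_labels = [0, 0, 1, 1, 1, 1]

calibrated = similarity_bin_calibrate([0.1], [0.8, 0.2],
                                      val_attrs, val_probs, val_labels, k=3)
```

Dropping the attributes from the distance computation would leave a bin built from the estimated probabilities alone, which is the univariate behaviour the abstract attributes to existing methods. In practice the attributes would probably need to be normalised so that no single attribute dominates the distance. The abstract does not say how this is handled.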

This work has been partially supported by the EU (FEDER) and the Spanish MEC/MICINN, under grant TIN 2007-68093-C02, and by the Spanish project “Agreement Technologies” (Consolider Ingenio CSD2007-00022).

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bella, A., Ferri, C., Hernández-Orallo, J., Ramírez-Quintana, M.J. (2009). Similarity-Binning Averaging: A Generalisation of Binning Calibration. In: Corchado, E., Yin, H. (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2009. IDEAL 2009. Lecture Notes in Computer Science, vol 5788. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04394-9_42

  • DOI: https://doi.org/10.1007/978-3-642-04394-9_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04393-2

  • Online ISBN: 978-3-642-04394-9

  • eBook Packages: Computer Science, Computer Science (R0)