Abstract
We propose a general method for assessing the reliability of two-class probabilities on an instance-wise basis. This is relevant, for instance, when obtaining calibrated multi-class probabilities from two-class probability scores. The LS-ECOC method approaches this by performing least-squares fitting over a suitable error-correcting output code matrix, where the optimisation resolves potential conflicts in the input probabilities. However, LS-ECOC gives all input probabilities equal weight, whereas we would like to spend less effort fitting unreliable probability estimates. We introduce the concept of a reliability map to accompany the more conventional notion of a calibration map, and LS-ECOC-R, which modifies LS-ECOC to take reliability into account. We demonstrate on synthetic data that this gets us closer to the Bayes-optimal classifier, even if the base classifiers are linear and hence have high bias. Results on UCI data sets demonstrate that multi-class accuracy also improves.
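The least-squares idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the function name ls_ecoc, the 0/1 coding of the code matrix, the clip-and-renormalise step used in place of a properly constrained solver, and in particular the optional weights argument, which stands in for "taking reliability into account" as a plain weighted least-squares fit. The abstract does not specify how LS-ECOC-R actually uses reliabilities.

```python
import numpy as np


def ls_ecoc(code_matrix, q, weights=None):
    """Estimate multi-class probabilities from two-class probabilities.

    code_matrix: (K, L) array of 0/1 entries; column l marks the classes
                 that binary classifier l treats as positive.
    q:           (L,) array of two-class probability estimates.
    weights:     optional (L,) non-negative weights (hypothetical stand-in
                 for per-classifier reliability); None means equal weights.
    """
    M = np.asarray(code_matrix, dtype=float)
    q = np.asarray(q, dtype=float)
    w = np.ones_like(q) if weights is None else np.asarray(weights, dtype=float)

    # Weighted least squares: minimise sum_l w_l * (M[:, l] @ p - q_l)^2.
    # Scaling each equation by sqrt(w_l) turns it into ordinary least squares.
    sw = np.sqrt(w)
    A = M.T * sw[:, None]
    b = q * sw
    p, *_ = np.linalg.lstsq(A, b, rcond=None)

    # Simplification: project onto the simplex by clipping and renormalising
    # instead of solving the constrained problem exactly.
    p = np.clip(p, 0.0, None)
    total = p.sum()
    if total == 0.0:
        return np.full(M.shape[0], 1.0 / M.shape[0])
    return p / total


# Exhaustive code for 4 classes (7 binary problems), chosen for illustration.
M = np.array([
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1],
    [0, 1, 0, 1, 0, 1, 0],
])
p_true = np.array([0.5, 0.3, 0.15, 0.05])

# Perfectly consistent two-class probabilities, then one corrupted estimate.
q = M.T @ p_true
q_noisy = q.copy()
q_noisy[3] += 0.3

p_equal = ls_ecoc(M, q_noisy)                      # all inputs weighted equally
w = np.ones(7)
w[3] = 1e-3                                        # down-weight the unreliable input
p_weighted = ls_ecoc(M, q_noisy, weights=w)

print("equal weights:   ", np.round(p_equal, 3))
print("reliability-like:", np.round(p_weighted, 3))
```

In this toy setting the unreliable estimate is known in advance. The point of a reliability map, as described in the abstract, is to estimate such reliabilities per instance, so the hand-set weights here only show where that information would enter the fit.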
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kull, M., Flach, P.A. (2014). Reliability Maps: A Tool to Enhance Probability Estimates and Improve Classification Accuracy. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2014. Lecture Notes in Computer Science, vol 8725. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44851-9_2
DOI: https://doi.org/10.1007/978-3-662-44851-9_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44850-2
Online ISBN: 978-3-662-44851-9
eBook Packages: Computer Science (R0)