Abstract
Machine learning and data mining approaches are nowadays used in many fields as valuable data analysis tools. However, their serious practical use is hampered by the fact that, more often than not, they cannot produce reliable and unbiased assessments of their predictions’ quality. In recent years, several approaches for estimating the reliability or confidence of individual classifications have emerged, many of them building upon the algorithmic theory of randomness: (in historical order) transduction-based confidence estimation, typicalness-based confidence estimation, and transductive reliability estimation. In this chapter we describe the typicalness and transductive reliability estimation frameworks and propose a joint approach that compensates for their respective weaknesses by integrating typicalness-based confidence estimation and transductive reliability estimation into a joint confidence machine. The resulting confidence machine produces confidence values in the statistical sense (e.g., a confidence level of 95% means that in 95% of cases the predicted class is also the true class) and provides a general principle that is independent of the particular underlying classifier.
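To make the typicalness framework concrete, the following is a minimal sketch of typicalness-based confidence estimation in the transductive style: a tentative label is assigned to a new example, a nonconformity score is computed for every example, and the typicalness (p-value) of that labelling is the fraction of examples at least as nonconforming as the new one. The nearest-neighbour nonconformity score (nearest same-class distance divided by nearest other-class distance) and the toy data are illustrative assumptions, not the chapter's exact construction.

```python
import numpy as np

def nonconformity(X, y, i):
    """Nearest-neighbour nonconformity of example i within (X, y):
    large values mean i is atypical for its assigned class."""
    d = np.linalg.norm(X - X[i], axis=1)
    d[i] = np.inf  # exclude the example itself
    nearest_same = d[y == y[i]].min()
    nearest_other = d[y != y[i]].min()
    return nearest_same / nearest_other

def p_value(X, y, x_new, label):
    """Typicalness of labelling x_new with `label`: the fraction of
    examples whose nonconformity is at least that of x_new."""
    X_aug = np.vstack([X, x_new])
    y_aug = np.append(y, label)
    scores = np.array([nonconformity(X_aug, y_aug, i)
                       for i in range(len(y_aug))])
    return np.mean(scores >= scores[-1])

# Toy training set: two well-separated classes.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [3.0, 3.0], [3.1, 2.9], [2.9, 3.1]])
y = np.array([0, 0, 0, 1, 1, 1])

x_new = np.array([0.15, 0.1])
p = {label: p_value(X, y, x_new, label) for label in (0, 1)}

prediction = max(p, key=p.get)              # label with largest p-value
confidence = 1.0 - sorted(p.values())[-2]   # 1 minus second largest p-value
credibility = p[prediction]                 # largest p-value
print(prediction, confidence, credibility)
```

The new example lies inside the class-0 cluster, so labelling it 0 is highly typical while labelling it 1 is not, yielding prediction 0 with high confidence and credibility. Confidence reflects how firmly the alternative labels can be rejected; credibility warns when the example is atypical for every label.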
Copyright information
© 2012 Springer Science+Business Media, LLC
Kukar, M. (2012). Transductive Reliability Estimation for Individual Classifications in Machine Learning and Data Mining. In: Dai, H., Liu, J., Smirnov, E. (eds) Reliable Knowledge Discovery. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-1903-7_1
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4614-1902-0
Online ISBN: 978-1-4614-1903-7