Estimating Prediction Certainty in Decision Trees

Costa, Eduardo P.; Verwer, Sicco; Blockeel, Hendrik

doi:10.1007/978-3-642-41398-8_13

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8207)

Included in the following conference series:

International Symposium on Intelligent Data Analysis

Abstract

Decision trees estimate prediction certainty using the class distribution in the leaf responsible for the prediction. We introduce an alternative method that yields better estimates. For each instance to be classified, our method inserts the instance into the training set with one of the possible labels for the target attribute and learns a tree from the result; this procedure is repeated for each of the labels. By comparing the outcomes of the resulting trees, the method identifies instances that may be difficult to classify correctly and attributes some uncertainty to their predictions. We perform an extensive evaluation of the proposed method and show that it is particularly suitable for ranking and reliability estimation. The ideas investigated in this paper may also be applied to other machine learning techniques, as well as combined with other methods for prediction certainty estimation.
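
The label-insertion procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: `fit_stump` is a hypothetical one-split stand-in for a full decision tree learner, and the way the trees are compared (taking, for each candidate label, the share of that label in the leaf that holds the inserted instance, then normalising) is one plausible choice that the abstract does not specify.

```python
from collections import Counter


def fit_stump(X, y):
    """Learn a one-split tree (a stump) by minimising weighted Gini impurity.

    ASSUMPTION: a stand-in for a real tree learner, kept tiny on purpose.
    Returns a function mapping a row to the class counts of its leaf.
    """
    def gini(labels):
        n = len(labels)
        return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

    best = None
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between two observed values
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t, Counter(left), Counter(right))

    if best is None:  # all rows identical: a single leaf with all labels
        counts = Counter(y)
        return lambda row: counts
    _, f, t, left, right = best
    return lambda row: left if row[f] <= t else right


def certainty_by_label_insertion(X, y, x_new, fit=fit_stump):
    """Insert x_new once per candidate label, relearn, and compare the trees.

    ASSUMPTION: the comparison rule below is illustrative, not taken from
    the paper. For each label, the score is the fraction of that label in
    the leaf containing x_new; scores are then normalised to sum to 1.
    """
    scores = {}
    for lab in sorted(set(y)):
        tree = fit(X + [x_new], y + [lab])  # training set plus (x_new, lab)
        counts = tree(x_new)
        scores[lab] = counts[lab] / sum(counts.values())
    total = sum(scores.values())
    return {lab: s / total for lab, s in scores.items()}


# Toy data: class "a" at low values, class "b" at high values.
X = [[1.0], [2.0], [3.0], [8.0], [9.0], [10.0]]
y = ["a", "a", "a", "b", "b", "b"]

clear_case = certainty_by_label_insertion(X, y, [1.5])
borderline = certainty_by_label_insertion(X, y, [5.5])
```

On this toy data, an instance deep inside the region of class "a" keeps a high score for "a" whichever label it is inserted with, whereas an instance midway between the two classes is absorbed equally well under either label, so the two scores tie. That tie is the kind of disagreement between trees that the method uses to flag an uncertain prediction.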

Author information

Authors and Affiliations

Department of Computer Science, KU Leuven, Leuven, Belgium
Eduardo P. Costa & Hendrik Blockeel
Institute for Computing and Information Sciences, Radboud University Nijmegen, The Netherlands
Sicco Verwer
Leiden Institute of Advanced Computer Science, Universiteit Leiden, Leiden, The Netherlands
Hendrik Blockeel

Editor information

Editors and Affiliations

School of Information Systems, Computing and Mathematics, Brunel University, UB8 3PH, Uxbridge, Middlesex, UK
Allan Tucker & Stephen Swift
Faculty of Computer Science/IT, Ostfalia University of Applied Sciences, Am Exer 2, 38302, Wolfenbüttel, Germany
Frank Höppner
Faculty of Science, Department of Information and Computing Science, Buys Ballot Laboratory, Universiteit Utrecht, Princetonplein 5, 3584 CC, Utrecht, The Netherlands
Arno Siebes

About this paper

Cite this paper

Costa, E.P., Verwer, S., Blockeel, H. (2013). Estimating Prediction Certainty in Decision Trees. In: Tucker, A., Höppner, F., Siebes, A., Swift, S. (eds) Advances in Intelligent Data Analysis XII. IDA 2013. Lecture Notes in Computer Science, vol 8207. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41398-8_13

DOI: https://doi.org/10.1007/978-3-642-41398-8_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41397-1
Online ISBN: 978-3-642-41398-8
eBook Packages: Computer Science, Computer Science (R0)
