End-to-End Learning of Deterministic Decision Trees

  • Thomas M. Hehn
  • Fred A. Hamprecht
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11269)

Abstract

Conventional decision trees have a number of favorable properties, including interpretability, a small computational footprint, and the ability to learn from little training data. However, they lack a key quality that has helped fuel the deep learning revolution: that of being end-to-end trainable. Kontschieder et al. [13] address this deficit, but at the cost of losing a main attraction of decision trees: the fact that each sample is routed along only a small subset of tree nodes. We here propose a model and an Expectation-Maximization training scheme for decision trees that are fully probabilistic at train time, but become deterministic at test time after an annealing process. We analyze the learned oblique split parameters on image datasets and show that neural networks can be trained at each split. In summary, we present an end-to-end learning scheme for deterministic decision trees and report results on par with or superior to published standard oblique decision tree algorithms.
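To make the annealing idea concrete, the following is a minimal sketch, not the authors' implementation: it assumes a sigmoid routing function over an oblique (linear) split and a scalar steepness parameter that is increased during training, so that the initially probabilistic routing hardens into the deterministic tree used at test time. All names and the annealing schedule are illustrative.

import numpy as np

def soft_split(x, w, b, steepness=1.0):
    # Probabilistic routing through an oblique split: the hyperplane w.x + b
    # defines the split, and the sigmoid maps the signed distance to a
    # probability of sending the sample to the left child.
    return 1.0 / (1.0 + np.exp(-steepness * (np.dot(w, x) + b)))

def hard_split(x, w, b):
    # Deterministic test-time routing: the limit of soft_split for large
    # steepness, i.e. go left iff the sample lies on the positive side.
    return np.dot(w, x) + b > 0

# Illustrative annealing loop (hypothetical schedule): each epoch would run an
# E-step (compute leaf responsibilities from the soft routing) and an M-step
# (update split parameters w, b and the leaf distributions), then sharpen.
steepness = 1.0
for epoch in range(10):
    # E-step / M-step omitted; only the annealing of the routing is sketched.
    steepness *= 1.5

In this reading, end-to-end training operates on the probabilistic model, while the final model retains the cheap single-path inference of a conventional decision tree.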

Notes

Acknowledgments

The authors gratefully acknowledge financial support by DFG grant HA 4364/10-1.

Supplementary material

Supplementary material 1: 480455_1_En_42_MOESM1_ESM.pdf (PDF, 375 KB)

References

  1. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
  2. Breiman, L., Friedman, J., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Chapman & Hall/CRC, London (1984)
  3. Cardona, A., et al.: An integrated micro- and macroarchitectural analysis of the Drosophila brain by computer-assisted serial section electron microscopy. PLOS Biol. 8(10), 1–17 (2010). https://doi.org/10.1371/journal.pbio.1000502
  4. Criminisi, A., Shotton, J.: Decision Forests for Computer Vision and Medical Image Analysis. Springer, Berlin (2013). https://doi.org/10.1007/978-1-4471-4929-3
  5. Eilers, P.H.C., Marx, B.D.: Flexible smoothing with B-splines and penalties. Stat. Sci. 11, 89–121 (1996)
  6. Fan, R.E., Lin, C.J.: LIBSVM data: classification, regression and multi-label (2011). http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
  7. Fernández-Delgado, M., Cernadas, E., Barro, S., Amorim, D.: Do we need hundreds of classifiers to solve real world classification problems? J. Mach. Learn. Res. 15, 3133–3181 (2014)
  8. Gall, J., Lempitsky, V.: Class-specific Hough forests for object detection. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1022–1029, June 2009. https://doi.org/10.1109/CVPR.2009.5206740
  9. Ioannou, Y., et al.: Decision forests, convolutional networks and the models in-between. arXiv:1603.01250 (March 2016)
  10. Jordan, M.I.: A statistical approach to decision tree modeling. In: Proceedings of the Seventh Annual Conference on Computational Learning Theory, COLT 1994, New York, NY, USA, pp. 13–20 (1994)
  11. Jordan, M.I., Jacobs, R.A.: Hierarchical mixtures of experts and the EM algorithm. Neural Comput. 6(2), 181–214 (1994). https://doi.org/10.1162/neco.1994.6.2.181
  12. Kingma, D., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (2015)
  13. Kontschieder, P., Fiterau, M., Criminisi, A., Rota Bulò, S.: Deep neural decision forests. In: ICCV (2015)
  14. Kontschieder, P., Kohli, P., Shotton, J., Criminisi, A.: GeoF: geodesic forests for learning coupled predictors. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2013
  15. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
  16. Lepetit, V., Lagger, P., Fua, P.: Randomized trees for real-time keypoint recognition. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), vol. 2, pp. 775–781, June 2005. https://doi.org/10.1109/CVPR.2005.288
  17. McGill, M., Perona, P.: Deciding how to decide: dynamic routing in artificial neural networks. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International Conference on Machine Learning, PMLR, vol. 70, pp. 2363–2372, Sydney, Australia, August 2017. http://proceedings.mlr.press/v70/mcgill17a.html
  18. Menze, B.H., Kelm, B.M., Splitthoff, D.N., Koethe, U., Hamprecht, F.A.: On oblique random forests. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds.) ECML PKDD 2011. LNCS (LNAI), vol. 6912, pp. 453–469. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-23783-6_29
  19. Montillo, A., et al.: Entanglement and differentiable information gain maximization. In: Criminisi, A., Shotton, J. (eds.) Decision Forests for Computer Vision and Medical Image Analysis. ACVPR, pp. 273–293. Springer, London (2013). https://doi.org/10.1007/978-1-4471-4929-3_19
  20. Murthy, K.V.S.: On growing better decision trees from data. Ph.D. thesis, The Johns Hopkins University (1996)
  21. Norouzi, M., Collins, M.D., Fleet, D.J., Kohli, P.: CO2 forest: improved random forest by continuous optimization of oblique splits. arXiv:1506.06155 (2015)
  22. Norouzi, M., Collins, M.D., Johnson, M., Fleet, D.J., Kohli, P.: Efficient non-greedy optimization of decision trees. In: NIPS, December 2015
  23.
  24. Quinlan, J.R.: Induction of decision trees. In: Shavlik, J.W., Dietterich, T.G. (eds.) Readings in Machine Learning. Morgan Kaufmann, Los Altos (1990). Originally published in Mach. Learn. 1, 81–106 (1986)
  25. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers Inc., San Francisco (1993)
  26. Richmond, D., Kainmueller, D., Yang, M., Myers, E., Rother, C.: Mapping auto-context decision forests to deep ConvNets for semantic segmentation. In: Wilson, R.C., Hancock, E.R., Smith, W.A.P. (eds.) Proceedings of the British Machine Vision Conference (BMVC), pp. 144.1–144.12. BMVA Press, September 2016. https://doi.org/10.5244/C.30.144
  27. Rose, K., Gurewitz, E., Fox, G.C.: Statistical mechanics and phase transitions in clustering. Phys. Rev. Lett. 65, 945–948 (1990). https://doi.org/10.1103/PhysRevLett.65.945
  28. Rota Bulò, S., Kontschieder, P.: Neural decision forests for semantic image labelling. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2014
  29. Sethi, I.K.: Entropy nets: from decision trees to neural networks. Proc. IEEE 78(10), 1605–1613 (1990)
  30. Suárez, A., Lutsko, J.F.: Globally optimal fuzzy decision trees for classification and regression. IEEE Trans. Pattern Anal. Mach. Intell. 21(12), 1297–1311 (1999)
  31. Welbl, J.: Casting random forests as artificial neural networks (and profiting from it). In: Jiang, X., Hornegger, J., Koch, R. (eds.) GCPR 2014. LNCS, vol. 8753, pp. 765–771. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11752-2_66
  32. Xiao, H., Rasul, K., Vollgraf, R.: Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv:1708.07747 (2017)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Heidelberg Collaboratory for Image Processing, Interdisciplinary Center for Scientific Computing, Heidelberg University, Heidelberg, Germany
