Abstract
Tree ensembles, such as Random Forest (RF), are popular methods in machine learning because of their efficiency and strong performance. However, they typically grow big trees and large forests, which limits their use in many memory-constrained applications. In this paper, we propose the Random decision Directed Acyclic Graph (RDAG), which employs an entropy-based pre-pruning and node-merging strategy to reduce the number of nodes in a random forest. Empirical results show that the resulting model, which is a DAG, dramatically reduces model size while achieving classification performance competitive with RF.
Supported by the Natural Science Foundation of China (61672441, 61673324), the Natural Science Foundation of Fujian (2018J01097), and the Shenzhen Basic Research Program (JCYJ20170818141325209).
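The abstract names entropy-based pre-pruning without giving the rule, so the following is only a minimal sketch of the general idea, not the authors' RDAG algorithm. It assumes a node stops splitting once the Shannon entropy of its class labels falls to or below a threshold. The function names `shannon_entropy` and `should_stop_splitting` and the default threshold of 0.3 bits are hypothetical, chosen for illustration. Node merging is not shown.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Return the Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # Sum p * log2(1/p) over the classes that actually occur at this node.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def should_stop_splitting(labels, entropy_threshold=0.3):
    """Hypothetical pre-pruning rule: stop when the node is nearly pure.

    The threshold value is an assumption for illustration only.
    """
    return shannon_entropy(labels) <= entropy_threshold


if __name__ == "__main__":
    pure_node = ["a", "a", "a", "a"]
    mixed_node = ["a", "a", "b", "b"]
    print(should_stop_splitting(pure_node))   # nearly pure node: stop
    print(should_stop_splitting(mixed_node))  # evenly mixed node: keep splitting
```

Under this assumption, a higher threshold stops growth earlier and yields fewer nodes, at the possible cost of accuracy.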
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Liu, X., Liu, X., Lai, Y., Yang, F., Zeng, Y. (2019). Random Decision DAG: An Entropy Based Compression Approach for Random Forest. In: Li, G., Yang, J., Gama, J., Natwichai, J., Tong, Y. (eds) Database Systems for Advanced Applications. DASFAA 2019. Lecture Notes in Computer Science, vol 11448. Springer, Cham. https://doi.org/10.1007/978-3-030-18590-9_37
DOI: https://doi.org/10.1007/978-3-030-18590-9_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-18589-3
Online ISBN: 978-3-030-18590-9
eBook Packages: Computer Science, Computer Science (R0)