Random Decision DAG: An Entropy Based Compression Approach for Random Forest

Liu, Xin; Liu, Xiao; Lai, Yongxuan; Yang, Fan; Zeng, Yifeng

doi:10.1007/978-3-030-18590-9_37

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11448)

Included in the following conference series: International Conference on Database Systems for Advanced Applications

Abstract

Tree ensembles such as Random Forest (RF) are popular methods in machine learning because of their efficiency and superior performance. However, they tend to grow deep trees and large forests, which limits their use in memory-constrained applications. In this paper, we propose the Random decision Directed Acyclic Graph (RDAG), which employs an entropy-based pre-pruning and node-merging strategy to reduce the number of nodes in a random forest. Empirical results show that the resulting model, which is a DAG, dramatically reduces model size while achieving classification performance competitive with RF.
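
The abstract does not spell out the pruning or merging criteria, so the following is only a minimal sketch of what an entropy-based pre-pruning test and a node-merging key could look like. The function names, the threshold of 0.3 bits, and the rounding precision are all hypothetical choices made for illustration, not the authors' method.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of the class labels reaching a node."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def should_split(labels, threshold=0.3):
    """Hypothetical pre-pruning rule: keep splitting only while the node
    is still impure, i.e. its entropy exceeds the assumed threshold."""
    return entropy(labels) > threshold


def merge_key(labels, precision=1):
    """Hypothetical merging key: the node's rounded class distribution.
    Nodes that share a key are candidates for merging into one DAG node."""
    n = len(labels)
    counts = Counter(labels)
    return tuple(sorted((cls, round(c / n, precision)) for cls, c in counts.items()))


# A nearly pure node is not split further; a balanced one is.
nearly_pure = ["a"] * 99 + ["b"]
balanced = ["a", "a", "b", "b"]
print(should_split(nearly_pure), should_split(balanced))
```

Under these assumptions, two nodes from different trees whose class distributions round to the same values would map to the same key and could share a single node, which is one way a forest can collapse into a DAG.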

Supported by the Natural Science Foundation of China (61672441, 61673324), the Natural Science Foundation of Fujian (2018J01097), and the Shenzhen Basic Research Program (JCYJ20170818141325209).

Author information

Authors and Affiliations

Department of Automation, Xiamen University, Xiamen, China
Xin Liu, Xiao Liu & Fan Yang
Shenzhen Research Institute/Software School, Xiamen University, Xiamen, China
Yongxuan Lai & Fan Yang
School of Computing, Teesside University, Middlesbrough, UK
Yifeng Zeng

Corresponding author

Correspondence to Fan Yang.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Guoliang Li
Duke University, Durham, NC, USA
Jun Yang
University of Porto, Porto, Portugal
Joao Gama
Chiang Mai University, Chiang Mai, Thailand
Juggapong Natwichai
Beihang University, Beijing, China
Yongxin Tong

Cite this paper

Liu, X., Liu, X., Lai, Y., Yang, F., Zeng, Y. (2019). Random Decision DAG: An Entropy Based Compression Approach for Random Forest. In: Li, G., Yang, J., Gama, J., Natwichai, J., Tong, Y. (eds) Database Systems for Advanced Applications. DASFAA 2019. Lecture Notes in Computer Science, vol 11448. Springer, Cham. https://doi.org/10.1007/978-3-030-18590-9_37

DOI: https://doi.org/10.1007/978-3-030-18590-9_37
Published: 24 April 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-18589-3
Online ISBN: 978-3-030-18590-9
eBook Packages: Computer Science, Computer Science (R0)
