Cognitive Computation, Volume 11, Issue 5, pp 685–696

Diversity-Based Random Forests with Sample Weight Learning

  • Chun Yang
  • Xu-Cheng Yin


Given a variety of classifiers, one prevalent approach to classifier ensembles is to combine diverse component classifiers, i.e., diversity-based ensembles, and many previous works show that such ensembles can improve classification accuracy. Random forests are among the most important ensembles. However, most random forest approaches that address diversity focus on maximizing tree diversity while producing and training the component trees. As an alternative, this paper proposes a novel cognitive-inspired diversity-based random forests method, diversity-based random forests via sample weight learning (DRFS). Given numerous component trees from the original random forests, DRFS selects and combines tree classifiers adaptively via diversity learning and sample weight learning. By designing a matrix for the data distribution, a unified optimization model is formulated to learn and select diverse trees, where tree weights are learned by solving a convex quadratic programming problem with sample weights. Moreover, a self-training algorithm is proposed to solve the convex optimization iteratively and learn sample weights automatically. Comparative experiments are conducted on 39 typical UCI classification benchmarks and a variety of real-world text categorization benchmarks. The experiments show that the proposed method outperforms the traditional methods: DRFS selects and combines tree classifiers adaptively and improves performance on a variety of classification tasks.
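
The alternation described above (learn tree weights with a convex quadratic program under fixed sample weights, then re-learn the sample weights) can be illustrated with a minimal sketch. It is not the DRFS formulation from the paper: the abstract does not state the objective, so the squared-margin loss, the exponential sample-weight update, the ±1 vote matrix, and every function name below are assumptions chosen only to show the general shape of such a loop. The quadratic program is solved here by projected gradient descent on the probability simplex, which tends to drive some tree weights to zero and so acts as a form of tree selection.

```python
import math

# Illustrative sketch only. The objective, the sample-weight update, and all
# names are assumptions, not the formulation used in the paper.
# votes[i][t] is +1 if tree t classifies sample i correctly, -1 otherwise.


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t  # the last k satisfying the condition fixes the threshold
    return [max(x - theta, 0.0) for x in v]


def margins(votes, w):
    """Weighted ensemble margin of each sample: sum_t w[t] * votes[i][t]."""
    return [sum(w_t * v for w_t, v in zip(w, row)) for row in votes]


def learn_tree_weights(votes, s, steps=500):
    """Minimise sum_i s[i] * (1 - margin_i)^2 over the simplex (a convex QP)."""
    n_trees = len(votes[0])
    # With sum(s) == 1 and +/-1 votes, the gradient is Lipschitz with constant
    # at most 2 * n_trees, so this step size keeps projected gradient stable.
    lr = 1.0 / (2.0 * n_trees)
    w = [1.0 / n_trees] * n_trees
    for _ in range(steps):
        grad = [0.0] * n_trees
        for s_i, m_i, row in zip(s, margins(votes, w), votes):
            coeff = -2.0 * s_i * (1.0 - m_i)
            for t, v in enumerate(row):
                grad[t] += coeff * v
        w = project_to_simplex([w_t - lr * g for w_t, g in zip(w, grad)])
    return w


def update_sample_weights(votes, w):
    """Assumed rule: samples with a smaller margin receive a larger weight."""
    raw = [math.exp(-m) for m in margins(votes, w)]
    total = sum(raw)
    return [r / total for r in raw]


def fit(votes, rounds=5):
    """Alternate between tree-weight learning and sample-weight updates."""
    n_samples = len(votes)
    s = [1.0 / n_samples] * n_samples
    w = [1.0 / len(votes[0])] * len(votes[0])
    for _ in range(rounds):
        w = learn_tree_weights(votes, s)
        s = update_sample_weights(votes, w)
    return w, s
```

Calling fit on a small vote matrix, for example fit([[1, 1, -1], [1, -1, 1], [-1, 1, 1], [1, 1, 1]]), returns one weight per tree and one weight per sample, each set non-negative and summing to one. In practice the vote matrix would come from the component trees of a trained random forest evaluated on held-out samples, and a dedicated QP solver would replace the hand-written projected gradient loop.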


Keywords: Diversity-based ensembles; Classifier ensemble; Random forests; Sample weight learning; Convex quadratic programming


Funding Information

The research was partly supported by the National Natural Science Foundation of China (61473036), China Postdoctoral Science Foundation (2018M641199), and Beijing Natural Science Foundation (4194084).

Compliance with Ethical Standards

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Computer Science and Technology, School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
  2. Beijing Key Laboratory of Materials Science Knowledge Engineering, School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China
  3. Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing, China
