
Conclusive local interpretation rules for random forests


Abstract

In critical situations involving discrimination, gender inequality, economic damage, and even the possibility of casualties, machine learning models must be able to provide clear interpretations of their decisions. Otherwise, their obscure decision-making processes can lead to socioethical issues, as they interfere with people’s lives. Random forest algorithms excel in these sectors, where the ability to explain their decisions is an obvious requirement. In this paper, we present LionForests, which builds on preliminary work of ours. LionForests is a random forest-specific interpretation technique that provides rules as explanations. It applies to binary classification tasks, as well as to multi-class classification and regression tasks, and is supported by a stable theoretical background. A time and scalability analysis shows that LionForests is much faster than our preliminary work and is also applicable to large datasets. Experiments, including a comparison with state-of-the-art techniques, demonstrate the efficacy of our contribution: LionForests outperformed the other techniques in terms of precision, variance, and response time, but fell short in terms of rule length and coverage. Finally, we highlight conclusiveness, a unique property of LionForests that provides interpretation validity and distinguishes it from previous techniques.



Availability of data and material

The code and the datasets used are available in the GitHub repository: https://git.io/JYpRT.

Notes

  1. ECOA 15 U.S. Code §1691 et seq.

  2. We use scikit-learn as the core library (https://scikit-learn.org).

  3. Reduction through clustering was not used in regression because reduction through association rules almost reaches the maximum allowed local error, and the overhead of clustering does not justify the effort.

  4. https://git.io/JY0gF.


Acknowledgements

This paper is supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 825619. AI4EU Project https://www.ai4europe.eu.

Funding

This paper is supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 825619 of the Framework Programme (call: Information and Communication Technologies), the AI4EU Project. The recipients are Ioannis Mollas, Nick Bassiliades and Grigorios Tsoumakas.

Author information

Corresponding author

Correspondence to Ioannis Mollas.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Code availability

The experiments’ code is available in the GitHub repository: https://git.io/JY0gF. The code for the experiments related to the revision has also been uploaded there.

Additional information

Responsible editor: Martin Atzmueller, Johannes Fürnkranz, Tomáš Kliegr and Ute Schmid.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Deeper sensitivity analysis

In this appendix, we present a deeper sensitivity analysis, extending the analysis originally presented in Sect. 5.2.
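Throughout this appendix, FR% (feature reduction) and PR% (path reduction) denote, roughly, the percentage of features and paths removed from the initial, unreduced explanation (see Sect. 5.2 for the exact definitions). The following minimal sketch shows how such percentages could be computed for a single explanation; the function and variable names are illustrative and not part of the LionForests API.

```python
def reduction_percentages(original_features, reduced_features,
                          original_paths, reduced_paths):
    """Illustrative computation of FR% and PR% for one explanation."""
    fr = 100.0 * (1 - len(reduced_features) / len(original_features))
    pr = 100.0 * (1 - len(reduced_paths) / len(original_paths))
    return fr, pr


# Example: a rule initially covering 10 features and 100 paths,
# reduced to 6 features and 51 paths.
fr, pr = reduction_percentages(set(range(10)), set(range(6)),
                               list(range(100)), list(range(51)))
print(f"FR% = {fr:.0f}%, PR% = {pr:.0f}%")  # FR% = 40%, PR% = 49%
```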

1.1 Binary classification

Diving deeper into the sensitivity analysis, Fig. 11 presents the FR% for the parameters of the RF, while Fig. 12 refers to the parameters of LF. The parameter analysis reveals that when the RF’s max features parameter is set to 75%, the feature reduction is higher in all datasets. With regard to depth and estimators, LF achieves over 35% FR when the depth is greater than or equal to 5 and the estimators are 100 or more.
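The RF configurations swept in Figs. 11 and 13 can be set up directly in scikit-learn, the core library used in this work (footnote 2). The grid below is an assumption reconstructed from the values mentioned in the text and figure captions, not the exact experimental setup.

```python
from itertools import product
from sklearn.ensemble import RandomForestClassifier

# Assumed parameter grid, based on the values discussed in this appendix.
max_features_options = ["sqrt", "log2", 0.75, None]
max_depth_options = [1, 2, 5, 7, 10]
n_estimators_options = [10, 100, 500, 1000]

forests = []
for mf, md, ne in product(max_features_options, max_depth_options,
                          n_estimators_options):
    forests.append(RandomForestClassifier(max_features=mf, max_depth=md,
                                          n_estimators=ne, random_state=42))
# Each forest would then be trained and explained with LF to measure FR%/PR%.
```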

Fig. 11: Binary classification: analysis of FR relation to RF’s parameters. ‘sqrt’ and ‘log2’, as well as estimators 500 and 1000, have similar results, and they are grouped.

Fig. 12: Binary classification: analysis of FR relation to LF’s parameters.

In Fig. 12, we can see how the parameters of LF affect the FR% in these datasets. We observe that the two different AR (1) settings perform identically in terms of FR%. In CR (2), the FR% diverges across the different algorithms: k-medoids and SC manage to reduce the features by 20% or more in all datasets, while OPTICS could not achieve any reduction. RS (3) did not achieve any FR in Adult and Banknote, and achieved only a low FR% in Heart (Statlog). Among the three approaches, AR appears necessary to achieve a high FR%, while combining AR (1) with CR (2) slightly increases the FR% in all three datasets.
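The three clustering algorithms compared under CR (2), namely k-medoids, SC (spectral clustering) and OPTICS, are available in scikit-learn and scikit-learn-extra. A minimal sketch of applying them to paths encoded as binary feature-usage vectors is shown below; the encoding, cluster count and toy data are illustrative assumptions rather than the paper’s exact procedure.

```python
import numpy as np
from sklearn.cluster import SpectralClustering, OPTICS
from sklearn_extra.cluster import KMedoids  # pip install scikit-learn-extra

rng = np.random.default_rng(0)
# Toy stand-in: 100 decision paths, each encoded as a binary vector over
# 10 features indicating which features the path tests.
path_vectors = rng.integers(0, 2, size=(100, 10))

kmedoids_labels = KMedoids(n_clusters=5, random_state=0).fit_predict(path_vectors)
spectral_labels = SpectralClustering(n_clusters=5, affinity="nearest_neighbors",
                                     random_state=0).fit_predict(path_vectors)
optics_labels = OPTICS(min_samples=5).fit_predict(path_vectors)  # -1 marks noise

print(np.unique(kmedoids_labels), np.unique(spectral_labels), np.unique(optics_labels))
```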

Fig. 13: Binary classification: analysis of PR relation to RF’s parameters. Depth 7 and 10, as well as estimators 500 and 1000, are grouped.

Fig. 14: Binary classification: analysis of PR relation to LF’s parameters.

Regarding the PR analysis, Fig. 13 reveals that when the RF’s max features parameter is set to ‘None’, the path reduction is higher in all datasets. Regarding depth and estimators, LF achieves over 40% PR when the depth is greater than or equal to 5 and the estimators are 100 or more.

In Fig. 14, we can see how the parameters of LF affect the PR% in these datasets. In contrast to FR, for PR both CR (2) and RS (3) maximise the PR%. Recall that in a binary setup we cannot reduce the paths below a quorum, so these techniques, achieving 49% PR, perform optimally. AR (1), on the other hand, does not seem able to reduce paths optimally. Finally, we observe that when all three techniques are combined (123), the PR% is higher than 40% for every parameter setting.
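The quorum bound mentioned above can be made concrete: in a binary task the explanation must retain at least a majority of the paths voting for the predicted class, so the attainable PR% is capped just below 50%. A small illustrative calculation, assuming a simple majority-vote quorum:

```python
def max_path_reduction(n_estimators: int) -> float:
    """Upper bound on PR% in a binary setup under a majority-vote quorum."""
    quorum = n_estimators // 2 + 1   # smallest majority of trees/paths
    return 100.0 * (1 - quorum / n_estimators)

for n in (100, 500, 1000):
    print(n, f"{max_path_reduction(n):.1f}%")  # 49.0%, 49.8%, 49.9%
```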

Fig. 15: Multi-class classification: analysis of FR relation to RF’s parameters. ‘sqrt’ and ‘log2’, as well as estimators 500 and 1000, have similar results, and they are grouped.

Fig. 16: Multi-class classification: analysis of FR relation to LF’s parameters.

1.2 Multi-class classification

The tuning of the RF’s parameters and their impact on the FR% are visible in Fig. 15. The analysis reveals that when the RF’s max features parameter is set to 75%, the FR is higher in all datasets. LF achieves over 17% FR when the depth is greater than or equal to 5 and the estimators are 100 or more, while it achieves more than 25% and 34% for the individual datasets Abalone and I. Segmentation, respectively.

Figure 16 presents how the FR% is affected by the different parameters of LF. As in the binary classification sensitivity analysis, it is evident here as well that AR (1) performs equally in terms of FR%. CR (2) achieved a higher FR than AR in the Abalone dataset, and when combined (123) with the other techniques, AR and RS, the FR% does not increase further. The analysis of the Glass dataset revealed that, beyond the FR achieved by AR (1), no other method or combination managed to increase the FR%. Finally, on I. Segmentation the combination of AR and CR (12), specifically with SC, provided the highest FR%. Another interesting point is that RS (3) managed to reduce the features of the rules in all three datasets, in contrast to its performance in the binary classification sensitivity analysis.

Observing the PR while tuning the RF’s parameters in these datasets (Fig. 17), we can say that the max features parameter does not affect the PR%. We cannot conclude the same for depth and estimators, where we need a depth of 5 or higher and 100 or more estimators, respectively, to achieve a higher PR%. The highest PR% is achieved when the depth equals 10 and the estimators equal 1000.

Fig. 17: Multi-class classification: analysis of PR relation to RF’s parameters. Depth 7 and 10, as well as estimators 500 and 1000, are grouped.

Fig. 18: Multi-class classification: analysis of PR relation to LF’s parameters.

In Fig. 18, we can see how the parameters of LF affect the PR% in these datasets. RS (3) maximises the PR%. AR (1) does not seem to achieve the desired PR results, while CR (2) performs well, but not as well as RS (3). Thus, RS, or any combination involving RS, leads to a PR% of 38% or more.

1.3 Regression

Fig. 19: Regression: analysis of FR relation to RF’s parameters. ‘sqrt’ and ‘log2’, as well as estimators 500 and 1000, have similar results, and they are grouped.

In Fig. 19, the relation of the RF’s parameters to the FR% is visible. The most influential parameter is the number of estimators: when the estimators are 500 or more and the depth is either 1 or 5, the reduction ranges between \(35\%\) and \(51\%\). Moreover, for Boston and Wine we observe that when max features is set to either ‘sqrt’ or ‘log2’, the FR% is higher. On the other hand, higher max features values such as ‘0.75’ or ‘None’ seem to favour the FR% for Abalone.

Fig. 20: Regression: analysis of FR relation to LF’s parameters.

Fig. 21: Regression: analysis of PR relation to LF’s parameters.

Inspecting how LF’s parameters affect the FR% in Fig. 20, we can see that the AR+RS method provides better results for Abalone, while DSi does so for Boston and Wine. However, DSo cannot reach desirable levels of FR% in any case.

Fig. 22: Regression: analysis of PR relation to RF’s parameters. ‘sqrt’ and ‘log2’, as well as estimators 500 and 1000, have similar results, and they are grouped.

The same pattern we identified for the relation of FR% to the RF’s parameters is apparent for the relation of PR% to the RF’s parameters as well (Fig. 22). Setting the estimators to 500 or 1000 and the depth to either 1 or 5, the PR ranges between \(50\%\) and \(85\%\). However, max features does not affect the PR%.

In Fig. 21, we can see how the parameters of LF affect the PR% in these datasets. We observe the highest PR%, over 50%, with the DSi reduction method of LF. DSo is also better than AR+RS in terms of PR% (Fig. 22).

Fig. 23: Regression: analysis of FR relation to \(local\_error\).

Examining the relation of \(local\_error\) to the FR% (Fig. 23), we can say that for the Wine dataset we can achieve a high FR%, over 50%, with a low \(local\_error\) of around 0.36. For the Abalone dataset, we need a \(local\_error\) between 1.1 and 1.4 in order to achieve approximately 35% FR. Finally, for Boston, in order to achieve an FR% higher than 40%, we need a \(local\_error\) of around 2.2.

Fig. 24: Regression: analysis of PR relation to \(local\_error\).

Finally, regarding the relation of PR% to the \(local\_error\), we observe in Fig. 24 that we obtain a higher PR% when we allow a higher \(local\_error\), in every dataset. To help the reader better understand the relation of both the FR% and the PR% to the \(local\_error\), we present the target variable statistics of each dataset in Table 12. These statistics associate the \(local\_error\) with the actual values of the target variable of each dataset (Fig. 25).
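The statistics in Table 12 can be reproduced directly from the datasets, which helps place the \(local\_error\) values above on the scale of each target variable. A minimal sketch with pandas follows; the file paths and target column name are placeholders, not the repository’s actual layout.

```python
import pandas as pd

# Placeholder file names and target column; adjust to the actual datasets.
datasets = {"Boston": "boston.csv", "Abalone": "abalone.csv", "Wine": "winequality-red.csv"}

for name, path in datasets.items():
    target = pd.read_csv(path)["target"]  # placeholder target column name
    stats = target.agg(["min", "max", "mean", "std"])
    # A local_error of e.g. 2.2 (Boston) can then be read relative to the
    # target's standard deviation and value range.
    print(name, stats.round(2).to_dict())
```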

Deeper time and scalability analysis

In this appendix, we present a deeper analysis of runtime performance and scalability, extending the analysis originally presented in Sect. 5.3.

Table 12: Statistics of the target variable of the regression task’s datasets.

Fig. 25: Comparison of preliminary LF and LF on features’ ranges generation without reduction (y-axis in seconds).

Fig. 26: Comparison of preliminary LF and LF on features’ ranges generation without reduction (y-axis in seconds, zoomed).

In Fig. 26 we zoom in on the y-axis to make visible that LF runs in approximately 0.2 to 0.6 s per explanation, in contrast to the preliminary version, which generates explanations in 0.2 to almost 80 s (Fig. 27).
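The per-explanation runtimes compared in Figs. 25–28 can be measured with a simple wall-clock loop such as the sketch below; the `explainer.explain(instance)` call is a hypothetical stand-in rather than the exact LionForests API.

```python
import time
import numpy as np

def time_explanations(explainer, instances):
    """Return mean/min/max wall-clock seconds per explanation (illustrative)."""
    durations = []
    for instance in instances:
        start = time.perf_counter()
        explainer.explain(instance)  # hypothetical explanation call
        durations.append(time.perf_counter() - start)
    return np.mean(durations), np.min(durations), np.max(durations)

# Example usage (assuming an explainer object and a test set are available):
# mean_s, min_s, max_s = time_explanations(lf_explainer, X_test[:100])
```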

Fig. 27: Comparison of preliminary LF and LF on features’ ranges generation with reduction (y-axis in seconds).

Fig. 28: Comparison of preliminary LF and LF on features’ ranges generation with reduction (y-axis in seconds, zoomed).

In Fig. 28 we zoom in on the y-axis to make visible that this version of LF runs in approximately 0.2 to 11 s per explanation, in contrast to the preliminary version, which generates explanations in 2 to 128 s, and even over 280 s in a few extreme cases.

Fig. 29: Binary classification: analysis of runtime performance of LF for different number of features and different parameters for RF.

As is visible from Fig. 29, the worst performance in the binary setup occurred when we used a dataset with 1000 features, 1000 estimators, and a depth of 10, reaching over 1 minute per explanation. An explanation takes 4 s in a typical configuration with 1000 features, 500 estimators, and a depth of 5, while 100 features, 1000 estimators, and a depth of 2 produce an explanation in half a second.

Fig. 30: Multi-class classification: analysis of runtime performance of LF for different number of features and different parameters for RF.

In the multi-class experiments (Fig. 30), the worst performance was with 1000 features, 1000 estimators, and a depth of 10, with either 10 or 100 classes, reaching 64 s, just over one minute. In a common configuration with 1000 features, 500 estimators, and a depth of 5, an explanation takes 4.5 s. An explanation takes 0.8 s to generate using 100 features, 1000 estimators, and a depth of 2.

Fig. 31: Regression: analysis of runtime performance of LF for different number of features and different parameters for RF.

In the regression experiments (Fig. 31), the worst performance was with 10 features, 1000 estimators, and a depth of 10, reaching almost 1 s. In a common configuration with 1000 features, 500 estimators, and a depth of 5, an explanation takes 0.64 s. An explanation takes 0.48 s to generate using 100 features, 1000 estimators, and a depth of 2.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Mollas, I., Bassiliades, N. & Tsoumakas, G. Conclusive local interpretation rules for random forests. Data Min Knowl Disc 36, 1521–1574 (2022). https://doi.org/10.1007/s10618-022-00839-y

