Abstract
In this paper we propose a new algorithm, named NICE, to generate counterfactual explanations for tabular data that specifically takes into account algorithmic requirements that often emerge in real-life deployments: (1) the ability to provide an explanation for all predictions, (2) being able to handle any classification model (also non-differentiable ones), (3) being efficient in run time, and (4) providing multiple counterfactual explanations with different characteristics. More specifically, our approach exploits information from a nearest unlike neighbor to speed up the search process, by iteratively introducing feature values from this neighbor into the instance to be explained. We propose four versions of NICE: one without optimization and three that optimize the explanations for one of the following properties: sparsity, proximity or plausibility. An extensive empirical comparison on 40 datasets shows that our algorithm outperforms the current state-of-the-art in terms of these criteria. Our analyses show a trade-off between plausibility on the one hand and proximity or sparsity on the other, with our different optimization methods offering users the choice to select the types of counterfactuals that they prefer. An open-source implementation of NICE can be found at https://github.com/ADMAntwerp/NICE.
Notes
Pronounced as “Set See”.
This name comes from the GitHub repository of Van Looveren and Klaise (2021), in which CFproto is the name of the .py file that contains their algorithm.
See Appendix A.2 for more details about the hyperparameter tuning.
All metrics are compared over the same number of observations and algorithms, causing the critical difference to always be 0.151.
Whereas these and other tables contain a summary of our results across the different datasets of Table 4, a detailed overview of all results per dataset can be found in the online appendix (https://github.com/ADMAntwerp/NICE_experiments).
The speed of CFproto and DiCE could be improved if access was given to the gradients of the ANN and RF (Van Looveren and Klaise 2021). But to level the playing field we used the model-agnostic version of both these algorithms in all our experiments, since all other algorithms are model-agnostic.
See online appendix on GitHub: https://github.com/ADMAntwerp/NICE_experiments.
Recall that as part of NICE we only classify an instance x according to the trained model f but do not retrain the model itself. The latter happens offline.
References
Barocas S, Selbst AD, Raghavan M (2020). The hidden assumptions behind counterfactual explanations and principal reasons. In: Proceedings of the 2020 conference on fairness, accountability, and transparency, pp 80–89
Barredo Arrieta A, Díaz-Rodríguez N, Del Ser J, Bennetot A, Tabik S, Barbado A, Garcia S, Gil-Lopez S, Molina D, Benjamins R, Chatila R, Herrera F (2020) Explainable artificial intelligence (XAI): concepts, taxonomies, opportunities and challenges toward responsible AI. Inf Fusion 58:82–115
Byrne RMJ (2019) Counterfactuals in explainable artificial intelligence (XAI): evidence from human reasoning. In: Proceedings of the twenty-eighth international joint conference on artificial intelligence, IJCAI-19. International Joint Conferences on Artificial Intelligence Organization, pp 6276–6282
Callahan A, Shah NH (2017) Chapter 19—machine learning in healthcare. In: Sheikh A, Cresswell KM, Wright A, Bates DW (eds) Key advances in clinical informatics. Academic Press, Cambridge, pp 279–291
Chen C, Li O, Tao C, Barnett AJ, Su J, Rudin C (2019) This looks like that: deep learning for interpretable image recognition. Curran Associates Inc., Red Hook
Cormen T, Leiserson C, Rivest R, Stein C (2009) Introduction to algorithms, 3rd edn. The MIT Press, Cambridge
Dandl S, Molnar C, Binder M, Bischl B (2020) Multi-objective counterfactual explanations. In: International conference on parallel problem solving from nature. Springer, pp 448–469
de Oliveira RMB, Martens D (2021) A framework and benchmarking study for counterfactual generating methods on tabular data. Appl Sci 11(16):7274
Delaney E, Greene D, Keane MT (2020) Instance-based counterfactual explanations for time series classification. arXiv:2009.13211
Delaney E, Greene D, Keane MT (2021) Uncertainty estimation and out-of-distribution detection for counterfactual explanations: pitfalls and solutions. arXiv:2107.09734
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
Dhurandhar A, Chen P-Y, Luss R, Tu C-C, Ting P, Shanmugam K, Das P (2018) Explanations based on the missing: towards contrastive explanations with pertinent negatives. Adv Neural Inf Process Syst 31:592–603
Dhurandhar A, Pedapati T, Balakrishnan A, Chen P-Y, Shanmugam K, Puri R (2019) Model agnostic contrastive explanations for structured data. arXiv:1906.00117
Digiampietri LA, Roman NT, Meira LA, Filho JJ, Ferreira CD, Kondo AA, Constantino ER, Rezende RC, Brandao BC, Ribeiro HS et al (2008) Uses of artificial intelligence in the Brazilian customs fraud detection system. In: Proceedings of the 2008 international conference on digital government research, pp 181–187
Dodge J, Liao QV, Zhang Y, Bellamy RK, Dugan C (2019) Explaining models: an empirical study of how explanations impact fairness judgment. In: Proceedings of the 24th international conference on intelligent user interfaces, pp 275–285
Doshi-Velez F, Kim B (2017) Towards a rigorous science of interpretable machine learning. arXiv:1702.08608
Edwards BJ, Williams JJ, Gentner D, Lombrozo T (2019) Explanation recruits comparison in a category-learning task. Cognition 185:21–38
European Parliament (2016) Regulation (EU) 2016/679 of the European parliament and of the council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing directive 95/46/ec (general data protection regulation)
Fernández-Loría C, Provost FJ, Han X (2020) Explaining data-driven decisions made by AI systems: the counterfactual approach. arXiv:2001.07417
Förster M, Klier M, Kluge K, Sigler I (2020) Fostering human agency: a process for the design of user-centric XAI systems. In: ICIS 2020 proceedings
Förster M, Hühn P, Klier M, Kluge K (2021) Capturing users’ reality: a novel approach to generate coherent counterfactual explanations. In: Proceedings of the 54th Hawaii international conference on system sciences, pp 1274
Friedman M (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32(200):675–701
Friedman M (1940) A comparison of alternative tests of significance for the problem of m rankings. Ann Math Stat 11(1):86–92
Fürnkranz J, Kliegr T, Paulheim H (2020) On cognitive preferences and the plausibility of rule-based models. Mach Learn 109(4):853–898
Guidotti R, Monreale A, Ruggieri S, Turini F, Giannotti F, Pedreschi D (2018) A survey of methods for explaining black box models. ACM Comput Surv 51(5):1–42
Huang Z, Dong W, Bath P, Ji L, Duan H (2015) On mining latent treatment patterns from electronic medical records. Data Min Knowl Discov 29(4):914–949
Joshi S, Koyejo O, Vijitbenjaronk W, Kim B, Ghosh J (2019) Towards realistic individual recourse and actionable explanations in black-box decision making systems. arXiv:1907.09615
Kanamori K, Takagi T, Kobayashi K, Arimura H (2020) DACE: distribution-aware counterfactual explanation by mixed-integer linear optimization. In: Bessiere C (ed) Proceedings of the twenty-ninth international joint conference on artificial intelligence, IJCAI-20. International Joint Conferences on Artificial Intelligence Organization, pp 2855–2862
Karimi A-H, Barthe G, Balle B, Valera I (2020a) Model-agnostic counterfactual explanations for consequential decisions. In: International conference on artificial intelligence and statistics. PMLR, pp 895–905
Karimi A-H, Barthe G, Schölkopf B, Valera I (2020b) A survey of algorithmic recourse: definitions, formulations, solutions, and prospects. arXiv:2010.04050
Keane MT, Smyth B (2020) Good counterfactuals and where to find them: a case-based technique for generating counterfactuals for explainable AI (XAI). In: Case-based reasoning research and development: 28th international conference, ICCBR 2020. Springer, pp 163–178
Keane M, Kenny E, Delaney E, Smyth B (2021) If only we had better counterfactual explanations: five key deficits to rectify in the evaluation of counterfactual XAI techniques. In: Proceedings of the thirtieth international joint conference on artificial intelligence, IJCAI-21, pp 4466–4474
Kim B, Wattenberg M, Gilmer J, Cai C, Wexler J, Viégas F, Sayres R (2018) Interpretability beyond feature attribution: quantitative testing with concept activation vectors (TCAV). In: Dy J, Krause A (eds) Proceedings of the 35th international conference on machine learning, volume 80 of proceedings of machine learning research. PMLR, pp 2668–2677
Kment B (2006) Counterfactuals and explanation. Mind 115(458):261–310
Kramer MA (1991) Nonlinear principal component analysis using autoassociative neural networks. AIChE J 37(2):233–243
Langer M, Oster D, Speith T, Hermanns H, Kästner L, Schmidt E, Sesing A, Baum K (2021) What do we want from explainable artificial intelligence (XAI)? A stakeholder perspective on XAI and a conceptual model guiding interdisciplinary XAI research. Artif Intell 296:103473
Lapuschkin S, Wäldchen S, Binder A, Montavon G, Samek W, Müller K-R (2019) Unmasking clever Hans predictors and assessing what machines really learn. Nat Commun 10:1–8
Laugel T, Lesot M-J, Marsala C, Renard X, Detyniecki M (2017) Inverse classification for comparison-based interpretability in machine learning. arXiv:1712.08443
Laugel T, Lesot M-J, Marsala C, Renard X, Detyniecki M (2018) Comparison-based inverse classification for interpretability in machine learning. In: International conference on information processing and management of uncertainty in knowledge-based systems. Springer, pp 100–111
Lessmann S, Baesens B, Seow H-V, Thomas LC (2015) Benchmarking state-of-the-art classification algorithms for credit scoring: an update of research. Eur J Oper Res 247(1):124–136
Lewis D (2013) Counterfactuals. Wiley, Hoboken
Lundberg SM, Lee S-I (2017) A unified approach to interpreting model predictions. In: Proceedings of the 31st international conference on neural information processing systems, NIPS’17. Curran Associates Inc, Red Hook, NY, USA, pp 4768–4777
Mahajan D, Tan C, Sharma A (2019) Preserving causal constraints in counterfactual explanations for machine learning classifiers. arXiv:1912.03277
Martens D, Provost F (2014) Explaining data-driven document classifications. MIS Q 38(1):73–100
Medin DL, Wattenmaker WD, Hampson SE (1987) Family resemblance, conceptual cohesiveness, and category construction. Cogn Psychol 19(2):242–279
Miller GA (1956) The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychol Rev 63(2):81
Miller T (2019) Explanation in artificial intelligence: insights from the social sciences. Artif Intell 267:1–38
Molnar C (2022) Interpretable machine learning: a guide for making black box models explainable (2nd ed.). https://christophm.github.io/interpretable-ml-book
Mothilal RK, Sharma A, Tan C (2020) Explaining machine learning classifiers through diverse counterfactual explanations. In: Proceedings of the 2020 conference on fairness, accountability, and transparency, pp 607–617
Mothilal RK, Mahajan D, Tan C, Sharma A (2021) Towards unifying feature attribution and counterfactual explanations: different means to the same end. Association for Computing Machinery, New York, pp 652–663
Nemenyi P (1962) Distribution-free multiple comparisons. In: Biometrics, vol 18. International Biometric Soc, Washington, DC, p 263
Ngai EW, Hu Y, Wong YH, Chen Y, Sun X (2011) The application of data mining techniques in financial fraud detection: a classification framework and an academic review of literature. Decis Support Syst 50(3):559–569
Nugent C, Cunningham P (2005) A case-based explanation system for black-box systems. Artif Intell Rev 24(2):163–178
Nugent C, Doyle D, Cunningham P (2009) Gaining insight through case-based explanation. J Intell Inf Syst 32:267–295
Olson RS, La Cava W, Orzechowski P, Urbanowicz RJ, Moore JH (2017) PMLB: a large benchmark suite for machine learning evaluation and comparison. BioData Min 10(1):1–13
Pawelczyk M, Broelemann K, Kasneci G (2020) On counterfactual explanations under predictive multiplicity. In: Conference on uncertainty in artificial intelligence. PMLR, pp 809–818
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830
Ramon Y, Martens D, Provost F, Evgeniou T (2020) A comparison of instance-level counterfactual explanation algorithms for behavioral and textual data: SEDC, LIME-C and SHAP-C. Adv Data Anal Classif 14:801–819
Ramon Y, Vermeire T, Toubia O, Martens D, Evgeniou T (2021) Understanding consumer preferences for explanations generated by XAI algorithms. arXiv:2107.02624
Ribeiro MT, Singh S, Guestrin C (2016) “Why should I trust you?”: explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’16. Association for Computing Machinery, New York, NY, USA, pp 1135–1144
Ribeiro MT, Singh S, Guestrin C (2018) Anchors: high-precision model-agnostic explanations. In: Proceedings of the AAAI conference on artificial intelligence, vol 32, no 1
Ruben D-H (2015) Explaining explanation. Routledge, Abingdon
Schleich M, Geng Z, Zhang Y, Suciu D (2021) GeCo: quality counterfactual explanations in real time. Proc VLDB Endow 14(9):1681–1693
Sokol K, Flach P (2020) Explainability fact sheets: a framework for systematic assessment of explainable approaches. In: Proceedings of the 2020 conference on fairness, accountability, and transparency, pp 56–67
United States Congress (1970) An act to amend the federal deposit insurance act to require insured banks to maintain certain records, to require that certain transactions in US currency be reported to the department of the treasury, and for other purposes
Van Looveren A, Klaise J (2021) Interpretable counterfactual explanations guided by prototypes. In: Joint European conference on machine learning and knowledge discovery in databases. Springer, pp 650–665
Vanhoeyveld J, Martens D, Peeters B (2020) Value-added tax fraud detection with scalable anomaly detection techniques. Appl Soft Comput 86:105895
Verma S, Dickerson J, Hines K (2020) Counterfactual explanations for machine learning: a review. arXiv:2010.10596
Vermeire T, Brughmans D, Goethals S, de Oliveira R, Martens D (2022) Explainable image classification with evidence counterfactual. Pattern Anal Appl 25:315–335
Wachter S, Mittelstadt B, Russell C (2018) Counterfactual explanations without opening the black box: automated decisions and the GDPR. Harv JL Tech 31:841
Weld DS, Bansal G (2019) The challenge of crafting intelligible intelligence. Commun ACM 62(6):70–79
Wexler J, Pushkarna M, Bolukbasi T, Wattenberg M, Viégas F, Wilson J (2019) The what-if tool: interactive probing of machine learning models. IEEE Trans Vis Comput Graph 26(1):56–65
Whitrow C, Hand DJ, Juszczak P, Weston D, Adams NM (2009) Transaction aggregation as a strategy for credit card fraud detection. Data Min Knowl Discov 18(1):30–55
Wilson DR, Martinez TR (1997) Improved heterogeneous distance functions. J Artif Intell Res 6:1–34
Acknowledgements
We would like to thank the Flemish Research Council (FWO, Grant G0G2721N) for financial support.
Responsible editor: Martin Atzmüller, Johannes Fürnkranz, Tomas Kliegr, Ute Schmid.
A Appendix
A.1 Examples of time complexity
With respect to NICE, recall that the worst case time complexity (Table 3 in Sect. 3.2) depends on both the k and m values of the dataset at hand. More specifically, consider two datasets from Table 4, namely “adult” where \(k=0.8\cdot 48,842=39,073\) and \(m=14\), and “clean2” where \(k=0.8\cdot 6598=5278\) and \(m=168\); the results are reported in Tables 12 and 13. In both tables, we repeat the worst case time complexity of each of the four variants of NICE, along with the CPU times obtained for both the ANN and RF classifiers. Furthermore, we “fill in” the complexity functions with the specific k and m values of the two datasets, which yields an order of magnitude for the number of operations.
In both tables we observe that the number of operations can become quite large (we intentionally chose two datasets with some of the largest k and m values), but that the impact on the CPU times remains limited. For example, for “adult” with the ANN, the CPU times remain below 100 milliseconds. Even with a higher value for m (the ANN for “clean2”), the CPU times are still quite small. The only noticeable increase comes from NICE(plaus), which can be attributed to the AE. For the RF classifier, the CPU times are in general larger than those for the ANN, but still below 2 s, with NICE(plaus) being the only real exception.
In summary, the CPU times remain small, even for some of these larger (in terms of k and m values) datasets. However, the frequent use of an AE for NICE(plaus) can have a negative impact, as can be seen in particular for “clean2” in Table 13 (the constant by which g(x) is multiplied is much larger than for “adult”). Combined with the RF classifier requiring considerably more CPU time than the ANN classifier,Footnote 9 we conclude that the largest CPU times occur for NICE(plaus) with the RF classifier, which can also be observed from Tables 5 and 6, but that NICE’s CPU times remain within reasonable bounds.
A.2 Hyperparameters of the classification models
For both classifiers we used the scikit-learn (Pedregosa et al. 2011) implementation: sklearn.ensemble.RandomForestClassifier for the RF and sklearn.neural_network.MLPClassifier for the ANN. A five-fold cross-validation grid search is performed over the values of Table 14, and the best performing model is selected based on the ROC AUC score. For the RF, the hyperparameter class_weight is set to “balanced” and all other hyperparameters are left at their defaults. The ANN always consists of one hidden layer, for which the number of neurons in the grid is relative to the size of the input layer (m), with a minimum of 2 neurons. For example, for the dataset “clean2” the number of input neurons is 168, which results in the following grid for the hyperparameter hidden_layer_sizes: 2, 25, 50, 76, 101, 126, 151, 176, 202, 227 and 252.
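Such a relative grid can be sketched as follows; the fractions of the input size used here (steps of 0.15, up to 1.5 times the input layer) are an assumption reverse-engineered from the “clean2” example, not the exact values of our implementation:

```python
def hidden_layer_grid(n_inputs):
    """Candidate hidden-layer sizes for the MLP grid search.

    Sizes are fractions (0.0, 0.15, ..., 1.5) of the number of input
    neurons, with a floor of 2 neurons.  The fractions are inferred
    from the "clean2" example and are illustrative only.
    """
    fractions = [0.15 * i for i in range(11)]
    return sorted({max(2, int(round(f * n_inputs))) for f in fractions})

# For "clean2" (168 input neurons) this reproduces the grid above:
print(hidden_layer_grid(168))
# → [2, 25, 50, 76, 101, 126, 151, 176, 202, 227, 252]
```

The resulting list can be passed (as one-element tuples) to the hidden_layer_sizes entry of a GridSearchCV parameter grid.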
A.3 Multi-class reward functions
To apply NICE to multi-class problems, our reward functions need a more general definition. For binary classification we assumed two classes (-1 and 1), for which a classifier f maps \(\mathbb {R}^m\) to a class score such that \(f(x) \in [-1,1]\). For multi-class classification, it is no longer possible to project the scores of our model onto such a one-dimensional interval. Therefore, we assume an m-dimensional feature space \(X \subset \mathbb {R}^m\) consisting of both categorical and numerical features, a feature vector \(x\in X\) with a corresponding label \(y \in Y = \{0,\ldots ,n\}\), and a trained classification model h that maps \(\mathbb {R}^m\) to an n-dimensional class probability vector, where \(h_i(x)\) is the probability of x belonging to class i.
There are two options to generate multi-class counterfactual explanations (Vermeire et al. 2022): one might be interested in a counterfactual explanation from a specific class, or in a counterfactual explanation from any other class. We propose the following general reward function that covers both cases.
Equation (5) can be simplified as follows because the sparsity increase in every step is equal to 1.
The definition of \(h_c\) and \(h_o\) is different depending on the type of counterfactual we are looking for. To find a valid counterfactual from a specific class, the probability of this class has to be higher than the probability of all other classes. In this case the counterfactual probability \(h_c\) is equal to the probability of this specific class c, and \(h_o\) is the maximum probability of all other class probabilities:
For the second case, where we look for a counterfactual from any class, we want any class probability to be higher than the probability of the original class. In this case we define \(h_o\) as the probability of the class for which the original instance to explain had the highest probability and \(h_c\) as the maximum probability of all other classes.
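The two definitions of \(h_c\) and \(h_o\) can be illustrated with a small sketch; the helper name and interface below are ours, for illustration only, and are not part of the NICE implementation:

```python
def counterfactual_scores(probs, orig_class, target_class=None):
    """Compute (h_c, h_o) from a class-probability vector `probs`.

    With `target_class` set, we look for a counterfactual of that
    specific class: h_c is its probability and h_o the highest
    probability among all other classes.  Without it, a change to any
    class counts: h_o is the probability of the original class and
    h_c the highest probability among the remaining classes.
    (Function name and signature are illustrative only.)
    """
    if target_class is not None:
        h_c = probs[target_class]
        h_o = max(p for i, p in enumerate(probs) if i != target_class)
    else:
        h_o = probs[orig_class]
        h_c = max(p for i, p in enumerate(probs) if i != orig_class)
    return h_c, h_o

# A candidate is a valid counterfactual when h_c > h_o.
print(counterfactual_scores([0.1, 0.6, 0.3], orig_class=1))
# → (0.3, 0.6): not yet valid, class 1 still dominates
```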
The proposed reward function in Eq. (6) reduces to our reward function (2) for binary classification. To see this, we project the probabilities of both classes onto the one-dimensional score interval \([-1,1]\) by making the following assumptions:
Replacing these in Eq. 6 results in:
which is equal to our sparsity reward function (2).
Cite this article
Brughmans, D., Leyman, P. & Martens, D. NICE: an algorithm for nearest instance counterfactual explanations. Data Min Knowl Disc (2023). https://doi.org/10.1007/s10618-023-00930-y