1 Introduction

Powerful machine learning models such as deep neural networks are inherently opaque, which has motivated the numerous explanation methods developed by the research community over the last decade [1, 2, 7, 15, 16, 20, 26, 29]. The meaning and validity of an explanation depend on the underlying principle of the explanation framework. Therefore, a trustworthy explanation framework must align intuition with mathematical rigor while maintaining maximal flexibility and applicability. We believe the Rate-Distortion Explanation (RDE) framework, first proposed by [16], then extended by [9], as well as the similar framework in [2], meets these desired qualities. In this chapter, we present the RDE framework in a revised and holistic manner. Our generalized RDE framework can be applied to any model (not just classifiers), supports in-distribution interpretability (by leveraging in-painting GANs), and admits interpretation queries (by considering suitable input signal representations).

The typical setting of a (local) explanation method is given by a pre-trained model \(\varPhi :\mathbb {R}^n\rightarrow \mathbb {R}^m\) and a data instance \(x\in \mathbb {R}^n\). The model \(\varPhi \) can correspond to either a classification task with m class labels or a regression task with m-dimensional model output. The model decision \(\varPhi (x)\) is to be explained. In the original RDE framework [16], an explanation for \(\varPhi (x)\) is a set of feature components \(S\subset \left\{ 1,\ldots ,n\right\} \) in x that are deemed relevant for the decision \(\varPhi (x)\). The core principle behind the RDE framework is that a set \(S\subset \left\{ 1,\ldots ,n\right\} \) contains all the relevant components if \(\varPhi (x)\) remains (approximately) unchanged after modifying \(x_{S^c}\), i.e., the components in x that are not deemed relevant. In other words, S contains all relevant features if they are sufficient for producing the output \(\varPhi (x)\). To convey concise explanatory information, one aims to find the minimal set \(S\subset \left\{ 1,\ldots ,n\right\} \) with all the relevant components. As demonstrated in [16] and [31], the minimal relevant set \(S\subset \left\{ 1,\ldots ,n\right\} \) cannot be found combinatorially in an efficient manner for large input sizes. A meaningful approximation can nevertheless be found by optimizing a sparse continuous mask \(s\in [0,1]^n\) that has no significant effect on the output \(\varPhi (x)\), in the sense that \(\varPhi (x)\approx \varPhi (x\odot s + (1-s)\odot v)\) should hold for appropriate perturbations \(v\in \mathbb {R}^n\), where \(\odot \) denotes the componentwise multiplication. Suppose \(d\big (\varPhi (x),\varPhi (y)\big )\) is a measure of distortion (e.g., the \(\ell _2\)-distance) between the model outputs for \(x,y\in \mathbb {R}^n\) and \(\mathcal {V}\) is a distribution over appropriate perturbations \(v\sim \mathcal {V}\). An explanation in the RDE framework can then be found as a solution mask \(s^*\) to the following minimization problem:

$$\begin{aligned} s^* := \quad \mathop {\text {arg min}}\limits _{s\in [0,1]^n} \mathop {\mathbb {E}}_{v\sim \mathcal {V}}\Bigg [d\Big (\varPhi (x),\varPhi (x\odot s + (1-s)\odot v)\Big )\Bigg ] + \lambda \Vert s\Vert _1, \end{aligned}$$

where \(\lambda >0\) is a hyperparameter controlling the sparsity of the mask.

We further generalize the RDE framework to abstract input signal representations \(x=f(h)\), where f is a data representation function with input h. The philosophy of the generalized RDE framework is that an explanation for a generic input signal \(x=f(h)\) should be a simplified version of the signal that is interpretable to humans. This is achieved by demanding sparsity in a suitable representation system h, which ideally provides an optimal representation of the class of explanations that are desirable for the underlying domain and interpretation query. This philosophy underpins our experiments on image classification in the wavelet domain, on audio signal classification in the Fourier domain, and on radio map estimation in an urban environment domain. Therein, we demonstrate the versatility of our generalized RDE framework.

2 Related Works

To our knowledge, the explanation principle of optimizing a mask \(s\in [0,1]^n\) was first proposed in [7]. Fong et al. [7] explained image classification decisions by considering one of two “deletion games”: (1) optimizing for the smallest deletion mask that causes the class score to drop significantly or (2) optimizing for the largest deletion mask that has no significant effect on the class score. The original RDE approach [16] is based on the second deletion game and connects the deletion principle to rate-distortion theory, which studies lossy data compression. Deleted entries in [7] were replaced with constants, noise, or blurring, while deleted entries in [16] were replaced with noise.

Explanation methods introduced before the “deletion games” principle from [7] were typically based upon gradients [26, 29], propagation of activations in neurons [1, 25], surrogate models [20], or game theory [15]. Gradient-based methods such as smoothgrad [26] lack a principled notion of relevance beyond local sensitivity. Reference-based methods such as Integrated Gradients [29] and DeepLIFT [25] depend on a reference value, for which there is no clear optimal choice. DeepLIFT and LRP assign relevance by propagating neuron activations, which makes them dependent on the implementation of \(\varPhi \). LIME [20] uses an interpretable surrogate model that approximates \(\varPhi \) in a neighborhood around x. Surrogate model explanations are inherently limited for complex models \(\varPhi \) (such as image classifiers) as they only admit very local approximations. Generally, explanations that only depend on the model behavior on a small neighborhood \(U_x\) of x offer limited insight. Lastly, Shapley-value-based explanations [15] are grounded in Shapley values from cooperative game theory. They assign relevance scores as weighted averages of marginal contributions of the respective features. Though Shapley values are mathematically well-founded, the relevance scores cannot be computed exactly for common input sizes such as \(n\ge 50\), since one exact relevance score generally requires \(O(2^n)\) evaluations of \(\varPhi \) [30].

A notable difference between the RDE method and additive feature explanations [15] is that the values in the mask \(s^*\) do not add up to the model output. The additive property as in [15] takes the view that features individually contribute to the model output and relevance should be reflected by their contributions. We emphasize that the RDE method is designed to look for a set of relevant features and not an estimate of individual relative contributions. This is particularly desirable when only groups of features are interpretable, as for example in image classification tasks, where individual pixels do not carry any interpretable meaning. Similarly to Shapley values, the explanation in the RDE framework cannot be computed exactly, as it requires solving a non-convex minimization problem. However, the RDE method can take full advantage of modern optimization techniques. Furthermore, the RDE method is a model-agnostic explanation technique, with a mathematically principled and intuitive notion of relevance as well as enough flexibility to incorporate the model behavior on meaningful input regions of \(\varPhi \).

The meaning of an explanation based on deletion masks \(s\in [0,1]^n\) depends on the nature of the perturbations that replace the deleted regions. Random [7, 16] or blurred [7] replacements \(v\in \mathbb {R}^n\) may result in a data point \(x\odot s + (1-s)\odot v\) that falls out of the natural data manifold on which \(\varPhi \) was trained. This is a subtle though important problem, since such an explanation may depend on evaluations of \(\varPhi \) on data points from undeveloped decision regions. The latter motivates in-distribution interpretability, which considers meaningful perturbations that keep \(x\odot s + (1-s)\odot v\) in the data manifold. The work [2] was the first to suggest using an in-painting GAN to generate meaningful perturbations for the “deletion games”. The authors of [9] then applied in-distribution interpretability to the RDE method in the challenging modalities of music and physical simulations of urban environments. Moreover, they demonstrated that the RDE method in [16] can be extended to answer so-called “interpretation queries”. For example, the RDE method was applied in [9] to an instrument classifier to answer the global interpretation query “Is magnitude or phase in the signal more important for the classifier?”. Most recently, in [11], we introduced CartoonX as a novel explanation method for image classifiers, answering the interpretation query “What is the relevant piece-wise smooth part of an image?” by applying RDE in the wavelet basis of images.

3 Rate-Distortion Explanation Framework

Based on the original RDE approach from [16], in this section, we present a general formulation of the RDE framework and discuss several implementations. While [16] focuses solely on image classification with explanations in pixel representation, we will apply the RDE framework not only to more challenging domains but also to different input signal representations. Not surprisingly, the combinatorial optimization problem in the RDE framework, even in simplified form, is extremely hard to solve [16, 31]. This motivates heuristic solution strategies, which will be discussed in Subsect. 3.2.

3.1 General Formulation

It is well-known that in practice there are different ways to describe a signal \(x \in \mathbb {R}^n\). Generally speaking, x can be represented by a data representation function \(f:\prod _{i=1}^k\mathbb {R}^{d_i}\rightarrow \mathbb {R}^n\),

$$\begin{aligned} x = f(h_1, \ldots , h_k), \end{aligned}$$
(1)

for some inputs \(h_i \in \mathbb {R}^{d_i}\), \(d_i \in \mathbb {N}\), \(i \in \left\{ 1, \ldots , k\right\} \), \(k\in \mathbb {N}\). Note that we do not restrict ourselves to linear data representation functions f. To briefly illustrate the generality of this abstract representation, we consider the following examples.

Example 1 (Pixel representation)

An arbitrary (vectorized) image \(x \in \mathbb {R}^n\) can be simply represented pixelwise

$$\begin{aligned} x = \begin{bmatrix} x_1 \\ \vdots \\ x_n\end{bmatrix} = f(h_1, \ldots , h_n), \end{aligned}$$

with \(h_i := x_i\) being the individual pixel values and \(f :\mathbb {R}^n \rightarrow \mathbb {R}^n\) being the identity transform.

Due to its simplicity, this standard basis representation is a reasonable choice when explaining image classification models. However, in many other applications, one requires more sophisticated representations of the signals, such as through a possibly redundant dictionary.

Example 2

Let \(\left\{ \psi _j\right\} _{j =1}^k\), \(k \in \mathbb {N}\), be a dictionary in \(\mathbb {R}^n\), e.g., a basis. A signal \(x \in \mathbb {R}^n\) is represented as

$$\begin{aligned} x = \sum _{j =1}^k h_j \psi _j, \end{aligned}$$

where \(h_j\in \mathbb {R}\), \(j \in \left\{ 1, \ldots , k\right\} \), are appropriate coefficients. In terms of the abstract representation (1), we have \(d_j = 1\) for \(j \in \left\{ 1, \ldots , k\right\} \) and f is the function that yields the weighted sum over \(\psi _j\). Note that Example 1 can be seen as a special case of this representation.

The following gives an example of a non-linear representation function f.

Example 3

Consider the discrete inverse Fourier transform, defined as

$$\begin{aligned}&f: \prod _{j=1}^{n}\mathbb {R}_+ \times \prod _{j=1}^n[0,2\pi ] \rightarrow \mathbb {C}^n,\\ {}&\big [f(m_1,...,m_n,\omega _1,...,\omega _n)\big ]_l:= \frac{1}{n} \sum _{j=1}^{n} \underbrace{m_je^{i\omega _j}}_{:= c_j\in \mathbb {C}}e^{i2\pi l(j-1)/n}, \; l \in \left\{ 1, \ldots , n\right\} , \end{aligned}$$

where \(m_j\) and \(\omega _j\) are respectively the magnitude and the phase of the j-th discrete Fourier coefficient \(c_j\). Thus every signal \(x \in \mathbb {R}^n \subseteq \mathbb {C}^n\) can be represented in terms of (1) with f being the discrete inverse Fourier transform and with \(h_{j}\), \(j=1,\ldots ,k\) (where \(k=2n\)), given by the magnitudes \(m_{j'}\) and phases \(\omega _{j'}\), \(j' = 1, \ldots , n\).
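To make the magnitude and phase representation concrete, the following minimal sketch decomposes a real-valued signal into the magnitudes and phases of its DFT coefficients and reconstructs it, with the inverse FFT playing the role of f; the toy signal and the NumPy-based setup are illustrative assumptions, not part of the framework itself.

```python
# Minimal sketch of Example 3 (assumptions: a toy real-valued 1D signal and NumPy's
# FFT convention). The magnitudes and phases of the DFT coefficients form the
# representation h, and the inverse FFT plays the role of the function f.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(128)            # toy signal, n = 128

c = np.fft.fft(x)                       # discrete Fourier coefficients c_j
m, omega = np.abs(c), np.angle(c)       # magnitudes m_j and phases omega_j, i.e., h

def f(m, omega):
    """Representation function f: reconstruct the signal from magnitudes and phases."""
    return np.fft.ifft(m * np.exp(1j * omega)).real

assert np.allclose(f(m, omega), x)      # x = f(h) up to numerical precision
```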

Further examples of dictionaries \(\left\{ \psi _j\right\} _{j=1}^k\) include the discrete wavelet [21], cosine [19] or shearlet [12] representation systems and many more. In these cases, the coefficients \(h_i\) are given by the forward transform and f is referred to as the backward transform. Note that in the above examples we have \(d_i = 1\), i.e., the input vectors \(h_i\) are real-valued. In many situations, one is also interested in representations \(x = f(h_1, \ldots , h_k)\) with \(h_i \in \mathbb {R}^{d_i}\) where \(d_i >1\).

Example 4

Let \(k=2\) and define f again as the discrete inverse Fourier transform, but as a function of two components: (1) the entire magnitude spectrum and (2) the entire phase spectrum, namely

$$\begin{aligned}&f: \mathbb {R}_+^n \times [0,2\pi ]^n \rightarrow \mathbb {C}^n,\\&\big [f(m,\omega )\big ]_l := \frac{1}{n} \sum _{j=1}^{n} \underbrace{m_j e^{i\omega _j}}_{:= c_j\in \mathbb {C}}e^{i2\pi l(j-1)/n},\; l \in \left\{ 1, \ldots , n\right\} . \end{aligned}$$

Similarly, instead of individual pixel values, one can consider patches of pixels in an image \(x \in \mathbb {R}^n\) from Example 1 as the input vectors \(h_i\) to the identity transform f. We will come back to these examples in the experiments in Sect. 4.

Finally, we would like to remark that our abstract representation

$$x = f(h_1,\ldots ,h_k)$$

also covers the cases where the signal is the output of a decoder or generative model f with inputs \(h_1, \ldots , h_k\) as the code or the latent variables.

As was discussed in previous sections, the main idea of the RDE framework is to extract the relevant features of the signal based on the optimization over its perturbations defined through masks. The ingredients of this idea are formally defined below.

Definition 1 (Obfuscations and expected distortion)

Let \(\varPhi :\mathbb {R}^n\rightarrow \mathbb {R}^m\) be a model and \(x\in \mathbb {R}^n \) a data point with a data representation \(x =f(h_1,...,h_k)\) as discussed above. For every mask \(s\in [0,1]^k\), let \(\mathcal {V}_s\) be a probability distribution over \(\prod _{i=1}^k\mathbb {R}^{d_i}\). Then the obfuscation of x with respect to s and \(\mathcal {V}_s\) is defined as the random vector

$$\begin{aligned} y := f(s\odot h + (1-s)\odot v), \end{aligned}$$

where \(v\sim \mathcal {V}_s\), \((s\odot h)_i = s_i h_i\in \mathbb {R}^{d_i}\) and \(((1-s)\odot v)_i= (1-s_i)v_i\in \mathbb {R}^{d_i}\) for \(i\in \left\{ 1, \ldots , k\right\} \). Furthermore, the expected distortion of x with respect to the mask s and the perturbation distribution \(\mathcal {V}_s\) is defined as

$$ D(x,s,\mathcal {V}_s, \varPhi ):= \mathop {\mathbb {E}}_{v\sim \mathcal {V}_s} \Bigg [ d\Big (\varPhi (x), \varPhi (y)\Big )\Bigg ], $$

where \(d:\mathbb {R}^m\times \mathbb {R}^m\rightarrow \mathbb {R}_+\) is a measure of distortion between two model outputs.
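For concreteness, the following sketch shows how the expected distortion can be estimated by Monte-Carlo sampling under a Gaussian perturbation distribution, using the squared distortion in the predicted class score (cf. Example 7 below). The names model, f, h, x, s, mu, and sigma are placeholders for the user's classifier, representation function, coefficients, input, mask, and noise statistics; we assume scalar coefficients, i.e., \(d_i=1\).

```python
# Minimal sketch of Definition 1 (assumptions: Gaussian noise perturbations, scalar
# coefficients h in R^k, squared distortion in the predicted class score).
# `model`, `f`, `h`, `x`, `s`, `mu`, `sigma` are placeholders for the user's objects.
import torch

def expected_distortion(model, f, h, x, s, mu, sigma, num_samples=64):
    """Monte-Carlo estimate of D(x, s, V_s, Phi) with V_s = N(mu, sigma^2 Id)."""
    with torch.no_grad():
        out_x = model(x.unsqueeze(0))[0]
        j_star = out_x.argmax()                     # predicted label j*
        target = out_x[j_star]
    distortions = []
    for _ in range(num_samples):
        v = mu + sigma * torch.randn_like(h)        # v ~ V_s
        y = f(s * h + (1 - s) * v)                  # obfuscation of x
        out_y = model(y.unsqueeze(0))[0]
        distortions.append((target - out_y[j_star]) ** 2)
    return torch.stack(distortions).mean()
```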

In the RDE framework, the explanation is given by a mask that minimizes distortion while remaining relatively sparse. The rate-distortion-explanation mask is defined in the following.

Definition 2 (The RDE mask)

In the setting of Definition 1 we define the RDE mask as a solution \(s^*(\ell )\) to the minimization problem

$$\begin{aligned} \min _{s\in \{0,1\}^k} \quad D(x,s,\mathcal {V}_s, \varPhi ) \quad \text { s.t. } \quad \left\| s \right\| _0 \le \ell , \end{aligned}$$
(2)

where \(\ell \in \left\{ 1, \ldots , k\right\} \) is the desired level of sparsity.

Here, the RDE mask is defined as the binary mask that minimizes the expected distortion while keeping the sparsity below a certain threshold. Alternatively, one could also define the RDE mask as the sparsest binary mask that keeps the distortion below a given threshold, as done in [16]. Geometrically, one can interpret the RDE mask as defining a subspace that is stable under \(\varPhi \). If \(x=f(h)\) is the input signal and s is the RDE mask for \(\varPhi (x)\) on the coefficients h, then the associated subspace \(R_\varPhi (s)\) is defined as the space of feasible obfuscations of x with s under \(\mathcal {V}_s\), i.e.,

$$\begin{aligned} R_\varPhi (s) :=\{f(s\odot h + (1-s)\odot v)\;|\;v\in \text {supp}\mathcal {V}_s \}, \end{aligned}$$

where \(\text {supp}\mathcal {V}_s\) denotes the support of the distribution \(\mathcal {V}_s\). The model \(\varPhi \) acts similarly on signals in \(R_\varPhi (s)\) due to the low expected distortion \( D(x,s,\mathcal {V}_s, \varPhi )\), which makes the subspace stable under \(\varPhi \). Note that RDE directly optimizes towards a subspace that is stable under \(\varPhi \). If, instead, one were to choose the mask s based on information from the gradient \(\nabla \varPhi (x)\) and the Hessian \(\nabla ^2\varPhi (x)\), then only a local neighborhood around x would tend to be stable under \(\varPhi \), due to the local nature of the gradient and Hessian. Before discussing practical algorithms to approximate the RDE mask in Subsect. 3.2, we review frequently used obfuscation strategies, i.e., choices of the distribution \(\mathcal {V}_s\), and measures of distortion.

3.1.1 Obfuscation Strategies and in-Distribution Interpretability

The meaning of an explanation in RDE depends greatly on the nature of the perturbations \(v\sim \mathcal {V}_s\). A particular choice of \(\mathcal {V}_s\) defines an obfuscation strategy. Obfuscations are either in-distribution, i.e., the obfuscation \( f(s\odot h + (1-s)\odot v) \) lies on the natural data manifold that \(\varPhi \) was trained on, or out-of-distribution otherwise. Out-of-distribution obfuscations pose the following problem. The RDE mask (see Definition 2) depends on evaluations of \(\varPhi \) on obfuscations \(f(s\odot h + (1-s)\odot v)\). If \(f(s\odot h + (1-s)\odot v)\) is not on the natural data manifold that \(\varPhi \) was trained on, then it may lie in undeveloped regions of \(\varPhi \). In practice, we are interested in explaining the behavior of \(\varPhi \) on realistic data, and an explanation can be corrupted if \(\varPhi \) did not develop the region of out-of-distribution points \(f(s\odot h + (1-s)\odot v)\). One can guard against this by choosing \(\mathcal {V}_s\) so that \(f(s\odot h + (1-s)\odot v)\) is in-distribution. Choosing \(\mathcal {V}_s\) in-distribution boils down to modeling the conditional data distribution, which is a non-trivial task.

Example 5 (In-distribution obfuscation strategy)

In light of the recent success of generative adversarial networks (GANs) in generative modeling [8], one can train an in-painting GAN [32]

$$ G(h,s,z)\in \prod _{i=1}^k \mathbb {R}^{d_i}, $$

where z are random latent variables of the GAN, such that the obfuscation \(f\big (s\odot h + (1-s)\odot G(h,s,z) \big )\) lies on the natural data manifold (see also [2]). In other words, one can choose \(\mathcal {V}_s\) as the distribution of \(v:= G(h,s,z)\), where the randomness comes from the random latent variables z.

Example 6 (Out-of-distribution obfuscation strategies)

A very simple obfuscation strategy is Gaussian noise. In that case, one defines \(\mathcal {V}_s\) for every \(s\in [0,1]^k\) as \( \mathcal {V}_s:= \mathcal {N}(\mu ,\varSigma ), \) where \(\mu \) and \(\varSigma \) denote a pre-defined mean vector and covariance matrix. In Sect. 4.1, we give an example of a reasonable choice for \(\mu \) and \(\varSigma \) for image data. Alternatively, for images in pixel representation (see Example 1), one can replace the deleted pixels with a blurred version of the input, \(v = K*x\), where K is a suitable blur kernel.

Table 1. Common obfuscation strategies with their perturbation formulas.

We summarize common obfuscation strategies for a given target signal in Table 1.

3.1.2 Measure of Distortion

Various options exist for the measure \(d :\mathbb {R}^m \times \mathbb {R}^m \rightarrow \mathbb {R}_+\) of the distortion between model outputs. The measure of distortion should be chosen according to the task of the model \(\varPhi :\mathbb {R}^n\rightarrow \mathbb {R}^m\) and the objective of the explanation.

Example 7 (Measure of distortion for classification task)

Consider a classification model \(\varPhi :\mathbb {R}^n\rightarrow \mathbb {R}^m\) and a target input signal \(x \in \mathbb {R}^n\). The model \(\varPhi \) assigns to each class \(j\in \left\{ 1, \ldots , m\right\} \) a (pre-softmax) score \(\varPhi _j(x)\) and the predicted label is given by \(j^*:= \mathop {\mathop {\text {arg max}}\limits }\nolimits _{j \in \left\{ 1, \ldots , m\right\} } \varPhi _j(x)\). One commonly used measure of the distortion between the outputs at x and another data point \(y\in \mathbb {R}^n\) is given as

$$\begin{aligned} d_1\big (\varPhi (x),\varPhi (y) \big ) := \big (\varPhi _{j^*}(x)- \varPhi _{j^*}(y) \big )^2. \end{aligned}$$

On the other hand, the vector \([\varPhi _j(x)]_{j =1}^m\) is usually normalized to a probability vector \([\tilde{\varPhi }_j(x)]_{j=1}^m\) by applying the softmax function, namely \(\tilde{\varPhi }_j(x) := \exp {\varPhi _j(x)}/\sum _{i = 1}^m\exp {\varPhi _i(x)}\). This, in turn, gives another measure of the distortion between \(\varPhi (x), \varPhi (y) \in \mathbb {R}^m\), namely

$$\begin{aligned} d_2\big (\varPhi (x),\varPhi (y) \big ) := \big (\tilde{\varPhi }_{j^*}(x)- \tilde{\varPhi }_{j^*}(y) \big )^2, \end{aligned}$$

where \(j^* := \mathop {\mathop {\text {arg max}}\limits }\nolimits _{j \in \left\{ 1, \ldots , m\right\} } \varPhi _j(x) = \mathop {\mathop {\text {arg max}}\limits }\nolimits _{j \in \left\{ 1, \ldots , m\right\} } \tilde{\varPhi }_j(x)\). An important property of the softmax function is the invariance under translation by a vector \([c,\ldots ,c]^\top \in \mathbb {R}^m\), where \(c\in \mathbb {R}\) is a constant. By definition, only \(d_2\) respects this invariance while \(d_1\) does not.

Example 8 (Measure of distortion for regression task)

Consider a regression model \(\varPhi :\mathbb {R}^n\rightarrow \mathbb {R}^m\) and an input signal \(x \in \mathbb {R}^n\). One can then define the measure of distortion between the outputs of x and another data point \(y\in \mathbb {R}^n\) as

$$\begin{aligned} d_3\big (\varPhi (x),\varPhi (y)\big ) := \left\| \varPhi (x)- \varPhi (y) \right\| _2^2. \end{aligned}$$

Sometimes it is reasonable to consider a certain subset of components \(J \subseteq \left\{ 1,\ldots ,m\right\} \) of the output vectors instead of all m entries. Denoting the vector formed by corresponding entries by \(\varPhi _J(x)\), the measure of distortion between the outputs can be defined as

$$\begin{aligned} d_4\big (\varPhi (x),\varPhi (y)\big ) := \left\| \varPhi _J(x)- \varPhi _J(y) \right\| _2^2. \end{aligned}$$

The measure \(d_4\) will be used in our experiments for radio maps in Subsect. 4.3.
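The following sketch collects the measures \(d_1\), \(d_2\), and \(d_4\) in code; the inputs phi_x and phi_y are assumed to be the (pre-softmax) output vectors \(\varPhi (x)\) and \(\varPhi (y)\), and J is an index tensor selecting the output components of interest.

```python
# Minimal sketch of the distortion measures from Examples 7 and 8 (assumption:
# `phi_x` and `phi_y` are pre-softmax output vectors of shape [m]).
import torch
import torch.nn.functional as F

def d1(phi_x, phi_y):
    """Squared distortion in the pre-softmax score of the predicted label."""
    j_star = phi_x.argmax()
    return (phi_x[j_star] - phi_y[j_star]) ** 2

def d2(phi_x, phi_y):
    """Squared distortion in the post-softmax probability of the predicted label."""
    j_star = phi_x.argmax()
    return (F.softmax(phi_x, dim=0)[j_star] - F.softmax(phi_y, dim=0)[j_star]) ** 2

def d4(phi_x, phi_y, J):
    """Squared l2-distortion restricted to the output components J (regression)."""
    return ((phi_x[J] - phi_y[J]) ** 2).sum()
```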

3.2 Implementation

The RDE mask from Definition 2 was defined as a solution to

$$\begin{aligned} \min _{s\in \{0,1\}^k} \quad D(x,s,\mathcal {V}_s, \varPhi ) \quad \text { s.t. } \quad \left\| s \right\| _0 \le \ell . \end{aligned}$$

In practice, we need to relax this problem. We offer the following three approaches.

3.2.1 \(\ell _1\)-relaxation with Lagrange Multiplier

The RDE mask can be approximately computed by finding an approximate solution to the following relaxed minimization problem:

$$\begin{aligned} \min _{s\in [0,1]^k} \quad D(x,s,\mathcal {V}_s, \varPhi ) + \lambda \left\| s \right\| _1, \end{aligned}$$
(\(\mathcal {P}_{1}\))

where \(\lambda >0\) is a hyperparameter for the sparsity level. Note that the optimization problem is not necessarily convex, thus the solution might not be unique.

The expected distortion \(D(x,s,\mathcal {V}_s, \varPhi )\) can typically be approximated with simple Monte-Carlo estimates, i.e., by averaging i.i.d. samples from \(\mathcal {V}_s\). After estimating \(D(x,s,\mathcal {V}_s, \varPhi )\), one can optimize the mask s with stochastic gradient descent (SGD) to solve the optimization problem (\(\mathcal {P}_{1}\)).
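A minimal sketch of this procedure is given below; it assumes the Monte-Carlo estimator expected_distortion sketched in Sect. 3.1 above and scalar coefficients, and the hyperparameter values are placeholders rather than the settings of any particular experiment.

```python
# Minimal sketch of solving (P_1) with Adam (assumptions: `expected_distortion` as
# sketched in Sect. 3.1, scalar coefficients h in R^k; all names are placeholders).
import torch

def rde_l1(model, f, h, x, mu, sigma, lam=0.6, steps=2000, lr=0.003):
    s = torch.ones_like(h, requires_grad=True)       # mask initialized with ones
    optimizer = torch.optim.Adam([s], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = expected_distortion(model, f, h, x, s, mu, sigma) + lam * s.abs().sum()
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            s.clamp_(0.0, 1.0)                       # keep the mask in [0, 1]^k
    return s.detach()
```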

3.2.2 Bernoulli Relaxation

By viewing the binary mask as Bernoulli random variables \(s\sim \text {Ber}(\theta )\) and optimizing over the parameter \(\theta \in [0,1]^k\), one can guarantee that the expected distortion \(D(x,s,\mathcal {V}_s, \varPhi )\) is evaluated on binary masks \(s\in \{0,1\}^k\). To encourage sparsity of the resulting mask, one can still apply \(\ell _1\)-regularization on s, giving rise to the following optimization problem:

$$\begin{aligned} \min _{\theta \in [0,1]^k} \quad \mathop {\mathbb {E}}_{s\sim \text {Ber}(\theta )}\Big [ D(x,s,\mathcal {V}_s, \varPhi ) + \lambda \left\| s \right\| _1 \Big ]. \end{aligned}$$
(\(\mathcal {P}_{2}\))

Optimizing the parameter \(\theta \) requires a continuous relaxation to apply SGD. This can be done using the concrete distribution [17], which samples s from a continuous relaxation of the Bernoulli distribution.
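A minimal sketch of this relaxation is given below; torch.distributions.RelaxedBernoulli implements the concrete distribution, and all other names and hyperparameters are the placeholders from the sketches above.

```python
# Minimal sketch of solving (P_2) via the concrete distribution (assumptions:
# `expected_distortion` as sketched above; hyperparameter values are placeholders).
import torch
from torch.distributions import RelaxedBernoulli

def rde_bernoulli(model, f, h, x, mu, sigma, lam=1.0, steps=10000, lr=1e-4, temp=0.1):
    theta = torch.full_like(h, 0.5, requires_grad=True)   # Bernoulli parameters
    optimizer = torch.optim.Adam([theta], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        # reparameterized sample from the relaxed Bernoulli (concrete) distribution
        s = RelaxedBernoulli(torch.tensor(temp), probs=theta).rsample()
        loss = expected_distortion(model, f, h, x, s, mu, sigma) + lam * s.abs().sum()
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            theta.clamp_(1e-4, 1 - 1e-4)    # keep probabilities strictly inside (0, 1)
    return theta.detach()
```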

3.2.3 Matching Pursuit

As an alternative, one can also perform matching pursuit [18]. Here, the non-zero entries of \(s\in \{0,1\}^k\) are determined sequentially in a greedy fashion so as to minimize the resulting distortion in each step. More precisely, we start with a zero mask \(s^0=0\) and gradually build up the mask by updating \(s^t\) at step t according to the rule

$$\begin{aligned} s^{t+1} = s^t + \mathop {\text {arg min}}\limits _{e_j:\,s_j^t=0} \, D(x,s^{t}+e_j,\mathcal {V}_s,\varPhi ). \end{aligned}$$

Here, the minimization is taken over all standard basis vectors \(e_j \in \mathbb {R}^k\) with \(s_j^t = 0\). The algorithm terminates when reaching some desired error tolerance or after a prefixed number of iterations. While this means that in each iteration we have to test every entry of s, it is applicable when k is small or when we are only interested in very sparse masks.
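A minimal sketch of this greedy procedure is shown below; distortion(s) stands for a user-supplied Monte-Carlo estimate of \(D(x,s,\mathcal {V}_s,\varPhi )\) for a given binary mask s.

```python
# Minimal sketch of the greedy matching-pursuit variant (assumption: `distortion`
# returns a Monte-Carlo estimate of D(x, s, V_s, Phi) for a binary mask s in {0,1}^k).
import torch

def rde_matching_pursuit(distortion, k, max_nonzeros, tol=1e-3):
    def e(j):                                   # standard basis vector e_j
        v = torch.zeros(k)
        v[j] = 1.0
        return v

    s = torch.zeros(k)                          # start with the zero mask s^0 = 0
    for _ in range(max_nonzeros):
        candidates = [j for j in range(k) if s[j] == 0]
        # activate the entry that yields the smallest resulting distortion
        best_j = min(candidates, key=lambda j: float(distortion(s + e(j))))
        s = s + e(best_j)
        if float(distortion(s)) <= tol:         # stop at the desired error tolerance
            break
    return s
```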

4 Experiments

With our experiments, we demonstrate the broad applicability of the generalized RDE framework. Moreover, our experiments illustrate how the different choices of obfuscation strategies, optimization procedures, measures of distortion, and input signal representations discussed in Sect. 3 can be leveraged in practice. We explain model decisions on various challenging data modalities and tailor the input signal representation and measure of distortion to the domain and interpretation query. In Sect. 4.1, we focus on image classification, a common baseline task in the interpretability literature. In Sects. 4.2 and 4.3, we consider two other, rarely explored data modalities. Section 4.2 focuses on audio data, where the underlying task is to classify acoustic instruments based on a short audio sample of distinct notes, while in Sect. 4.3, the underlying task is a regression with data in the form of physical simulations in urban environments. We also believe our explanation framework supports applications beyond interpretability tasks. An example is given in Sect. 4.3.2, where we add an RDE-inspired regularizer to the training objective of a radio map estimation model.

4.1 Images

We begin with the most common domain in the interpretability literature: image classification. The authors of [16] previously applied RDE to image data by considering pixel-wise perturbations. We refer to this method as Pixel RDE. Other explanation methods [1,2,3, 20] have also operated exclusively in the pixel domain. In [11], we challenged this customary practice by successfully applying RDE in a wavelet basis, where sparsity translates into piece-wise smooth images (also called cartoon-like images). The novel explanation method was coined CartoonX [11] and extracts the relevant piece-wise smooth part of an image. First, we review the Pixel RDE method and present experiments on the ImageNet dataset [4], which is commonly considered a challenging classification task. Then, we present CartoonX and discuss its advantages. For all the ImageNet experiments, we use the pre-trained MobileNetV3-Small [10], which achieved a top-1 accuracy of 67.668% and a top-5 accuracy of 87.402%, as the classifier.

Fig. 1.
figure 1

Top row: original images correctly classified as (a) snail, (b) male duck, and (c) airplane. Middle row: Pixel RDEs. Bottom row: CartoonX. Notably, CartoonX is roughly piece-wise smooth and overall more interpretable than the jittery Pixel RDEs.

4.1.1 Pixel RDE

Consider the following pixel-wise representation of an RGB image \(x\in \mathbb {R}^{3\times n}\): \( f: \prod _{i=1}^n \mathbb {R}^3 \rightarrow \mathbb {R}^{3\times n},\; x = f(h_1,...,h_n), \) where \(h_i\in \mathbb {R}^3\) represents the three color channel values of the i-th pixel in the image x, i.e., \((x_{j,i})_{j=1,\ldots ,3}=h_{i}\). In Pixel RDE, a sparse mask \(s\in [0,1]^n\) with n entries (one for each pixel) is optimized to achieve low expected distortion \(D(x,s,\mathcal {V}_s, \varPhi )\). The obfuscation of an image x with the pixel mask s and a distribution \(v\sim \mathcal {V}_s\) on \(\prod _{i=1}^n \mathbb {R}^3\) is defined as \(f(s \odot h + (1-s)\odot v)\). In our experiments, we initialize the mask with ones, i.e., \(s_i = 1\) for every \(i \in \left\{ 1,\ldots , n\right\} \), and consider Gaussian noise perturbations \(\mathcal {V}_s = \mathcal {N}(\mu ,\varSigma )\). We set the noise mean \(\mu \in \mathbb {R}^{3\times n}\) as the pixel value mean of the original image x and the covariance matrix \(\varSigma :=\sigma ^2{\text {Id}}\in \mathbb {R}^{3n\times 3n}\) as a diagonal matrix with \(\sigma >0\) defined as the pixel value standard deviation of the original image x. We then optimize the pixel mask s for 2000 gradient descent steps on the \(\ell _1\)-relaxation of the RDE objective (see Sect. 3.2.1). We computed the distortion \(d\big (\varPhi (x),\varPhi (y) \big )\) in \(D(x,s,\mathcal {V}_s, \varPhi )\) in the post-softmax activation of the predicted label multiplied by a constant \(C=100\), i.e., \(d\big (\varPhi (x),\varPhi (y) \big ) := C\big (\tilde{\varPhi }_{j^*}(x)- \tilde{\varPhi }_{j^*}(y) \big )^2\).

The expected distortion \(D(x,s,\mathcal {V}_s, \varPhi )\) was approximated as a simple Monte-Carlo estimate after sampling 64 noise perturbations. For the sparsity level, we set the Lagrange multiplier to \(\lambda =0.6\). All images were resized to 256 \(\times \) 256 pixels. The mask was optimized for 2000 steps using the Adam optimizer with step size 0.003. In the middle row of Fig. 1, we show three example explanations with Pixel RDE for an image of a snail, a male duck, and an airplane, all from the ImageNet dataset. Pixel RDE highlights as relevant both the snail’s inner shell and part of its head, the lower segment of the male duck along with various lines in the water, and the airplane’s fuselage and part of its rudder.
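A minimal sketch of this Pixel RDE setup is given below; the image tensor, the mask shape, and the clipping into [0, 1] are illustrative assumptions, and the mask would be optimized exactly as described in Sect. 3.2.1.

```python
# Minimal sketch of the Pixel RDE obfuscation (assumptions: `img` is a normalized
# image tensor of shape [3, H, W]; noise statistics are taken from the image itself).
import torch

img = torch.rand(3, 256, 256)                       # placeholder input image
mu, sigma = img.mean(), img.std()                   # noise mean and std from the image
s = torch.ones(1, 256, 256, requires_grad=True)     # one mask entry per pixel

def obfuscate(img, s, num_samples=64):
    """Sample Gaussian-noise obfuscations s*x + (1-s)*v for the Monte-Carlo estimate."""
    v = mu + sigma * torch.randn(num_samples, *img.shape)   # v ~ N(mu, sigma^2 Id)
    return (s * img + (1 - s) * v).clamp(0.0, 1.0)          # mask broadcast over channels
```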

Fig. 2.
figure 2

Discrete Wavelet Transform of an image: (a) original image (b) discrete wavelet transform. The coefficients of the largest quadrant in (b) correspond to the lowest scale and coefficients of smaller quadrants gradually build up to the highest scales, which are located in the four smallest quadrants. Three nested L-shaped quadrants represent horizontal, vertical and diagonal edges at a resolution determined by the associated scale.

Fig. 3.
figure 3

CartoonX machinery: (a) image classified as park-bench, (b) discrete wavelet transform of the image, (c) final mask on the wavelet coefficients after the RDE optimization procedure, (d) obfuscation with final wavelet mask and noise, (e) final CartoonX, (f) Pixel RDE for comparison.

4.1.2 CartoonX

Formally, we represent an RGB image \(x\in [0,1]^{3\times n}\) by its wavelet coefficients \(h = \{h_i\}_{i=1}^n \in \prod _{i=1}^n\mathbb {R}^3\) with \(J \in \left\{ {1, \ldots , \lfloor \log _2 n \rfloor }\right\} \) scales as \( x = f(h) \), where f is the discrete inverse wavelet transform. Each \(h_i = (h_{i,c})_{c=1}^3\in \mathbb {R}^3\) contains three wavelet coefficients of the image, one for each color channel, and is associated with a scale \(k_i\in \left\{ 1, \ldots , J\right\} \) and a position in the image. Low scales describe high frequencies and high scales describe low frequencies at the respective image position. We briefly illustrate the wavelet coefficients in Fig. 2, which visualizes the discrete wavelet transform of an image. CartoonX [11] is a special case of the generalized RDE framework, in particular a special case of Example 2, and optimizes a sparse mask \(s\in [0,1]^n\) on the wavelet coefficients (see Fig. 3c) so that the expected distortion \(D(x,s,\mathcal {V}_s, \varPhi )\) remains small. The obfuscation of an image x with a wavelet mask s and a distribution \(v\sim \mathcal {V}_s\) on the wavelet coefficients is \(f(s \odot h + (1-s)\odot v)\). In our experiments, we used Gaussian noise perturbations and chose the standard deviation and mean adaptively for each scale: the standard deviation and mean for wavelet coefficients of scale \(j\in \left\{ 1, \ldots , J\right\} \) were chosen as the standard deviation and mean of the wavelet coefficients of scale j of the original image. Figure 3d shows the obfuscation \(f(s \odot h + (1-s)\odot v)\) with the final wavelet mask s after the RDE optimization procedure. In Pixel RDE, the mask itself is the explanation as it lies in pixel space (see middle row in Fig. 1), whereas the CartoonX mask lies in the wavelet domain. To go back to the natural image domain, we multiply the wavelet mask element-wise with the wavelet coefficients of the original greyscale image and invert this product back to pixel space with the discrete inverse wavelet transform. The inversion is finally clipped into [0, 1], as are the obfuscations during the RDE optimization, to avoid overflow (we assume here that the pixel values in x are normalized into [0, 1]). The clipped inversion in pixel space is the final CartoonX explanation (see Fig. 3e).

The following points should be kept in mind when interpreting the final CartoonX explanation, i.e., the inversion of the wavelet coefficient mask: (1) CartoonX provides the relevant piece-wise smooth part of the image. (2) The inversion of the wavelet coefficient mask was not optimized to be sparse in pixel space but in the wavelet basis. (3) A region that is black in the inversion could nevertheless be relevant if it was already black in the original image. This is due to the multiplication of the mask with the wavelet coefficients of the greyscale image before taking the discrete inverse wavelet transform. (4) Bright high-resolution regions are relevant in high resolution and bright low-resolution regions are relevant in low resolution. (5) It is inexpensive for CartoonX to mark large regions in low resolution as relevant. (6) It is expensive for CartoonX to mark large regions in high resolution as relevant.

Fig. 4.
figure 4

Scatter plot of rate-distortion in pixel basis and wavelet basis. Each point is an explanation of a distinct image in the ImageNet dataset with distortion and normalized \(\ell _1\)-norm measured for the final mask. The wavelet mask achieves lower distortion than the pixel mask, while using fewer coefficients.

In Fig. 1, we compare CartoonX to Pixel RDE. The piece-wise smooth wavelet explanations are more interpretable than the jittery Pixel RDEs. In particular, CartoonX asserts that the snail’s shell without the head suffices for the classification, unlike Pixel RDE, which insinuated that both the inner shell and part of the head are relevant. Moreover, CartoonX shows that the water gives the classifier context for the classification of the duck, which one could have only guessed from the Pixel RDE. Both Pixel RDE and CartoonX state that the head of the duck is not relevant. Lastly, CartoonX, like Pixel RDE, confirms that the wings play a subordinate role in the classification of the airplane.

4.1.3 Why Explain in the Wavelet Basis?

Wavelets provide an optimal representation for piece-wise smooth 1D functions [5] and also represent 2D piece-wise smooth images, so-called cartoon-like images [12], efficiently [21]. Indeed, sparse vectors in the wavelet coefficient space encode cartoon-like images reasonably well [27], certainly better than sparse pixel representations. Moreover, the optimization process underlying CartoonX produces sparse vectors in the wavelet coefficient space. Hence CartoonX typically generates cartoon-like images as explanations. This is the fundamental difference to Pixel RDE, which produces rough, jittery, and pixel-sparse explanations. Cartoon-like images are more interpretable and provide a natural model of simplified images. Since the goal of the RDE explanation is to generate an easily interpretable, simplified version of the input signal, we argue that CartoonX explanations are more appropriate for image classification than Pixel RDEs. Our experiments confirm that CartoonX explanations are roughly piece-wise smooth and overall more interpretable than Pixel RDEs (see Fig. 1).

4.1.4 CartoonX Implementation

Throughout our CartoonX experiments we chose the Daubechies 3 wavelet system, \(J=5\) levels of scales and zero padding for the discrete wavelet transform. For the implementation of the discrete wavelet transform, we used the Pytorch Wavelets package, which supports gradient computation in Pytorch. Distortion was computed as in the Pixel RDE experiments. The perturbations \(v\sim \mathcal {V}_s\) on the wavelet coefficients were chosen as Gaussian noise with standard deviation and mean computed adaptively per scale. As in the Pixel RDE experiments, the wavelet mask was optimized for 2000 steps with the Adam optimizer to minimize the \(\ell _1\)-relaxation of the RDE objective. We used \(\lambda =3\) for CartoonX.
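The sketch below illustrates the corresponding masking and inversion step with the DWTForward and DWTInverse modules of the Pytorch Wavelets package; the image tensor and the exact mask shapes are illustrative assumptions, and the masks would be optimized as in Sect. 3.2.1.

```python
# Minimal sketch of the CartoonX masking step (assumptions: `img` is an image batch
# of shape [1, 3, H, W]; masks are shared across color channels; per-scale Gaussian
# noise fills the unmasked coefficients).
import torch
from pytorch_wavelets import DWTForward, DWTInverse

dwt = DWTForward(J=5, wave='db3', mode='zero')      # forward discrete wavelet transform
idwt = DWTInverse(wave='db3', mode='zero')          # inverse transform, i.e., f

img = torch.rand(1, 3, 256, 256)                    # placeholder input image
yl, yh = dwt(img)                                   # lowpass and highpass coefficients h

s_l = torch.ones(1, 1, *yl.shape[-2:], requires_grad=True)
s_h = [torch.ones(1, 1, 3, *c.shape[-2:], requires_grad=True) for c in yh]

def obfuscate():
    """Mask the coefficients, fill the rest with per-scale Gaussian noise, and invert."""
    yl_m = s_l * yl + (1 - s_l) * (yl.mean() + yl.std() * torch.randn_like(yl))
    yh_m = [m * c + (1 - m) * (c.mean() + c.std() * torch.randn_like(c))
            for m, c in zip(s_h, yh)]
    return idwt((yl_m, yh_m)).clamp(0.0, 1.0)
```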

4.1.5 Efficiency of CartoonX

Finally, we compare Pixel RDE to CartoonX quantitatively by analyzing the distortion and sparsity associated with the final explanation mask. Intuitively, we expect the CartoonX method to have an efficiency advantage, since the discrete wavelet transform already encodes natural images sparsely, and hence fewer wavelet coefficients than pixel coefficients are required to represent images. Our experiments confirmed this intuition, as can be seen in the scatter plot in Fig. 4.

4.2 Audio

We consider the NSynth dataset [6], a library of short audio samples of distinct notes played on a variety of instruments. We pre-process the data by computing the power-normalized magnitude spectrum and phase information using the discrete Fourier transform on a logarithmic scale from 20 to 8000 Hertz. Each data instance is then represented by the magnitudes and phases of its Fourier coefficients, with the discrete inverse Fourier transform serving as the representation function (see Example 3).

4.2.1 Explaining the Classifier

Our model \(\varPhi \) is a network trained to classify acoustic instruments. We compute the distortion with respect to the pre-softmax scores, i.e., we deploy \(d_1\) from Example 7 as the measure of distortion. We follow the obfuscation strategy described in Example 5 and train an inpainter G to generate the obfuscation G(h, s, z). Here, h corresponds to the representation of a signal, s is a binary mask, and z is a normally distributed seed to the generator.

We use a residual CNN architecture for G with added noise in the input and deep features. More details can be found in Sect. 4.2.3. We train G until the outputs are satisfactory, as exemplified by the outputs in Fig. 5.

Fig. 5.
figure 5

Inpainted Bass: Example inpainting from G. The bottom plot depicts phase versus frequency and the top plot depicts magnitude versus frequency. The random binary mask is represented by the green parts. The axes for the inpainted signal (black) and the original signal (blue dashed) are offset to improve visibility. Note how the inpainter generates plausible peaks in the magnitude and phase spectra, especially with regard to rapid (\(\ge \)600 Hz) versus smooth (<270 Hz) changes in phase. (Color figure online)

To compute the explanation maps, we numerically solve (\(\mathcal {P}_2\)) as discussed in Subsect. 3.2. In particular, s is a binary mask indicating whether the phase and magnitude information of a certain frequency should be dropped and is specified as a Bernoulli variable \(s \sim \text {Ber}(\theta )\). We chose a regularization parameter of \(\lambda = 50\) and minimized the corresponding objective using the Adam optimizer with a step size of \(10^{-5}\) in \(10^6\) iterations. For the concrete distribution, we used a temperature of 0.1. Two examples resulting from this process can be seen in Fig. 6.

Fig. 6.
figure 6

Interpreting NSynth Model: The optimized importance parameter \(\theta \) (green) overlayed on top of the DFT (blue). For each of guitar and bass, the top graph shows the power-normalized magnitude and the bottom the phase. Notice the solid peaks at 30 Hz and 60 Hz for guitar and at 100 Hz and 230 Hz for bass. These occur because the model relies on those parts of the spectra for the classification. Notice also how many parts of the spectrum are important even when the magnitude is near zero. This indicates that the model pays attention to whether those frequencies are missing. (Color figure online)

Notice here that the method shows a strong reliance of the classifier on low frequencies (30 Hz–60 Hz) to classify the top sample in Fig. 6 as a guitar, as only the guitar samples have this low-frequency slope in the spectrum. In contrast, classifying the bass sample relies more on the continuous signal between 100 Hz and 230 Hz.

4.2.2 Magnitude vs Phase

In the above experiment, we represented the signals by the magnitude and phase information at each frequency, hence the mask s acts on each frequency. Now we consider the interpretation query of whether the entire magnitude spectrum or the entire phase spectrum is more relevant for the prediction. Accordingly, we consider the representation discussed in Example 4 and apply the mask s to turn the whole magnitude spectrum or the whole phase spectrum on or off. Furthermore, we can optimize s not only for one datum but for all samples from a class. This extracts whether magnitude or phase is more important for predicting samples from a specific class.

For this, we again minimized (\(\mathcal {P}_2\)) (averaged over all samples of a class) with \(\theta \) as the Bernoulli parameter using the Adam optimizer for \(2 \times 10^5\) iterations with a step size of \(10^{-4}\) and the regularization parameter \(\lambda =30\). Again, a temperature of \(t=0.1\) was used for the concrete distribution.

From the results of these computations, which can be seen in Table 2, we observe a clear difference across instruments in what the classifier bases its decision on. The classification of most instruments is largely based on phase information. For the mallet, the values are low for both magnitude and phase, which means that the expected distortion is very low compared to the \(\ell _1\)-norm of the mask, even when the signal is completely inpainted. This underlines that the regularization parameter \(\lambda \) may have to be adjusted for different data instances, especially when measuring distortion in the pre-softmax scores.

4.2.3 Architecture of the Inpainting Network G

Here, we briefly describe the architecture of the inpainting network G that was used to generate obfuscations to the target signals. In particular, Fig. 7 shows the diagram of the network G and Table 3 shows information about its layers.

Table 2. Magnitude importance versus phase importance.

4.3 Radio Maps

In this subsection, we assume a set of transmitting devices (Tx) broadcasting a signal within a city. The received signal strength varies with location and depends on physical factors such as line of sight, reflection, and diffraction. We consider the regression problem of estimating a function that assigns the proper signal strength to each location in the city. Our dataset \(\mathcal {D}\) is RadioMapSeer [14], containing 700 maps, 80 Tx per map, and corresponding grayscale labels encoding the signal strength at every location. Our model \(\varPhi \) receives as input \(x = [x^{(0)},x^{(1)},x^{(2)}]\), where \(x^{(0)}\) is a binary map of the Tx locations, \(x^{(1)}\) is a noisy binary map of the city (where a few buildings are missing), and \(x^{(2)}\) is a grayscale image representing a number of ground truth measurements of the signal strength at the measured locations and zero elsewhere. We apply the UNet [13, 14, 22] architecture and train \(\varPhi \) to output an estimate of the signal strength throughout the city that interpolates the input measurements.

Apart from the model \(\varPhi \), we also have a simpler model \(\varPhi _0\), which only receives the city map and the Tx locations as inputs and is trained with unperturbed input city maps. This second model \(\varPhi _0\) will be deployed to inpaint measurements for the input to \(\varPhi \). See Fig. 8a, 8b, and 8c for examples of a ground truth map and the estimations of \(\varPhi \) and \(\varPhi _{0}\), respectively.

Fig. 7.
figure 7

Diagram of the inpainting network for NSynth.

Table 3. Layer table of the Inpainting model for the NSynth task.
Fig. 8.
figure 8

Radio map estimations: The radio map (gray), input buildings (blue), and input measurements (red). (Color figure online)

4.3.1 Explaining Radio Map \(\varPhi \)

Observe that in Fig. 8a there is a missing building in the input (the black one), and in Fig. 8b, \(\varPhi \) in-fills this building with a shadow. As a black-box method, it is unclear why it made this decision. Did it rely on signal measurements or on building patterns? To address this, we treat each building (as a cluster of pixels) and each measurement as potential targets for our mask \(s = [s^{(1)}, s^{(2)}]\), where \(s^{(1)}\) acts on buildings and \(s^{(2)}\) acts on measurements. We then apply matching pursuit (see Subsect. 3.2.3) to find a minimal mask s of critical components (buildings and measurements).

To be precise, suppose we are given a target input signal \(x = [x^{(0)}, x^{(1)}, x^{(2)}]\). Let \(k_1\) denote the number of buildings in \(x^{(1)}\) and \(k_2\) the number of measurements in \(x^{(2)}\). Consider the function \(f_1\) that takes as inputs vectors in \(\left\{ 0,1\right\} ^{k_1}\), which indicate the existence of buildings in \(x^{(1)}\), and maps them to the corresponding city map in the original city map format. Analogously, consider the function \(f_2\) that takes as input the measurements in \(\mathbb {R}^{k_2}\) and maps them to the corresponding grayscale image in the original measurement format. Then, \(f_1\) and \(f_2\) encode the locations of the buildings and measurements in the target signal \(x=[x^{(0)}, f_1(h^{(1)}), f_2(h^{(2)})]\), where \(h^{(1)}\) and \(h^{(2)}\) denote the building and measurement representations of x under \(f_1\) and \(f_2\). When \(s^{(1)}\) has a zero entry, i.e., a building in \(h^{(1)}\) was not selected, we replace the value in the obfuscation with zero (this corresponds to a constant perturbation equal to zero). Then, the obfuscation of the target signal x with a mask \(s=[s^{(1)}, s^{(2)}]\) and perturbations \(v=[v^{(1)}, v^{(2)}]:= [0, v^{(2)}] \) becomes:

$$\begin{aligned} y :=[x^{(0)}, f_1(s^{(1)}\odot h^{(1)}), f_2(s^{(2)}\odot h^{(2)}+ (1-s^{(2)})\odot v^{(2)})]. \end{aligned}$$

While it is natural to model masking out a building by simply zeroing out the corresponding cluster of pixels, i.e., choosing \(v^{(1)}=0\), we also need to properly choose \(v^{(2)}\) for the entries where the mask \(s^{(2)}\) takes the value 0, in order to obtain appropriate obfuscations. For this, we can deploy the second model \(\varPhi _0\) as an inpainter. We consider the following two extreme obfuscation strategies. The first is to also set \(v^{(2)}\) to zero, i.e., to simply remove the unchosen measurements from the input, with the underlying assumption being that any subset of measurements is valid for a city map. In the other extreme case, we inpaint all unchosen measurements by sampling at their locations the estimated radio map obtained by \(\varPhi _0\) based on the buildings selected by \(s^{(1)}\).
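The sketch below illustrates how such an obfuscation could be assembled in code; f1, f2, the model phi_0, and the helper sample_at_measurement_locations are hypothetical placeholders for the objects described above, and the input and output formats are assumptions rather than a specific implementation.

```python
# Minimal sketch of the radio-map obfuscation y (assumptions: `f1`, `f2`, `phi_0`, and
# `sample_at_measurement_locations` are hypothetical placeholders; shapes follow the
# description in the text, not a released implementation).
import torch

def obfuscate_radio_map(x0, h1, h2, s1, s2, f1, f2, phi_0,
                        sample_at_measurement_locations, inpaint=True):
    """Build y = [x0, f1(s1*h1), f2(s2*h2 + (1-s2)*v2)] for the two extreme strategies."""
    city = f1(s1 * h1)                              # keep only the selected buildings
    if inpaint:
        # sample the unchosen measurements from the radio map estimated by phi_0
        estimate = phi_0(torch.stack([x0, city]).unsqueeze(0))[0, 0]
        v2 = sample_at_measurement_locations(estimate)   # hypothetical helper
    else:
        v2 = torch.zeros_like(h2)                   # simply drop unchosen measurements
    meas = f2(s2 * h2 + (1 - s2) * v2)
    return torch.stack([x0, city, meas])
```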

The two extreme measurement completion methods correspond to two extremes of the interpretation query. Filling in the missing measurements with \(\varPhi _0\) tends to overestimate the strength of the signal because there are fewer buildings to obstruct the transmissions. The empty mask will complete all measurements to the maximal possible signal strength, i.e., the free-space radio map. The overestimation in signal strength is reduced when more measurements and buildings are chosen, resulting in darker estimated radio maps. Thus, this strategy is related to the query of which measurements and buildings are important to darken the free-space radio map, turning it into the radio map produced by \(\varPhi \). In the other extreme, adding more measurements to the mask with a fixed set of buildings typically brightens the resulting radio map. This allows us to answer which measurements are most important for brightening the radio map.

Between these two extreme strategies lies a continuum of completion methods where a random subset of the unchosen measurements is sampled from \(\varPhi _0\), while the rest are set to zero. Examples of explanations of a prediction \(\varPhi (x)\) according to these methods are presented in Fig. 9. Since we only care about specific small patches exemplified by the green boxes, the distortion here is measured with respect to the \(\ell _2\) distance between the output images restricted to the corresponding region (see also Example 8).

Fig. 9.
figure 9

Radio map queries and explanations: The radio map (gray), input buildings (blue), input measurements (red), and area of interest (green box). Middle represents the query “How to fill in the image with shadows”, while right is the query “How to fill in the image both with shadows and bright spots?”. We inpaint with \(\varPhi _0\). (Color figure online)

When the query is how to darken the free-space radio map (Fig. 9), the optimized mask s suggests that samples in the shadow of the missing building are the most influential in the prediction. These dark measurements lie in what would otherwise be the line-of-sight of a Tx, which indicates that the network deduced that there is a missing building. When the query is how to fill in the image both with shadows and bright spots (Fig. 9c), both samples in the shadow of the missing building and samples right before the building are influential. This indicates that the network used the bright measurements in line-of-sight and avoided predicting an overly large building. To understand the chosen buildings, note that \(\varPhi \) is based on a composition of UNets and can thus be interpreted as a procedure that extracts high-level and global information from the inputs to synthesize the output. The locations of the chosen buildings in Fig. 9 reflect this global nature.

4.3.2 Interpretation-Driven Training

We now discuss an example application of the explanation obtained by the RDE approach described above, called interpretation-driven training [23, 24, 28]. When a missing building is in line-of-sight of a Tx, we would like \(\varPhi \) to reconstruct this building by relying on samples in the shadow of the building rather than on patterns in the city. To reduce the reliance of \(\varPhi \) on the city information in this situation, one can add a regularization term to the training loss that promotes explanations relying on measurements. Suppose \(x = [x^{(0)}, x^{(1)}, x^{(2)}]\) contains a missing input building in line-of-sight of the Tx location and denote the subset of pixels of the missing building in the city map as \(J_x\). Denote the prediction of \(\varPhi \) restricted to the subset \(J_x\) as \(\varPhi _{J_x}\). Moreover, define \(\tilde{x} := [x^{(0)}, 0, x^{(2)}]\) to be the modification of x with all input buildings masked out. We then define the interpretation loss for x as

$$\begin{aligned} \ell _{\text {int}}(\varPhi , x) := \left\| \varPhi _{J_x}(x) - \varPhi _{J_x}(\tilde{x}) \right\| _2^2. \end{aligned}$$
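A minimal sketch of this loss is given below; phi, the three input channels, and the boolean mask jx_mask marking the pixels of the missing building are placeholders, and the model's input and output formats are assumptions.

```python
# Minimal sketch of the interpretation loss (assumptions: `phi` maps a stacked input
# of shape [1, 3, H, W] to a radio map of shape [1, 1, H, W]; `jx_mask` is a boolean
# [H, W] mask marking the pixels J_x of the missing building).
import torch

def interpretation_loss(phi, x0, x1, x2, jx_mask):
    """|| Phi_Jx(x) - Phi_Jx(x_tilde) ||_2^2 with all input buildings masked out in x_tilde."""
    x = torch.stack([x0, x1, x2]).unsqueeze(0)
    x_tilde = torch.stack([x0, torch.zeros_like(x1), x2]).unsqueeze(0)
    return ((phi(x)[0, 0][jx_mask] - phi(x_tilde)[0, 0][jx_mask]) ** 2).sum()
```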
Fig. 10.
figure 10

Radio map estimations, interpretation-driven training vs. vanilla training: The radio map (gray), input buildings (blue), input measurements (red), and domain of the missing building (green box). (Color figure online)

The interpretation-driven training objective then regularizes \(\varPhi \) during training by adding the interpretation loss for all inputs x that contain a missing input building in line-of-sight of the Tx location. An example comparison between the vanilla RadioUNet \(\varPhi \) and the interpretation-driven network \(\varPhi _{\text {int}}\) is given in Fig. 10.

5 Conclusion

In this chapter, we presented the Rate-Distortion Explanation (RDE) framework in a revised and comprehensive manner. Our framework is flexible enough to answer various interpretation queries by considering suitable data representations tailored to the underlying domain and query. We demonstrated the latter and the overall efficacy of the RDE framework on an image classification task, on an audio signal classification task, and on a radio map estimation task, a rarely explored regression task.