1 Introduction

The rapid development of miniaturized, high-resolution, low-cost image sensors, designed to operate in various lighting conditions, makes image enhancement and noise suppression very important operations of digital image processing.

There are various types of noise which affect acquisition and processing of digital color images. The disturbances may be introduced by [1,2,3,4,5,6]:

  • electric signal instabilities,

  • physical imperfections in sensors,

  • corrupted memory locations,

  • transmission errors,

  • aging of the storage material,

  • natural or artificial electromagnetic interferences.

Therefore, noise suppression is one of the most frequently performed low-level image processing tasks [1, 2, 5, 6]. There are plenty of techniques tailored to the suppression of distinct types of noise, but most of them are vulnerable to the occurrence of impulsive noise, which introduces significant deviations of color image channel values [7,8,9]. Consequently, the suppression of impulsive noise is a critical step of image preprocessing.

Impulsive noise removal techniques are contextual processing schemes which estimate the channels of the processed pixel using information obtained from its neighborhood, represented by a sliding operational window. Many of them are based on a vector-ordering scheme [10,11,12,13,14] and use cumulative distances between samples in the window as dissimilarity estimates. The accumulated distances are then sorted and constitute the basis for further processing in various filtering algorithms.

One of the most basic filtering techniques utilizing this ordering scheme is the vector median filter (VMF) [10, 15]. The output of the VMF is the pixel from the operational window for which the sum of distances to the other samples in the window is minimized. Although this filter does not introduce any new colors to the processed image, there is no guarantee that the output pixel is itself noise-free; thus, numerous solutions were developed to solve this problem and improve filtering performance [16,17,18,19,20,21].
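
The VMF selection rule can be illustrated by a minimal Python sketch (assuming pixels are given as plain RGB tuples; the names are illustrative, not part of the original formulation):

```python
# Minimal sketch of the vector median filter (VMF) selection rule:
# the output is the window pixel minimizing the sum of Euclidean
# distances to all other pixels of the window.
def vector_median(window):
    def dist(p, q):
        # Euclidean distance between two RGB pixels
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    return min(window, key=lambda p: sum(dist(p, q) for q in window))
```

For example, in a window containing two similar pixels and one color impulse, the impulse accumulates large distances and is never selected as the output.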

The main reason why the efficiency of vector-ordering schemes is limited lies in the processing of every image pixel, regardless of whether it is corrupted or not. Unnecessary processing of noise-free pixels results in inevitable degradation of image quality. To address this issue, a significant improvement has been made by the introduction of more sophisticated switching filters [22,23,24,25,26,27,28,29,30,31], which focus on the restoration of corrupted pixels only.

The switching techniques use various approaches to determine whether the processed pixel is corrupted or not. Then, only the pixels classified as noisy are further processed by the output estimation algorithm. This way, not only is the quality of the restored image better preserved, but a significant reduction of the computational cost is also often achieved.

There are numerous techniques of noisy pixel detection to be found in the literature [32,33,34]. These schemes can be categorized into the following families:

  • schemes based on reduced vector ordering [14, 35,36,37,38];

  • techniques using peer group concept [39,40,41,42];

  • filters utilizing quaternions [43,44,45];

  • methods based on fuzzy set theory [46,47,48,49,50,51,52,53].

The Fast Adaptive Switching Trimmed Arithmetic Mean Filter (FASTAMF), considered in this paper, has been proposed recently [54] and is a very efficient technique from both the noise suppression and the computational cost point of view. The main practical drawback of the algorithm (which it shares with its alternatives) is the necessity of manually adjusting a parameter to the severity of image noise contamination in order to achieve optimal performance. Therefore, the main goal of the research presented here is the introduction of a self-tuning mechanism, so that the manual, experimental choice of the main filter parameter (the threshold) is no longer required.

1.1 Notation

For better readability of the subsequent sections describing the design of the considered algorithm, the following notation is introduced:

  • \(\varvec{X}\)—input image (corrupted),

  • \(\varvec{x}_{u,v}\)—input image pixel located at spatial coordinates (u, v),

  • \(\varvec{\hat{X}}\)—output image (restored),

  • \(\hat{\varvec{x}}_{u,v}\)—output image pixel located at (u, v),

  • \(\varvec{O}\)—original image (reference image),

  • \(\varvec{o}_{u,v}\)—original image pixel located at (u, v),

  • M—original (reference) map of noise acquired during artificial image contamination,

  • \(m_{u,v}\)—real state of corruption of the pixel located at (u, v) (0—noisy; 1—noise-free),

  • \(\hat{M}\)—final estimated map of noise acquired during the noise detection phase,

  • \(\hat{m}_{u,v}\)—classification of contamination of the pixel located at (u, v) (0—noisy; 1—noise-free),

  • \(\varvec{W}\)—local operating window centered at \(\varvec{x}_{u,v}\), containing pixels from direct 8–neighborhood,

  • \(\varvec{x}_{i}\)—ith pixel of the local operating window \(\varvec{W}\) (the pixel \(\varvec{x}_{1}\) is the central pixel in \(\varvec{W}\)),

  • w—size of \(\varvec{W}\) (odd integer),

  • n—number of pixels in \(\varvec{W}\) (\(n=w \times w=w^2\)),

  • \(d(\varvec{x}_i,\varvec{x}_j)\)—distance between two pixels from \(\varvec{W}\),

  • \(\delta _{i}\)—distance between central pixel \(\varvec{x}_1\) and \(\varvec{x}_{i}\in \varvec{W}\),

  • \(\delta _{(r)}\)—rth smallest distance among all \(\delta _{i}\) computed for the same \(\varvec{W}\),

  • \(c_{u,v}\)—sum of \(\alpha\) smallest distances, representing raw impulsiveness of particular \(\varvec{x}_{u,v}\),

  • \(W_c\)—window containing values of raw impulsiveness computed for every pixel from local neighborhood of currently processed pixel \(\varvec{x}_{u,v}\),

  • \(c_{\min }\)—smallest accumulated distance in \(W_c\) representing simple estimate of image structure,

  • \(s_{u,v}\)—corrected impulsiveness measure of pixel at position uv,

  • k—iteration number in self-tuning (ST) procedure,

  • l—iteration number in multiple run test,

  • t—threshold value (filter parameter),

  • \(t_k\)—threshold value adjusted in the kth iteration of the self-tuning,

  • \(\text {AMF}(\varvec{W})\)—output of the Arithmetic Mean Filter,

  • \(\rho\)—true noise density used in artificial image contamination,

  • \(\hat{\rho }\)—estimated noise density obtained through noise detection phase,

  • \(\hat{\rho }_k\)—estimated noise density obtained in the kth iteration of self- tuning procedure,

  • \(\rho _R, \rho _G, \rho _B\)—probability of contamination of channels in RGB color space,

  • \(\rho _A\)—probability of contamination of all pixel channels at once,

  • \(\hat{M}_k\)—estimated map of noise obtained during kth iteration of self-tuning,

  • \(\mu , \nu\)—height and width of the image (in pixels),

  • \(\theta\)—number of pixels in \(\varvec{X}\) (\(\theta = \mu \times \nu\)),

  • \(n_k\)—number of pixels designated as noisy during kth iteration of self-tuning,

  • p—probability of error in statistical reasoning (result of a statistical test which allows one to retain or reject a null hypothesis),

  • \(k_{F}\)—final number of iterations of self-tuning procedure, required to satisfy the convergence condition.

In the above, three-dimensional arrays (e.g., images, operating windows) are denoted by bold capital letters, two-dimensional arrays (e.g., maps of noise) by regular capital letters, vectors (such as single pixels) by bold lowercase letters, and, finally, scalars by regular lowercase letters.

1.2 Impulsive noise models

In this paper, four different noise models are considered [55, 56]. In all of those models, the main parameter is the noise density (\(\rho\)) expressed by the percentage of corrupted pixels in the processed image:

  • Channel Together Random Impulse (CTRI)—if a pixel is noisy, all of its RGB channels are corrupted.

  • Channel Independent Random Impulse (CIRI)—if a pixel is contaminated, the alteration of every channel is independent.

  • Channel Correlated Random Impulse (CCRI)—if a pixel is contaminated, then the corruption of channels is correlated with a fixed correlation coefficient.

  • Custom Probability Random Impulse (CPRI)—if a pixel is contaminated, there is a fixed set of probabilities that single RGB channels are corrupted (\(\rho _R,\rho _G,\rho _B\)) or that all channels are corrupted together (\(\rho _A=1-(\rho _R+\rho _G+\rho _B)\)). The model does not take into account the corruption of two channels at once.

In all of the above models, a contaminated pixel channel is replaced by a random value taken from the full encoding range \(\left\langle 0,255\right\rangle\) (for 8-bit RGB image coding).
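
A minimal sketch of such an artificial contamination process, here for the CTRI model (assuming images stored as nested lists of RGB tuples; the function and variable names are illustrative):

```python
import random

# Sketch of the CTRI model: with probability rho a pixel is replaced by
# random values on all three RGB channels at once. Returns the noisy image
# and the noise map M (0 marks a noisy pixel, 1 a noise-free one).
def contaminate_ctri(image, rho, rng=random):
    noisy, noise_map = [], []
    for row in image:
        out_row, map_row = [], []
        for pixel in row:
            if rng.random() < rho:
                # corrupt all channels together (CTRI)
                out_row.append(tuple(rng.randint(0, 255) for _ in range(3)))
                map_row.append(0)  # noisy
            else:
                out_row.append(pixel)
                map_row.append(1)  # noise-free
        noisy.append(out_row)
        noise_map.append(map_row)
    return noisy, noise_map
```

The CIRI, CCRI, and CPRI models differ only in how the per-channel corruption decision is drawn, so the same skeleton applies.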

1.3 Performance measures

The noise detection efficiency alone can be evaluated as a binary classification. The result of noise detection, represented by the estimated noise map \(\hat{M}\), is compared to the original noise map M, established during the artificial image corruption and treated as the ground truth of noise occurrence. After the comparison, each pixel can be assigned to one of the following classes:

  • True positive (TP)—pixel was correctly recognized as being contaminated.

  • False positive (FP)—pixel was falsely classified as noisy—also known as Type-I error.

  • True negative (TN)—pixel was correctly recognized as not corrupted.

  • False negative (FN)—pixel was incorrectly classified as not noisy—also known as Type-II error.

After assigning one of the above states to every pixel in the image, the detection performance can be measured using the accuracy:

$$\begin{aligned} \text {ACC} = \frac{|\text {TP}|+|\text {TN}|}{|\text {TP}|+|\text {TN}|+|\text {FP}|+| \text {FN}|}, \end{aligned}$$
(1)

where \(|\text {TP}|,|\text {TN}|, |\text {FP}| \text { and } |\text {FN}|\) are the cardinalities of the pixel sets assigned to the particular categories.
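
The four counts and the accuracy (1) can be sketched as follows (assuming the paper's convention that 0 marks a noisy and 1 a noise-free pixel; names are illustrative):

```python
# Sketch of detection accuracy (1) computed from the true noise map M
# and the estimated map M_hat (0 = noisy, 1 = noise-free).
def detection_accuracy(true_map, est_map):
    tp = fp = tn = fn = 0
    for m_row, mh_row in zip(true_map, est_map):
        for m, mh in zip(m_row, mh_row):
            if m == 0 and mh == 0:
                tp += 1  # correctly detected noisy pixel
            elif m == 1 and mh == 0:
                fp += 1  # false alarm (Type-I error)
            elif m == 1 and mh == 1:
                tn += 1  # correctly recognized clean pixel
            else:
                fn += 1  # missed noisy pixel (Type-II error)
    return (tp + tn) / (tp + tn + fp + fn)
```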

The overall noise suppression efficiency can be evaluated using many different performance measures. In this paper, we consider the following:

$$\begin{aligned} \text {PSNR}= & {} 10\log _{10}{\frac{255^2}{\text {MSE}}}, \end{aligned}$$
(2)
$$\begin{aligned} \text {MSE}= & {} \frac{1}{3\theta } \sum _{u=1}^{\mu }\sum _{v=1}^{\nu }\sum _{q\in Z}^{} (o_{u,v}^{q}-\hat{x}_{u,v}^{q})^2, \end{aligned}$$
(3)

where \(o_{u,v}^{q}\), \(q\in Z=\{R,G,B\}\), are the channel values of the original image pixels and \(\hat{x}_{u,v}^{q}\) of the restored image pixels:

$$\begin{aligned} \text {MAE}= & {} \frac{1}{3\theta } \sum _{u=1}^{\mu }\sum _{v=1}^{\nu }\sum _{q\in Z}^{} \left| o_{u,v}^{q}-\hat{x}_{u,v}^{q}\right| , \end{aligned}$$
(4)
$$\begin{aligned} \text {NCD}= & {} \frac{\sum _{u=1}^{\mu }\sum _{v=1}^{\nu } \sqrt{\varDelta E} }{\sum _{u=1}^{\mu }\sum _{v=1}^{\nu }\sqrt{L_{u,v}^2+a_{u,v}^2+b_{u,v}^2}}, \end{aligned}$$
(5)
$$\begin{aligned} \varDelta E= & {} \left( L_{u,v}\!-\!\hat{L}_{u,v}\right) ^2\!+\!\left( a_{u,v}\!-\!\hat{a}_{u,v}\right) ^2\!+\! \left( b_{u,v}\!-\!\hat{b}_{u,v}\right) ^2, \end{aligned}$$
(6)

where L, a, b are the color coordinates of the original and \(\hat{L}, \hat{a}, \hat{b}\) of the restored image pixels, both in the CIE Lab color space [1].

In addition, the Feature-SIMilarity index for color images (FSIMc) [57] was used to provide additional information about noise suppression performance. In contrast to the Peak Signal-to-Noise Ratio (PSNR), Mean Absolute Error (MAE), and Normalized Color Difference (NCD), which operate on individual pixels and thus compare images in a context-free manner, the FSIMc index is based on properties of the human visual system.
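
The pixel-wise measures (2)–(4) can be sketched directly from their definitions (a minimal Python sketch for 8-bit RGB images stored as nested lists of tuples; NCD and FSIMc are omitted, as they require a color space conversion and the FSIM machinery, respectively):

```python
import math

# Sketches of MSE (3), PSNR (2), and MAE (4); theta is the number of pixels.
def mse(original, restored):
    theta = len(original) * len(original[0])
    s = sum((o - x) ** 2
            for row_o, row_x in zip(original, restored)
            for pix_o, pix_x in zip(row_o, row_x)
            for o, x in zip(pix_o, pix_x))
    return s / (3 * theta)

def psnr(original, restored):
    # 255 is the peak value for 8-bit channels
    return 10 * math.log10(255 ** 2 / mse(original, restored))

def mae(original, restored):
    theta = len(original) * len(original[0])
    s = sum(abs(o - x)
            for row_o, row_x in zip(original, restored)
            for pix_o, pix_x in zip(row_o, row_x)
            for o, x in zip(pix_o, pix_x))
    return s / (3 * theta)
```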

2 Original FASTAMF algorithm

Fig. 1 Block diagram of original FASTAMF

The FASTAMF [54] algorithm is composed of two main processing phases (Fig. 1):

  1. noise detection—the noise map (\(\hat{M}\)) is estimated from the input image (\(\varvec{X}\)), using the reduced ordering scheme and two parameters provided by the user: the operating window size (w) and the threshold (t).

  2. pixel replacement—the output image (\(\varvec{\hat{X}}\)) is obtained from the input image (\(\varvec{X}\)) and the noise map (\(\hat{M}\)) provided by the noise detection phase. Only pixels classified as noisy are processed by the AMF with operating window size w.

The filter operates on every pixel of the input image \(\varvec{X}\) located at coordinates (u, v), denoted as \(\varvec{x}_{u,v}\), using the operational window \(\varvec{W}\) containing \(n = w^2\) samples. Pixels in \(\varvec{W}\) are denoted \(\varvec{x}_1, \ldots , \varvec{x}_{n}\), and \(\varvec{x}_{1} = \varvec{x}_{u,v}\) is the center pixel of \(\varvec{W}\) (Fig. 2).

Fig. 2 Notation of the pixels in the filtering window

2.1 Noise detection

The noise detection phase is composed of the following steps:

  I. Evaluation of pixel impulsiveness begins with the computation of the dissimilarity measure \(d(\varvec{x}_1,\varvec{x}_i)\) between the central pixel and every other pixel contained in \(\varvec{W}\), denoted as \(\delta _{i}\). Originally, the Euclidean distance was used, but many other dissimilarity measures can be applied instead [11]. For example, in [58], the authors show that the use of the Chebyshev distance (\(L_\infty\)) improves detection performance, since it makes the algorithm more sensitive to outliers occurring on individual channels. Next, the distances \(\delta _i\) (excluding \(\delta _1\), which is equal to 0) are sorted in ascending order: \(\delta _{2}, \ldots , \delta _{n} \longrightarrow \delta _{(1)}, \ldots , \delta _{(n-1)},\) and the trimmed sum of the \(\alpha =2\) smallest distances is computed for pixel \(\varvec{x}_{u,v}\):

    $$\begin{aligned} c_{u,v} = \sum _{r=1}^{\alpha } \delta _{(r)}. \end{aligned}$$
    (7)

    The value \(c_{u,v}\) can be interpreted as the raw impulsiveness of the pixel (Fig. 3).

  II. Adaptation to local image variation is performed. For every \(c_{u,v}\), a window \(W_{c}\) containing n values \(c_{i}\) is taken, so that \(c_{1}=c_{u,v}\) is in the center of that window. The corrected measure of pixel impulsiveness (Fig. 4) assigned to pixel \(\varvec{x}_{u,v}\) is then obtained as:

    $$\begin{aligned} s_{u,v} = c_{u,v}-c_{\min }, \end{aligned}$$
    (8)

    where \(c_{\min } = \min \{c \in W_c\}\). This correction normalizes the impulsiveness with respect to the local image variation. In homogeneous image regions, \(c_{\min }\) is close to 0, and it rises together with the variation in the local neighborhood. As a result, pixels of high raw impulsiveness located in highly textured regions of the image are less likely to be classified as noisy than pixels of the same raw impulsiveness in smooth areas.

  III. Noise map acquisition finalizes the noise detection phase, during which the estimated noise map \(\hat{M}\) is obtained. It is achieved by comparing \(s_{u,v}\) with the threshold t, provided by the user, for every pixel of the image \(\varvec{X}\) as follows:

    $$\begin{aligned} \hat{m}_{u,v}= \left\{ \begin{array}{ll} 0 &{} \text { if } \; s_{u,v}> t, \\ 1 &{} \; \text {otherwise}. \end{array} \right. \end{aligned}$$
    (9)

    Labeling noisy pixels as 0 in \(\hat{M}\) is required by the subsequent pixel replacement phase.

Fig. 3 Computation of pixel raw impulsiveness c (for two exemplary operating windows: \(W_1\) and \(W_2\))

Fig. 4 Transition from raw impulsiveness to estimated map of noise (using estimate of the image structure for impulsiveness correction)

Fig. 5 Block diagram of self-tuning FASTAMF

2.2 Pixel replacement

In the pixel replacement phase, the output image is obtained according to the following rule:

$$\begin{aligned} \varvec{\hat{x}}_{u,v}= \left\{ \begin{array}{ll} \text {AMF}(\varvec{W}) &{} \text { if }\; \hat{m}_{u,v} = 0, \\ \varvec{x}_{u,v} &{} \; \text {otherwise}, \end{array} \right. \end{aligned}$$
(10)

where AMF(\(\varvec{W}\)) is the arithmetic mean computed only over those members of \(\varvec{W}\) which were designated as noise-free. On rare occasions (occurring for very high noise densities), when there are no noise-free pixels in \(\varvec{W}\), the output is determined using the VMF scheme.
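
A minimal sketch of this replacement rule for a 3×3 window (illustrative names; the VMF fallback implements the rule described above):

```python
# Sketch of the pixel replacement rule (10): a pixel flagged as noisy is
# replaced by the arithmetic mean of the noise-free pixels of its 3x3 window;
# if no noise-free pixel exists, the vector median of the window is used.
def amf_replace(image, m_hat, u, v):
    if m_hat[u][v] == 1:
        return image[u][v]  # noise-free pixel: keep unchanged
    clean = [image[u + du][v + dv]
             for du in (-1, 0, 1) for dv in (-1, 0, 1)
             if m_hat[u + du][v + dv] == 1]
    if not clean:
        # rare case: no noise-free pixel in the window -> VMF fallback
        window = [image[u + du][v + dv]
                  for du in (-1, 0, 1) for dv in (-1, 0, 1)]
        def dist(p, q):
            return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
        return min(window, key=lambda p: sum(dist(p, q) for q in window))
    n = len(clean)
    return tuple(round(sum(ch) / n) for ch in zip(*clean))
```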

3 Self-tuning

As shown in Fig. 1, there are three inputs to the algorithm: the processed image (\(\varvec{X}\)), the threshold (t), and the operational window size (w). While w is an intuitive parameter to adjust, the proper choice of t may be difficult. It was shown in [54, 58] that the optimal choice of t depends on the impulsive noise density (\(\rho\)), which is usually unknown in real-case scenarios, so the operator is forced to search experimentally for an adequate value of t.

To free the user from manually adjusting this parameter, a self-tuning modification is introduced. The main concept of this improvement is to use the estimated noise map \(\hat{M}\) (obtained during the noise detection phase) to compute the estimated noise density \(\hat{\rho }\). Combining \(\hat{\rho }\) with proper tuning characteristics, such as those provided in [58], makes it possible to adjust the value of t, which can then be used to obtain a more accurate noise map.

3.1 Algorithm

Fig. 6 Training image set

Based on the aforementioned idea, the self-tuning modification is introduced (Fig. 5). Before execution of the algorithm, the threshold t is set to the initial value \(t_1\) = 60 (the recommended value of t for FASTAMF using the Chebyshev distance, obtained experimentally). Then, the noise detection is performed until the corrected impulsiveness measure is obtained for every pixel of the input image \(\varvec{X}\).

Table 1 Optimal and recommended t values for CIRI noise model
Table 2 Optimal and recommended t values for CCRI noise model

Next, the iterative procedure of automatic t adjustment is performed in the following steps:

  (a) The estimated map of noise for the current iteration, \(\hat{M}_k\), is obtained by (9) using \(t_k\).

  (b) The estimated noise density for the current iteration, \(\hat{\rho }_k\), is evaluated as \(\hat{\rho }_k = n_k/\theta\), where \(n_k\) is the number of pixels designated as corrupted in the kth iteration and \(\theta\) is the number of pixels in the image \(\varvec{X}\).

  (c) The value \(t_{k+1}\) is interpolated (simple linear interpolation between the two closest entries) using the tuning tables (see Table 4), which were obtained using the procedure presented in Sect. 3.2.

Steps (a)–(c) are repeated in a loop until the desired convergence \(|t_{k+1}-t_{k}|< \epsilon\) is achieved, where \(\epsilon\) controls the convergence tolerance. In all experiments presented in this paper, \(\epsilon =1\) was used.
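
The loop (a)–(c) can be sketched as follows (the `TUNING` table below is a hypothetical stand-in for Table 4, not the published values; `s_map` holds the corrected impulsiveness of every pixel):

```python
# Hypothetical stand-in for the tuning table: (noise density, threshold) pairs.
TUNING = [(0.0, 60.0), (0.1, 55.0), (0.3, 45.0), (0.5, 35.0), (0.8, 25.0)]

def interpolate_t(rho_hat):
    """Linear interpolation between the two nearest table entries, as in
    step (c); clamps to the last entry above the table range."""
    for (rho_a, t_a), (rho_b, t_b) in zip(TUNING, TUNING[1:]):
        if rho_a <= rho_hat <= rho_b:
            slope = (t_a - t_b) / (rho_a - rho_b)
            return slope * rho_hat + (t_a - slope * rho_a)
    return TUNING[-1][1]

def self_tune(s_map, theta, t1=60.0, eps=1.0, max_iter=50):
    """Repeat steps (a)-(c) until |t_{k+1} - t_k| < eps; returns the final t.
    Only the thresholding step is repeated, so each iteration is cheap."""
    t = t1
    for _ in range(max_iter):
        n_k = sum(1 for row in s_map for s in row if s > t)  # step (a)
        rho_hat = n_k / theta                                 # step (b)
        t_next = interpolate_t(rho_hat)                       # step (c)
        if abs(t_next - t) < eps:
            return t_next
        t = t_next
    return t
```

With the hypothetical table above, a map in which 20% of the pixels exceed the initial threshold converges in two iterations to the tabulated value for \(\hat{\rho }=0.2\).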

Table 3 Optimal and recommended t values for CTRI noise model
Table 4 Tuning values for threshold t

The final estimated map of noise, denoted as \(\hat{M}\), is then taken as the input to the pixel replacement phase. It is important to note that only the final step of the entire noise detection phase (9) has to be repeated in each iteration, so the increase in computational cost is not significant.

Although it might be tempting to design a similar solution for the adaptive tuning of the operating window size w, this is impractical for the following reasons:

  • It is very intuitive to choose the value of w, and using windows larger than \(3\times 3\) is reasonable only for high noise intensities (\(\rho>50\%\)).

  • The window size w has a critical impact on the computational cost of the algorithm, so its automatic on-the-fly tuning would make the execution time highly unpredictable.

  • Alteration of w during the algorithm's execution requires repetition of the entire noise detection phase, which is very costly from a computational point of view. Therefore, such a tuning algorithm would be inapplicable to real-time image processing tasks.

  • Preliminary tests (omitted in the paper) revealed that w has a stronger impact on the performance of the noise detection phase than on the pixel replacement phase. Therefore, a partial solution, in which the on-the-fly altered w is used for the pixel replacement phase only, resulted in no improvement of the restored image quality.

3.2 Tuning tables

Fig. 7 Validation image set. The images are numbered from 1 (top left) to 10 (bottom right)

The core of the self-adjusting threshold modification is the tuning Table 4, which provides the t values for the interpolation step (c). Originally, this table was proposed for the CTRI and CIRI noise models in [58]; however, in this paper, more thorough experiments were performed to obtain a more general insight into the problem.

A set of 100 color images was taken as the training set (Fig. 6) [59]. Each of those images was artificially contaminated with the CIRI, CCRI (correlation coefficient set to 0.5), and CTRI models for noise densities \(\rho \in \left\{ 0.1, 1, 5, 10, 15, \ldots , 80\%\right\}\). Finally, for each contaminated image, an optimization was performed to find the optimal value of t, for which ACC, PSNR, and FSIMc are maximal and MAE and NCD are minimal.
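
The per-image search can be sketched generically (the `score` callback below is a hypothetical stand-in for any of the measures above evaluated for a given t on a given contaminated image):

```python
# Sketch of the per-image optimization: sweep candidate thresholds and keep
# the one maximizing a quality score (e.g., ACC or PSNR of the restored image
# against the original; for MAE/NCD the negated measure would be maximized).
def optimal_threshold(score, t_values):
    return max(t_values, key=score)
```

Averaging the optima obtained this way over all training images, models, and densities yields one tuning entry per noise density, as reported in Tables 1, 2, 3, and 4.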

The mean values (and standard deviations) of the optimal t, computed over the entire set of training images for a chosen performance measure, noise model, and noise density, are presented in Tables 1, 2, and 3. The final proposition of general tuning values, obtained as an average of the results from all experiments, is shown in Table 4.

4 Noise suppression performance

Fig. 8 Algorithm's performance in subsequent iterations for validation images 1 and 8 (for which STF performs better)

Our aim was to provide the most objective noise suppression performance test; therefore, a new set of ten images was taken as the validation set (Fig. 7) [60]. In addition, all of those validation images were contaminated with CPRI noise (\(\rho _R=\rho _G=\rho _B=\rho _A=0.25\)), which had not been used for obtaining the tuning Table 4. This way, we provided an independent test input, further minimizing the possibility that the tuning values are optimized for a particular image set or noise model.

4.1 FASTAMF compared with state-of-the-art algorithms

While the original FASTAMF algorithm [54] operated on the Euclidean distance, the new version uses its Chebyshev counterpart. Therefore, a new comparison with state-of-the-art filters is required. This time, we decided to restrict the comparison base to the four filters which were found to be the most competitive, using their recommended parameter settings:

  • FASTAMF with recommended \(t=60\).

  • ACWVMF [61] with \(\lambda =2\) and \(Tol= 80\).

  • FAPGF [42] with \(d=0.1\) and \(\gamma =0.8\).

  • FFNRF [48] with \(K=1024\) and \(\alpha =3.5\).

  • FPGF [41] with \(m=3\) and \(d=45\).

For all tested algorithms, the operating window size was set to \(w=3\), and the comparison was performed for noise densities \(\rho \in \left\{ 10, 20, \ldots , 50\%\right\}\). For each corrupted image from the validation set and for each tested algorithm, the noise suppression was performed, and the PSNR, MAE, NCD, and FSIMc measures were calculated. The results were grouped by measure and noise density and compared using statistical tests.

For all results in each test group, Friedman's test [62] was performed. Two opposing hypotheses were considered:

  • H0: There is no evidence that results for all algorithms are significantly heterogeneous.

  • H1: There is evidence that results for all algorithms are significantly heterogeneous.
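
Under these hypotheses, the Friedman statistic can be computed from the within-image ranks alone (a pure-Python sketch; ties are not rank-averaged here, and the p-value is then obtained from the \(\chi ^2\) distribution with \(k-1\) degrees of freedom):

```python
# Sketch of the Friedman test statistic: results[i][j] is the quality measure
# of algorithm j on image i (higher = better, as for PSNR or FSIMc).
def friedman_statistic(results):
    n = len(results)      # number of images (blocks)
    k = len(results[0])   # number of algorithms (treatments)
    rank_sums = [0.0] * k
    for row in results:
        # rank 1 = best algorithm on this image; ties are not averaged here
        order = sorted(range(k), key=lambda j: row[j], reverse=True)
        for rank0, j in enumerate(order):
            rank_sums[j] += rank0 + 1
    return 12.0 * sum(r * r for r in rank_sums) / (n * k * (k + 1)) \
        - 3.0 * n * (k + 1)
```

When one algorithm dominates on every image, the statistic reaches its maximum \(n(k-1)\); identical rankings across images are the strongest evidence against H0.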

For each group for which H0 was discarded in favor of H1, the set of post hoc tests proposed by Nemenyi [62] was performed, comparing FASTAMF with each other algorithm. For those tests, the following hypotheses were stated:

  • H2: There is no evidence that FASTAMF performs significantly better than the compared algorithm.

  • H3: There is evidence that FASTAMF performs significantly better than the compared algorithm.

Table 5 Friedman’s test results—FASTAMF compared with state-of-the-art filters
Table 6 Post hoc test results—FASTAMF compared with state-of-the-art filters (emboldened results speak against FASTAMF superiority)

The results of the above tests are summarized in Table 5 (Friedman's tests) and in Table 6 (post hoc tests). Emboldened values in the tables are those that do not support the superiority of FASTAMF (and they are in the minority). In the case of the PSNR and FSIMc measures, higher values are better, so higher mean rank values support the superiority of a particular algorithm. The opposite holds for the MAE and NCD measures, which are better if smaller.

The following conclusions can be drawn:

  • The results obtained for all algorithms (represented by the quality measures) were always heterogeneous (H0 was discarded in favor of H1 in every case). In addition, p in each Friedman's test was very low, so the differences in the results are unquestionably significant.

  • For every measure and noise density, the best mean ranks were observed for FASTAMF algorithm, which means that it was the best or almost the best for every tested image.

  • Only very few of the post hoc tests resulted in favor of the H2 hypothesis. For those rare cases, we can state that FASTAMF is not significantly better than the compared algorithm. In every other case, however, it is the best performing algorithm among those tested.

  • For low noise densities, ACWVMF tends to be a competitive alternative to FASTAMF, while, for higher noise contamination ratios, FAPGF provides the most similar results.

4.2 Self-tuning FASTAMF against original FASTAMF

The main goal of the self-tuning feature is to free the user from the experimental selection of the threshold, which is optimal for a given noise density. Therefore, the self-tuning FASTAMF (further denoted as STF) is compared with the original FASTAMF (further denoted as OF, for Original Filter) with the recommended \(t=60\). This time, only two algorithms were compared, so Wilcoxon's test [62] was used (not every sample has a normal distribution, so the t test cannot be performed). The following hypotheses were formulated:

  • H0: There is not enough evidence that STF provides significantly better results.

  • H1: There is enough evidence that STF provides significantly better results.

The results are presented in Table 7. This time, all emboldened values support the superiority of the STF algorithm. The results can be summarized as follows:

  • Larger Positive Sums of Ranks for the PSNR and FSIMc quality measures indicate that the STF algorithm performs better (the values of the measure are more frequently higher). Such an outcome can be observed for \(\rho \ge 20\%\) for both measures.

  • Smaller Positive Sums of Ranks for the NCD and MAE measures indicate that the STF algorithm performs better (the values of the measure are more frequently lower). Such an outcome can be observed for MAE when \(\rho \ge 40\%\) and for all results evaluated with NCD.

  • For \(\rho \ge 40\%\), the STF algorithm performs significantly better than OF from the NCD and PSNR points of view.

The results obtained are very satisfactory. It is expected that the ST modification is not significantly better for lower \(\rho\) values, since the recommended fixed t is well suited to such scenarios. In addition, as \(\rho\) becomes higher, the better performance of STF becomes more noticeable, because this algorithm automatically adjusts its threshold toward the optimal value.

4.3 Multi-run and visual comparison

One of the common approaches to achieving good noise suppression performance is to repeat the processing of the noisy picture several times, using the output image as the input for the next execution of the algorithm. This way, noisy pixels omitted during the first filtering may be detected and restored during subsequent runs. However, this approach may lead to stronger degradation of image details, especially if the algorithm has adaptive features.

The noise suppression scheme (further referred to as multi-run or MR) was performed for three iterations (\(l=1,2,3\)) on all validation images for noise densities \(\rho \in \left\{ 10, 30, 50\%\right\}\), and four representative images were selected for detailed comparison (validation images 1, 7, 8, and 9).

To provide a fair comparison, two images for which the OF algorithm achieved better performance in the MR test (validation images 7 and 9) were contrasted with two images for which STF provided better results (validation images 1 and 8). The OF scheme was applied for three iterations with the same recommended \(t=60\), while the STF algorithm calculated the threshold value automatically in each iteration.

Fig. 9 Algorithm's performance in subsequent iterations for validation images 7 and 9 (for which OF performs better)

Fig. 10 Visual comparison of OF and STF performance for \(\rho = 30\%\) and validation image 1

Fig. 11 Visual comparison of OF and STF performance for \(\rho = 30\%\) and validation image 7

The efficiency of both algorithms in terms of the PSNR and FSIMc measures is presented in Table 8. In addition, the visual comparison of both filtering schemes for \(l=1\), \(l=3\), and \(\rho = 30\%\) is depicted in Figs. 10, 11, 12, and 13.

As can be seen, the STF algorithm performs better for the “easier” tasks (validation images 1 and 8), which are meager in detail and have large homogeneous regions (Figs. 10 and 12). The threshold value t is well adjusted in the first execution (Fig. 8), and it is then automatically set to a higher value in subsequent runs. This is caused by the low estimated noise density after the first noise suppression and is beneficial for detail preservation: the STF algorithm does not try to repair less explicit outliers, which might be image details.

Fig. 12 Visual comparison of OF and STF performance for \(\rho = 30\%\) and validation image 8

Fig. 13 Visual comparison of OF and STF performance for \(\rho = 30\%\) and validation image 9

Fig. 14 Execution time and performance of ST version in comparison to the original FASTAMF algorithm

For the harder tasks, however (validation images 7 and 9), which are rich in small details (Figs. 11 and 13), the large value of t set by ST might be too high to enable the restoration of omitted noisy pixels (Fig. 9). The OF algorithm with fixed t shows higher efficiency, as it continues to restore pixels of the same impulsiveness.

Table 7 Wilcoxon's tests for STF and OF comparison (emboldened results denote a significant superiority of ST)
Table 8 Multi-run test (emboldened results are superior in each contamination rate)
Table 9 The number of iterations (\(k_{\text {F}}\)), required to achieve the ST convergence

We can observe that although the self-tuning feature of the algorithm is very convenient and may achieve better statistical performance, especially if the noise density is unknown or non-stationary, it may achieve slightly inferior efficiency compared with the fixed-t version for more complicated images.

A more detailed analysis of the zoomed regions of the selected validation images (Figs. 10, 11, 12, 13) shows that:

  • the OF removes less noisy pixels for \(l=1\) than its STF counterpart (Fig. 10c, d), due to higher value of t. If the reason behind those leftovers is high variance of the local area, those are mostly removed in subsequent iterations (Fig. 10e). If a too high t value caused this omission, those will not be restored, no matter how many iterations will be performed. In addition, fixed t value makes OF completely insensitive to less explicit noisy pixels, which is reflected in numerical (PSNR) and structural (FSIMc) measures;

  • the STF scheme removes more noisy pixels in the first execution (Fig. 10d), strongly decreasing the local variance of the image. Consequently, it is easier to remove omitted noisy pixels in subsequent iterations (Fig. 10f), and there is also lower count of less explicit noisy pixels due to lower t in the first run;

  • the STF scheme tends to remove more details (Fig. 11d, f) from the image (it is more blurry), than OF (Fig. 11c, e). As long as it is hard to be noticed visually without zoom, it clearly affects numerical (PSNR) and also structural (FSIMc) measures;

  • undetected by OF noisy pixels during the first execution (Fig. 12c), may cause low-level distortions around them, which will not be repaired in subsequent iterations (Fig. 12e). Such phenomenon is far less noticeable if STF scheme is used (Fig. 12d, f).

  • the STF scheme tends to remove more details in the most difficult cases (Fig. 13d, f), which is reflected mostly in the PSNR measure.

It has to be pointed out that images 1 and 8 contain large homogeneous regions, which makes the adjustment of t easier. In contrast, the high local variance of regions occurring in images 7 and 9 makes t tuning harder. Although a very large number of distinct images was used as the training set for obtaining the tuning Table 4, the local variance of the image was not taken into account, nor is it measured in any way during t adjustment. Such an approach was considered and tested in the early stages of STF development, but it was very computationally expensive and therefore not applicable to real-time implementations.

The visual comparison shows that the STF algorithm seems to always achieve better noise suppression efficiency (fewer explicit leftovers can be noticed). Therefore, the lower PSNR values might be caused by very subtle differences, which can be detected on the numerical level only.

5 Efficiency

5.1 Computational complexity

A detailed analysis of the computational complexity of the algorithm is presented in [54]; therefore, in this paper, it has been performed for the ST modification only. Self-tuning begins after the computation of the corrected impulsiveness and in each iteration k requires:

  1. Estimation of the noise map \(\hat{M}_k\), which requires \(\mu \times \nu\) comparisons (COMPS). This step has linear complexity.

  2. Estimation of the noise density \(\hat{\rho }_k\), for which \(\mu \times \nu\) additions (ADDS) and one division (DIVS) are necessary. This step also has linear complexity.

  3. Linear interpolation of t:

    $$\begin{aligned} t_k=\frac{t_A-t_B}{\rho _A-\rho _B}\hat{\rho }_k+\left( t_A- \frac{t_A-t_B}{\rho _A-\rho _B}\rho _A\right) , \end{aligned}$$
    (11)

    where A and B are the indices of the nearest values in Table 4 for which \(\rho _A \le \hat{\rho }_k \le \rho _B\). It demands 5 subtractions (SUBS), 2 divisions (DIVS), 2 multiplications (MULTS), 1 addition (ADDS), and up to 18 comparisons (COMPS), required for the determination of A and B. This step does not depend on the image size, so it can be treated as a step with constant computational complexity.

  4. Convergence condition check, which also has constant computational complexity and requires one subtraction and two comparisons.
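The four steps above can be sketched as follows; this is a minimal illustration, assuming a hypothetical tuning table of (rho, t) pairs and a precomputed per-pixel impulsiveness map. The names `interpolate_t`, `self_tune`, and the convergence tolerance `eps` are illustrative, not part of the original algorithm specification:

```python
def interpolate_t(rho_hat, table):
    """Eq. (11): linear interpolation of t between the two nearest
    tuning-table entries A and B with rho_A <= rho_hat <= rho_B.
    `table` is a list of (rho, t) pairs sorted by rho (hypothetical)."""
    for (rho_a, t_a), (rho_b, t_b) in zip(table, table[1:]):
        if rho_a <= rho_hat <= rho_b:
            slope = (t_a - t_b) / (rho_a - rho_b)
            return slope * rho_hat + (t_a - slope * rho_a)
    # outside the table range: clamp to the nearest endpoint
    return table[0][1] if rho_hat < table[0][0] else table[-1][1]


def self_tune(impulsiveness, t_init, table, eps=0.5, k_max=10):
    """One self-tuning run: repeat steps 1-4 until t converges.
    `impulsiveness` is a mu x nu grid of corrected impulsiveness values."""
    t_k = t_init
    for k in range(1, k_max + 1):
        # Step 1: noise map M_k -- one comparison per pixel (linear).
        noise_map = [[v > t_k for v in row] for row in impulsiveness]
        # Step 2: noise density rho_k -- mu*nu additions, one division.
        mu, nu = len(impulsiveness), len(impulsiveness[0])
        rho_hat = sum(map(sum, noise_map)) / (mu * nu)
        # Step 3: interpolate the new threshold (Eq. 11).
        t_next = interpolate_t(rho_hat, table)
        # Step 4: convergence check -- one subtraction, two comparisons.
        if abs(t_next - t_k) < eps:
            return t_next, k
        t_k = t_next
    return t_k, k_max
```

Each iteration touches every pixel once, which matches the linear per-iteration cost derived above; the interpolation and convergence check are constant-time.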

The remaining issue is the number of iterations required to achieve the desired convergence. For every image in the training set, every noise model (CTRI, CIRI, and CCRI), and every noise density \(\rho \in \left\{ 0.1, 1, 5, 10, 15, \ldots , 80\%\right\}\), the STF algorithm has been executed, and the number of iterations (denoted as \(k_{\text {F}}\)) needed to satisfy the convergence condition has been recorded. The results are presented in Table 9. It can be seen that \(k_{\text {F}}\) is very stable and has an almost deterministic value.

The computational cost of a single iteration of the ST modification is not very high and depends linearly on the image size. The number of iterations required to achieve the final t value is fairly low and predictable, so this modification is a suitable addition to the FASTAMF algorithm in terms of real-time image processing requirements.

5.2 Experimental comparison

The execution time and noise suppression efficiency of FASTAMF with the ST modification have been compared to those of the original FASTAMF (with the recommended \(t=60\)). In the tests, the ten-image validation set (Fig. 7), contaminated with the CPRI model at noise densities \(\rho \in \left\{ 10, 20, 30\%\right\}\), was used.

The noise suppression efficiency has been evaluated with the PSNR, MAE, NCD, and FSIMc measures. Since the tuning tables were obtained as a trade-off between those measures, Fig. 14 presents the most favorable (NCD) and the most adverse (PSNR) outcomes of using the ST modification. The vertical axis shows the difference in the particular measure, while the horizontal one shows the change in execution time (in percent). The point (100, 0) refers to all results obtained using the original FASTAMF algorithm, and the marked points represent the results obtained for the ST version on the ten test images.

Interestingly, in individual cases for \(\rho = 10\%\), the ST version might be even faster than the original algorithm, because the threshold value is calculated to be higher than \(t=60\). As a consequence, fewer pixels are recognized as noisy, and the noise suppression stage (AMF) has less work to do.

The major conclusion is that the results of the ST modification become better as the noise density increases.

6 Summary

The achieved denoising results are very satisfactory, since the reduction of the number of FASTAMF parameters and a better overall performance have been the main goals of this research. The new self-tuning FASTAMF achieves slightly better, or at least not worse, overall performance than the original algorithm, yet it has no parameters that require experimental adjustment.

The computational cost of the self-tuning is also not significantly higher, since it works after the most computationally expensive part of the algorithm (the estimation of pixel impulsiveness).

The major virtue of the self-tuning FASTAMF is its adaptability to the noise density. While it is useful for filtering images contaminated by impulsive noise of unknown density, it might be even more advantageous for processing video sequences distorted by noise with time-dependent parameters. In such implementations, the initial value of t (for the self-tuning mechanism) can be carried over from frame to frame, decreasing the number of iterations required to achieve the desired convergence. The application of the proposed filtering scheme to video enhancement will be the subject of future work.