
SN Applied Sciences, 1:1587

New symmetric decorrelating set-membership NLMS adaptive algorithms for blind speech intelligibility enhancement

  • Mohamed Djendi
  • Abdelhak Cheffi
Research Article
Part of the following topical collections:
  1. Engineering: Signal Processing

Abstract

This paper addresses the problem of speech enhancement and acoustic noise reduction by partial and set-membership adaptive algorithms combined with the symmetric adaptive decorrelating (SAD) algorithm structure. We propose two new adaptive algorithms based on the set-membership principle that improve the behavior of the original set-membership algorithm in speech enhancement applications. The first proposed algorithm (called Proposed 1) combines the SAD algorithm structure with a smart control that exploits the decorrelation properties between the output and the mixing signals to control the updates of the SAD adaptive filters. The second proposed algorithm (called Proposed 2) is a modification of Proposed 1 based on new regularization relations for the SAD adaptive filters that combine the variances of the mixing and output signals of the SAD structure. These two proposed algorithms (Proposed 1 and Proposed 2) aim to improve the convergence speed and the output signal-to-noise ratio of the original SAD algorithm, which uses no smart control of the adaptive filters. The proposed algorithms have very interesting properties with non-stationary signals such as speech, for which the SAD algorithm, used alone, fails. Simulation results comparing the proposed algorithms (Proposed 1 and Proposed 2) with the original two-channel set-membership NLMS algorithm show the superior performance of Proposed 1 and Proposed 2 in terms of the following criteria: system mismatch, segmental SNR, and segmental mean square error.

Keywords

Speech enhancement · SAD · NLMS · Set-membership · Adaptive algorithm · Convergence speed

Abbreviations

Fs

Sampling frequency

SNR

Signal to noise ratio

LMS

Least-mean-squares

NLMS

Normalized least-mean-squares

AWGN

Additive white Gaussian noise

WGN

White Gaussian noise

USASI

United States of America Standard Institute

CSP

Convergence speed performance

MSE

Mean square error

BSS

Blind source separation

CD

Cepstral distance

FBSS

Forward BSS

SFT

Short Fourier transform

SegMSE

Segmental mean-square-error

SegSNR

Segmental signal-to-noise ratio

SM

System mismatch

PNLMS

Proportionate NLMS

SAD

Symmetric adaptive decorrelating

SMF

Set-membership filtering

AP

Affine projection

SM-AP

Set-membership affine projection

TC-SM-NLMS

Two-channel set membership NLMS

List of symbols

L

Length of the real impulse responses

k

Time index

dB

Decibel

\(s\left( n \right)\)

Original speech signal

\(b\left( n \right)\)

Noise

\(h_{12} \left( n \right)\), \(h_{21} \left( n \right)\)

Cross-coupling impulse responses

\(p_{1} \left( n \right)\), \(p_{2} \left( n \right)\)

Noisy speech signals

\(\delta \left( n \right)\)

Dirac impulse

\(w_{12} \left( n \right)\), \(w_{21} \left( n \right)\)

Adaptive filters

\(u_{1} \left( n \right)\)

Estimated speech by forward structure

\(u_{2} \left( n \right)\)

Estimated noise by forward structure

\(H_{1} \left( k \right)\) and \(H_{2} \left( k \right)\)

Constraint sets

\(\mu_{1}\) and \(\mu_{2}\)

Step-sizes

\(N\)

Adaptive filter length

\(\gamma_{1}\) and \(\gamma_{2}\)

Fixed threshold error

\(SNR1\), \(SNR2\)

Signal-to-noise ratios at the two inputs

\(w_{12}^{opt} \left( k \right)\), \(w_{21}^{opt} \left( k \right)\)

Optimal filters

\(\mu_{c1} \left( k \right)\), \(\mu_{c2} \left( k \right)\)

Control step-sizes of Proposed 1

\(\mu_{v1} \left( k \right)\), \(\mu_{v2} \left( k \right)\)

Control step-sizes of Proposed 2

\(\beta_{1}\), \(\beta_{2}\)

Smoothing parameters of Proposed 2

1 Introduction

The adaptive filtering principle has been widely used in many areas of research over the past decades, and several algorithms based on different types of criteria have been proposed in the literature [1, 2]. Several conventional adaptive techniques assume that the problem to be solved has a linear form [3, 4]; this assumption, which does not hold in practice, reduces the efficiency of the adaptive approach and limits the optimal solutions that can be achieved [5, 6].

The most popular adaptive filtering algorithms, such as the least mean square (LMS) and the normalized LMS (NLMS) algorithms, are robust and have a low computational complexity [7, 8]. The adaptive filter updates of the LMS and NLMS algorithms are directly controlled by the input vector [9, 10]. This property makes them very limited in terms of performance with non-stationary signals such as speech but, on the other hand, very convenient for dispersive impulse-response-type systems [11, 12, 13]. Several techniques and algorithms based on these principles have been proposed to improve the behavior of the LMS and NLMS algorithms, such as the convergence speed in the transient regime, under similar conditions [14]. In [15, 16], the proportionate NLMS (PNLMS) algorithm was proposed; it recovers the convergence speed property of NLMS, but PNLMS algorithms are more complex than NLMS. The complexity problem of PNLMS algorithms was resolved by the set-membership filtering (SMF) family [17, 18]. In the set-membership algorithm family, the output filtering control parameter is limited by a predetermined threshold [19]. Methods based on set-membership (SM) approaches exist for linear and nonlinear models [20, 21, 22]. In the set-membership context, the estimated unknown system encloses the set of solutions, or a union of disconnected subsets of solutions, produced by the uncertainty of the experimental data and the predefined system error boundaries [23]. Moreover, set-membership algorithms achieve fast convergence as well as a low steady-state error. Several low-complexity SMF techniques have been proposed in the literature, such as the set-membership NLMS (SM-NLMS) [24] and the set-membership affine projection (SM-AP) [25] algorithms.

In this paper, we propose two new adaptive algorithms based on the set-membership principle that improve the behavior of the original set-membership algorithm in speech enhancement applications. The first proposed algorithm (called Proposed 1) combines the SAD algorithm structure with a smart control that exploits the decorrelation properties between the output and the mixing signals to control the SAD filter updates. The second proposed algorithm (called Proposed 2) is a modification of Proposed 1 based on new regularization relations for the SAD adaptive filters. The two proposed algorithms (Proposed 1 and Proposed 2) aim to improve the convergence speed and the output signal-to-noise ratio of the original SAD algorithm without a control system.

This paper is organized as follows: In Sect. 2, the mixing model used in the simulations is presented. In Sect. 3, we present the proposed algorithms, i.e. Proposed 1 and Proposed 2. Section 4 contains the simulation part, with experiments on three algorithms, i.e. the classical (original) TC-SM-NLMS, Proposed 1, and Proposed 2. All experiments are expressed in terms of three objective criteria: SegSNR, SegMSE, and SM. In Sect. 5, we conclude our work.

2 Two channel mixing model

In Fig. 1, we present the mixing model used in this paper. This model mixes two uncorrelated sources, the speech signal \(s\left( k \right)\) and the point noise \(b\left( k \right)\), as shown in Fig. 1 [26], where \(h_{12} \left( k \right)\) and \(h_{21} \left( k \right)\) are the cross-coupling impulse responses. The observed signals \(p_{1} \left( k \right)\) and \(p_{2} \left( k \right)\) of this model are:
$$p_{1} \left( k \right) = s\left( k \right) + b\left( k \right)*h_{21} \left( k \right)$$
(1)
$$p_{2} \left( k \right) = b\left( k \right) + s\left( k \right)*h_{12} \left( k \right)$$
(2)
where * represents the linear convolution operator.
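As a minimal illustration of the mixing model, Eqs. (1) and (2) can be sketched in Python/NumPy as follows; the function name and the truncation of the convolutions to the signal length are our own assumptions, not part of the original paper:

```python
import numpy as np

def mix_two_channels(s, b, h12, h21):
    """Two-channel convolutive mixing model of Eqs. (1)-(2).

    s, b     : source speech and noise signals (1-D arrays of equal length)
    h12, h21 : cross-coupling impulse responses (1-D arrays)
    Returns the observed signals p1(k) and p2(k), truncated to len(s).
    """
    n = len(s)
    # Eq. (1): p1(k) = s(k) + b(k) * h21(k), with * the linear convolution
    p1 = s + np.convolve(b, h21)[:n]
    # Eq. (2): p2(k) = b(k) + s(k) * h12(k)
    p2 = b + np.convolve(s, h12)[:n]
    return p1, p2
```

As a quick sanity check, when both cross-coupling responses are unit impulses the model degenerates to \(p_{1} = p_{2} = s + b\).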
Fig. 1

The two channel mixing model

3 Proposed TC-SM-NLMS algorithms

In this section, we present the mathematical derivation of the two proposed algorithms, i.e.
  (i) the two channel correlated set-membership NLMS (TC-SM-NLMScor) algorithm, which we call, in this paper, "Proposed 1";

  (ii) the two channel correlated set-membership regularized NLMS (TC-SM-RNLMScor) algorithm, which we call "Proposed 2".
First, the mixing signals \(p_{1} \left( k \right)\) and \(p_{2} \left( k \right)\), obtained from the model of Fig. 1, are the inputs of the forward BSS structures shown in Figs. 2 and 3. According to these figures, the output signals \(u_{1} \left( k \right)\) and \(u_{2} \left( k \right)\) are estimated as follows:
$$u_{1} \left( k \right) = p_{1} \left( k \right) - p_{2} \left( k \right)*w_{21} \left( k \right)$$
(3)
$$u_{2} \left( k \right) = p_{2} \left( k \right) - p_{1} \left( k \right)*w_{12} \left( k \right)$$
(4)
where \(w_{12} \left( k \right)\) and \(w_{21} \left( k \right)\) are the two adaptive filters. Inserting Eqs. (1) and (2) into (3) and (4), respectively, and using the optimality assumption for both adaptive filters, \({\text{i}} . {\text{e}} .\, w_{12}^{opt} \left( k \right) = h_{12} \left( k \right)\) and \(w_{21}^{opt} \left( k \right) = h_{21} \left( k \right)\), we get the following output relations:
$$u_{1} \left( k \right) = s\left( k \right)*\left[ {\delta \left( k \right) - h_{12} \left( k \right)*w_{12} \left( k \right)} \right]$$
(5)
$$u_{2} \left( k \right) = b\left( k \right)*\left[ {\delta \left( k \right) - h_{21} \left( k \right)*w_{21} \left( k \right)} \right]$$
(6)
where \(\delta \left( k \right)\) is the Kronecker delta. According to relations (5) and (6), we conclude that the outputs \(u_{1} \left( k \right)\) and \(u_{2} \left( k \right)\) converge, respectively, to the original signals \(s\left( k \right)\) and \(b\left( k \right)\) with distortions that can be corrected by post-filters [27]. However, the observed distortions are more critical for closely spaced microphones. In this paper we consider loosely spaced microphones, for which the observed distortions can be ignored [28, 31]. Both proposed TC-SM-NLMS algorithms minimize \(\left\| {\varvec{w}_{21} \left( k \right) - \varvec{w}_{21} \left( {k - 1} \right)} \right\|^{2}\) and \(\left\| {\varvec{w}_{12} \left( k \right) - \varvec{w}_{12} \left( {k - 1} \right)} \right\|^{2}\), subject to a zero a posteriori error condition. A further condition is to check whether the adaptive filter estimates \(\varvec{w}_{21} \left( k \right)\) and \(\varvec{w}_{12} \left( k \right)\) lie outside the constraint sets \(\varvec{H}_{1} \left( k \right)\) and \(\varvec{H}_{2} \left( k \right)\), respectively. We can write:
$$\begin{aligned} \varvec{H}_{1} \left( k \right) & = \left\{ {\varvec{w}_{21} \left( k \right) \in R^{N} \varvec{ }:\varvec{ }\left| {p_{1} \left( k \right) - p_{2} \left( k \right)*w_{21} \left( k \right)} \right| \le \gamma_{1} } \right\} \\ & = \left\{ {\varvec{w}_{21} \left( k \right) \in R^{N} \varvec{ }:\varvec{ }\left| {u_{1} \left( k \right)} \right| \le \gamma_{1} } \right\} \\ \end{aligned}$$
(7)
$$\begin{aligned} \varvec{H}_{2} \left( k \right) & = \left\{ {\varvec{w}_{12} \left( k \right) \in R^{N} \varvec{ }:\varvec{ }\left| {p_{2} \left( k \right) - p_{1} \left( k \right)*w_{12} \left( k \right)} \right| \le \gamma_{2} } \right\} \\ & = \left\{ {\varvec{w}_{12} \left( k \right) \in R^{N} \varvec{ }:\varvec{ }\left| {u_{2} \left( k \right)} \right| \le \gamma_{2} } \right\} \\ \end{aligned}$$
(8)
where \(N\) is the adaptive filter length, and \(\gamma_{1}\) and \(\gamma_{2}\) are the threshold error constants.
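The forward-BSS outputs of Eqs. (3)-(4) and the membership tests of Eqs. (7)-(8) can be sketched per sample as follows; the function names and the "newest first" ordering of the regressor vectors are our own conventions:

```python
import numpy as np

def fbss_outputs(p1_vec, p2_vec, w12, w21):
    """Forward-BSS outputs of Eqs. (3)-(4) at one time index k.

    p1_vec, p2_vec : the last N samples of p1 and p2, newest first
    w12, w21       : adaptive filter coefficient vectors (length N)
    """
    u1 = p1_vec[0] - np.dot(w21, p2_vec)  # u1(k) = p1(k) - w21^T p2-vector
    u2 = p2_vec[0] - np.dot(w12, p1_vec)  # u2(k) = p2(k) - w12^T p1-vector
    return u1, u2

def inside_constraint_set(u, gamma):
    """Membership test of Eqs. (7)-(8): the filter already lies in the
    constraint set (no update needed) when |u(k)| <= gamma."""
    return abs(u) <= gamma
```

With zero-valued filters, the outputs simply reproduce the newest mixture samples, which makes the gating behavior easy to verify.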
Fig. 2

The structure of Proposed 1

Fig. 3

The structure of Proposed 2

3.1 Presentation of Proposed 1

In Proposed 1 (see Fig. 2), we propose new relations for the step-size updates \(\mu_{c1} \left( k \right)\) and \(\mu_{c2} \left( k \right)\), as given by relations (9) and (10), respectively:
$$\mu_{c1} \left( k \right) = \left\{ {\begin{array}{*{20}l} {\mu_{1} - \frac{{\gamma_{1} }}{{\left\| {u_{2} \left( k \right)\varvec{p}_{1}^{\varvec{T}} \left( k \right)} \right\| + \delta }}} \hfill & {{\text{if}}\, u_{2} \left( k \right) > \gamma_{1} } \hfill \\ 0 \hfill & {else} \hfill \\ \end{array} } \right.$$
(9)
$$\mu_{c2} \left( k \right) = \left\{ {\begin{array}{*{20}l} {\mu_{2} - \frac{{\gamma_{2} }}{{\left\| {u_{1} \left( k \right)\varvec{p}_{2}^{\varvec{T}} \left( k \right)} \right\| + \delta }}} \hfill & {if\,u_{1} \left( k \right) > \gamma_{2} } \hfill \\ 0 \hfill & {else} \hfill \\ \end{array} } \right.$$
(10)

In relation (9), the parameter \(\delta\) is a small constant introduced to avoid division by zero. We can also see from (9) that the step-size \(\mu_{c1} \left( k \right)\) is sensitive to the correlation between \(\varvec{p}_{1} \left( k \right)\) and \(u_{2} \left( k \right)\), and from relation (10) that the step-size \(\mu_{c2} \left( k \right)\) is sensitive to the correlation between \(\varvec{p}_{2} \left( k \right)\) and \(u_{1} \left( k \right)\). These relations mean that the two output signals \(u_{1} \left( k \right)\) and \(u_{2} \left( k \right)\) converge to their optimal solutions, i.e. \(w_{12}^{opt} \left( k \right) = h_{12} \left( k \right)\) and \(w_{21}^{opt} \left( k \right) = h_{21} \left( k \right)\), provided that the step-size \(\mu_{c1} \left( k \right)\) takes large values (close to 1) when speech is absent and small values (close to 0) when speech is present. Under this condition, the FBSS structure combined with our proposed algorithm (Proposed 1) provides a recovered speech signal without noise at the output \(u_{1} \left( k \right)\). The same conclusions hold for the step-size \(\mu_{c2} \left( k \right)\), but since we are only interested in the output \(u_{1} \left( k \right)\), we focus on this output.

The new update relations of the adaptive filters \(\varvec{w}_{12} \left( k \right)\) and \(\varvec{w}_{21} \left( k \right)\) are given as follows:
$$\varvec{w}_{12} \left( k \right) = \left\{ {\begin{array}{*{20}l} {\varvec{w}_{12} \left( {k - 1} \right) + \mu_{c1} \left( k \right)\frac{{u_{2} \left( k \right)\varvec{p}_{1} \left( k \right)}}{{\left\| {\varvec{p}_{1} \left( k \right)} \right\|^{2} + \delta }}} \hfill & { if\,\left| {u_{2} \left( k \right)} \right| > \gamma_{1} } \hfill \\ {\varvec{w}_{12} \left( {k - 1} \right)} \hfill & {else} \hfill \\ \end{array} } \right.$$
(11)
$$\varvec{w}_{21} \left( k \right) = \left\{ {\begin{array}{*{20}l} {\varvec{w}_{21} \left( {k - 1} \right) + \mu_{c2} \left( k \right)\frac{{u_{1} \left( k \right)\varvec{p}_{2} \left( k \right)}}{{\left\| {\varvec{p}_{2} \left( k \right)} \right\|^{2} + \delta }}} \hfill & {if\,\left| {u_{1} \left( k \right)} \right| > \gamma_{2} } \hfill \\ {\varvec{w}_{21} \left( {k - 1} \right) } \hfill & {else} \hfill \\ \end{array} } \right.$$
(12)
The summary of the Proposed 1 algorithm is given by Table 1.
Table 1

Proposed 1
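One iteration of Proposed 1, combining the step-size control of Eqs. (9)-(10) with the filter updates of Eqs. (11)-(12), could be sketched as below. This is our own illustrative reading, not the authors' code; `eps` stands in for the small constant \(\delta\), and the regressor vectors hold the last N mixture samples, newest first:

```python
import numpy as np

def proposed1_update(w12, w21, p1_vec, p2_vec,
                     mu1, mu2, gamma1, gamma2, eps=1e-8):
    """One iteration of Proposed 1 (Eqs. (9)-(12)).

    Returns updated copies of w12, w21 and the outputs u1(k), u2(k).
    """
    u1 = p1_vec[0] - np.dot(w21, p2_vec)   # Eq. (3)
    u2 = p2_vec[0] - np.dot(w12, p1_vec)   # Eq. (4)
    if abs(u2) > gamma1:                   # gate of Eq. (11)
        # Eq. (9): mu_c1 grows toward mu1 as ||u2(k) p1(k)|| increases
        mu_c1 = mu1 - gamma1 / (np.linalg.norm(u2 * p1_vec) + eps)
        w12 = w12 + mu_c1 * u2 * p1_vec / (np.dot(p1_vec, p1_vec) + eps)
    if abs(u1) > gamma2:                   # gate of Eq. (12)
        # Eq. (10): same control for the second filter
        mu_c2 = mu2 - gamma2 / (np.linalg.norm(u1 * p2_vec) + eps)
        w21 = w21 + mu_c2 * u1 * p2_vec / (np.dot(p2_vec, p2_vec) + eps)
    return w12, w21, u1, u2
```

Note that very large thresholds freeze both filters, while very small thresholds reduce the recursion to a standard NLMS-like update.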

3.2 Presentation of Proposed 2

In the second proposed algorithm (Proposed 2, see Fig. 3), we propose new relations for the step-sizes \(\mu_{v1} \left( k \right)\) and \(\mu_{v2} \left( k \right)\), given as follows:
$$\mu_{v1} \left( k \right) = \left\{ {\begin{array}{*{20}l} {\mu_{1} - \frac{{\gamma_{1} \left( {\beta_{1} + \beta_{2} } \right)}}{{\beta_{1} \left\| {\varvec{p}_{1} \left( k \right)} \right\|^{2} + \beta_{2} \left\| {u_{2} \left( k \right)\varvec{p}_{1} \left( k \right)} \right\|^{2} + \delta }} } \hfill & {if\,u_{2} \left( k \right) > \gamma_{1} } \hfill \\ 0 \hfill & {else} \hfill \\ \end{array} } \right.$$
(13)
$$\mu_{v2} \left( k \right) = \left\{ {\begin{array}{*{20}l} {\mu_{2} - \frac{{\gamma_{2} \left( {\beta_{1} + \beta_{2} } \right)}}{{\beta_{1} \left\| {\varvec{p}_{2} \left( k \right)} \right\|^{2} + \beta_{2} \left\| {u_{1} \left( k \right)\varvec{p}_{2} \left( k \right)} \right\|^{2} + \delta }}} \hfill & { if\,u_{1} \left( k \right) > \gamma_{2} } \hfill \\ 0 \hfill & {else} \hfill \\ \end{array} } \right.$$
(14)
We can see that the new relations for both step-sizes \(\mu_{v1} \left( k \right)\) and \(\mu_{v2} \left( k \right)\) are controlled by two parameters, \(\beta_{1}\) and \(\beta_{2}\). These two constants (both between 0 and 1) balance the control of \(\mu_{v1} \left( k \right)\) between \(\left\| {\varvec{p}_{1} \left( k \right)} \right\|^{2}\) and \(\left\| {u_{2} \left( k \right)\varvec{p}_{1} \left( k \right)} \right\|^{2}\). In the case of the step-size \(\mu_{v2} \left( k \right)\), the control is balanced between \(\left\| {\varvec{p}_{2} \left( k \right)} \right\|^{2}\) and \(\left\| {u_{1} \left( k \right)\varvec{p}_{2} \left( k \right)} \right\|^{2}\). The balancing parameters are chosen such that \(\beta_{1} + \beta_{2} = 1\). This means that the choice of \(\beta_{1}\) directly affects \(\beta_{2}\), and the variances of the balanced quantities controlled by the pair (\(\beta_{1} , \beta_{2}\)) remain unchanged, i.e. there is no amplification of the balanced quantities in the proposed relations (13) and (14). An example of the control of \(\beta_{1}\) is: \(\beta_{1} = 1\) if \(\left| {u_{2} \left( k \right)} \right| < 10 \gamma_{1}\) or \(\left| {u_{1} \left( k \right)} \right| < 10\gamma_{2}\). The goal of relations (13) and (14) is to drive the step-sizes \(\mu_{v1} \left( k \right)\) and \(\mu_{v2} \left( k \right)\) to opposite values in the range between 0 and 1. This means that the proposed control yields the following pairs of values:
$$\mu_{v1} \left( k \right) = \left\{ {\begin{array}{*{20}l} { \approx 1 \to {\text{Activity}}\,{\text{speech}}\,{\text{signal}}\,{\text{periods}}} \hfill \\ { \approx 0 \to Inactivity\,speech\,signal\,periods} \hfill \\ \end{array} } \right.$$
(15)
$$\mu_{v2} \left( k \right) = \left\{ {\begin{array}{*{20}l} { \approx 0 \to {\text{Activity }}\,{\text{speech}}\,{\text{signal}}\,{\text{periods}}} \hfill \\ { \approx 1 \to Inactivity\,speech\,signal\,periods} \hfill \\ \end{array} } \right.$$
(16)
Under conditions (15) and (16), we propose a further technique to make the adaptation of both filters \(\varvec{w}_{12} \left( k \right)\) and \(\varvec{w}_{21} \left( k \right)\) automatic. We propose the new update relations of both filters as follows:
$$\varvec{w}_{12} \left( k \right) = \left\{ {\begin{array}{*{20}l} {\varvec{w}_{12} \left( {k - 1} \right) + \mu_{v1} \left( k \right)\frac{{u_{2} \left( k \right)\varvec{p}_{1} \left( k \right)}}{{\left\| {\varvec{p}_{1} \left( k \right)} \right\|^{2} + \left\| {\varvec{u}_{2} \left( k \right)} \right\|^{2} + \delta }} ;} \hfill & { if\,\left| {u_{2} \left( k \right)} \right| > \gamma_{1} } \hfill \\ {\varvec{w}_{12} \left( {k - 1} \right)} \hfill & {else} \hfill \\ \end{array} } \right.$$
(17)
$$\varvec{w}_{21} \left( k \right) = \left\{ {\begin{array}{*{20}l} {\varvec{w}_{21} \left( {k - 1} \right) + \mu_{v2} \left( k \right) \frac{{u_{1} \left( k \right)\varvec{p}_{2} \left( k \right)}}{{\left\| {\varvec{p}_{2} \left( k \right)} \right\|^{2} + \left\| {\varvec{u}_{1} \left( k \right)} \right\|^{2} + \delta }} ;} \hfill & {if\,\left| {u_{1} \left( k \right)} \right| > \gamma_{2} } \hfill \\ {\varvec{w}_{21} \left( {k - 1} \right) } \hfill & { else } \hfill \\ \end{array} } \right.$$
(18)
The variable step-sizes \(\mu_{v1} \left( k \right)\) and \(\mu_{v2} \left( k \right)\) are given by relations (13) and (14), respectively. As we can see, we have added the quantities \(u_{2}^{2} \left( k \right)\) and \(u_{1}^{2} \left( k \right)\) to the denominators of the update terms of \(\varvec{w}_{12} \left( k \right)\) and \(\varvec{w}_{21} \left( k \right)\), i.e. relations (17) and (18), respectively. We recall that the constant \(\delta\) is introduced to avoid division by zero during speech inactivity periods. These modifications give more assurance of staying close to relations (15) and (16) and lead to an automatic adaptive algorithm that is both robust and fast (Table 2).
Table 2

Proposed 2
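Similarly, one iteration of Proposed 2 (Eqs. (13)-(14) for the step-sizes, Eqs. (17)-(18) for the updates) might look as follows; again a sketch under our own conventions, with scalar outputs \(u_{1}(k)\), \(u_{2}(k)\) so that \(\left\| u(k) \right\|^{2}\) reduces to \(u^{2}(k)\):

```python
import numpy as np

def proposed2_update(w12, w21, p1_vec, p2_vec, mu1, mu2,
                     gamma1, gamma2, beta1, beta2, eps=1e-8):
    """One iteration of Proposed 2 (Eqs. (13)-(18)), with beta1 + beta2 = 1."""
    u1 = p1_vec[0] - np.dot(w21, p2_vec)   # Eq. (3)
    u2 = p2_vec[0] - np.dot(w12, p1_vec)   # Eq. (4)
    if abs(u2) > gamma1:
        # Eq. (13): denominator balances ||p1||^2 and ||u2 p1||^2 via beta1, beta2
        den = (beta1 * np.dot(p1_vec, p1_vec)
               + beta2 * np.linalg.norm(u2 * p1_vec) ** 2 + eps)
        mu_v1 = mu1 - gamma1 * (beta1 + beta2) / den
        # Eq. (17): normalization regularized by the output energy u2^2(k)
        w12 = w12 + mu_v1 * u2 * p1_vec / (np.dot(p1_vec, p1_vec) + u2 ** 2 + eps)
    if abs(u1) > gamma2:
        # Eq. (14): same balanced control for the second step-size
        den = (beta1 * np.dot(p2_vec, p2_vec)
               + beta2 * np.linalg.norm(u1 * p2_vec) ** 2 + eps)
        mu_v2 = mu2 - gamma2 * (beta1 + beta2) / den
        # Eq. (18)
        w21 = w21 + mu_v2 * u1 * p2_vec / (np.dot(p2_vec, p2_vec) + u1 ** 2 + eps)
    return w12, w21, u1, u2
```

The extra output-energy terms in the denominators automatically slow the adaptation when the corresponding output is strong, which is the mechanism described by relations (15) and (16).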

4 Simulation results of the proposed algorithms (Proposed 1 and 2)

The simulation results of the proposed algorithms are presented in this section. We compare the performance of the proposed algorithms (Proposed 1 and 2) with the classical TC-SM-NLMS algorithm. The comparison is based on the evaluation of stability and convergence speed performance. Two types of noise are used in the simulations: white Gaussian noise (WGN) and United States of America Standard Institute (USASI) noise, which has a spectrum similar to that of speech. The sampling frequency is \(Fs = 8\,{\text{kHz}}\). The simulation parameters of each algorithm are summarized in Table 3.
Table 3

Simulation parameters of TC-SM-NLMS and the proposed algorithms (i.e. Proposed 1 and 2)

Common to TC-SM-NLMS [2], Proposed 1, and Proposed 2:
  Step-sizes: \(\mu_{1} = \mu_{2} = 0.2, 0.4\) and 0.6
  Adaptive filter length: \(N = 64, 128\) and 256
  Fixed threshold error: \(\gamma_{1} = \gamma_{2} = 20\) and 80
  Input SNRs: \(SNR1 = SNR2 = 3\) dB

Proposed 2 only:
  Smoothing parameters: \(\beta_{1} = \beta_{2} = 0.5\)

In Fig. 4, we give an example of two impulse responses (IRs) used in the simulations; the IRs are generated according to the model proposed in [29]. The original speech signal with its manual voice activity detector (MVAD), together with a noisy speech signal, are given in Fig. 5.
Fig. 4

A sample of impulse responses (\(\varvec{h}_{12} \left( k \right)\) in the left, \(\varvec{h}_{21} \left( k \right)\) in the right), generated by the model of [29]

Fig. 5

Original speech signal and its MVAD (left), and noisy signal (right)

In order to evaluate the convergence speed performance of the proposed algorithms (Proposed 1 and Proposed 2), we have used three objective criteria [30].

(i) The system mismatch (SM): it is calculated by the following relation:
$$SM_{dB} = 20\,Log_{10} \left( {\frac{{\left\| {\varvec{h}_{21} \left( k \right) - \varvec{w}_{21} \left( k \right)} \right\|}}{{\left\| {\varvec{h}_{21} \left( k \right)} \right\|}}} \right)$$
(19)
where \(h_{21} \left( k \right)\) and \(w_{21} \left( k \right)\) are, respectively, the real impulse response and the adaptive filter vectors.
(ii) The segmental mean square error (SegMSE): It is evaluated as the following:
$$SegMSE_{dB} = 10\,Log_{10} \left( {\frac{1}{M}\mathop \sum \limits_{k = 0}^{M - 1} E\left[ {u_{1}^{2} \left( k \right)} \right]} \right)$$
(20)
where \(M\) is the frame length over which the output signal \(u_{1} \left( k \right)\) is averaged, and \(E\) is the expectation operator. We note that this criterion is computed only during the voice inactivity (silence) periods.
(iii) The segmental signal to noise ratio (SegSNR): it is given by the following relation:
$$SegSNR_{dB} = \frac{1}{P}\mathop \sum \limits_{p = 0}^{P - 1} \left[ {10Log_{10} \left( {\frac{{\mathop \sum \nolimits_{k = 0}^{M - 1} \left| {s\left( k \right)} \right|^{2} }}{{\mathop \sum \nolimits_{k = 0}^{M - 1} \left| {s\left( k \right) - u_{1} \left( k \right)} \right|^{2} }}} \right)} \right]$$
(21)
where \(s\left( k \right)\) and \(u_{1} \left( k \right)\) are the original speech signal and the enhanced one at the output of each algorithm, respectively. The parameter \(M\) is the frame length used to average the output SegSNR criterion, and \(P\) is the number of speech-only frames.
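The three criteria can be sketched in Python/NumPy as follows; a minimal illustration with hypothetical helper names, assuming non-overlapping frames of fixed length and a small constant to guard the logarithms:

```python
import numpy as np

def system_mismatch_db(h, w):
    """SM criterion of Eq. (19): normalized misalignment between the
    real impulse response h and the adaptive filter w, in dB."""
    return 20.0 * np.log10(np.linalg.norm(h - w) / np.linalg.norm(h))

def seg_mse_db(u1, frame_len):
    """SegMSE of Eq. (20): mean square of the output u1 per silence frame, in dB."""
    frames = u1[:len(u1) // frame_len * frame_len].reshape(-1, frame_len)
    return 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)

def seg_snr_db(s, u1, frame_len):
    """SegSNR of Eq. (21), averaged over P full speech frames."""
    P = len(s) // frame_len
    vals = []
    for p in range(P):
        seg_s = s[p * frame_len:(p + 1) * frame_len]
        seg_e = seg_s - u1[p * frame_len:(p + 1) * frame_len]
        vals.append(10.0 * np.log10(np.sum(seg_s ** 2)
                                    / (np.sum(seg_e ** 2) + 1e-12)))
    return float(np.mean(vals))
```

For example, an adaptive filter equal to 90% of the true response gives a system mismatch of exactly −20 dB, since the normalized misalignment is 0.1.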

4.1 System mismatch (SM) criterion evaluation

We have evaluated the SM criterion for the TC-SM-NLMS, Proposed 1, and Proposed 2 algorithms. We focus only on the output \(u_{1} \left( k \right)\) and the adaptive filter \(w_{21} \left( k \right)\). The SM criterion is computed according to relation (19). All simulation parameters are summarized in Table 3. For each simulation, we select the adaptive filter length, i.e. N = 64, 128 and 256. In Fig. 6, the original speech and white Gaussian noise (WGN) are used as the system inputs. In Fig. 7, we use the same speech signal with USASI noise as the second source. According to Figs. 6 and 7, all three algorithms converge toward the optimal solution for the different noise types and adaptive filter lengths. The figures also clearly show the faster convergence of the two proposed algorithms compared with the classical TC-SM-NLMS. These experiments prove that Proposed 1 is more efficient than the classical TC-SM-NLMS algorithm in terms of convergence speed toward the optimal solution, and that Proposed 2 converges even faster than the other two. This means that the proposed modification, which incorporates the correlation parameters in the variance normalization, has improved the behavior of Proposed 2.
Fig. 6

SM evaluation of TC-SM-NLMS, Proposed 1 and Proposed 2 for speech signal and white Gaussian noise (WGN) at source. \(\mu_{1} = \mu_{2} = 0.2\), \(\gamma_{1} = \gamma_{2} = 20\) and adaptive filter length \(N = 64 , 128 , 256\) (from left to right)

Fig. 7

SM evaluation of TC-SM-NLMS, Proposed 1 and Proposed 2 for speech signal and USASI noise at source. \(\mu_{1} = \mu_{2} = 0.4\), \(\gamma_{1} = \gamma_{2} = 80\) and adaptive filter length \(N = 64 , 128 , 256\) (from left to right)

This good performance remains the same even for large adaptive filter lengths. The good behavior of Proposed 2 stems from the new step-size formulas, which accelerate the algorithm when the input signal variance is low (during speech absence) and slow it down when the variance is high (during speech plus noise). This smart automatic mechanism leads to an ANC system with only noise at the reference signal, which is exactly why Proposed 2 has the best performance.

4.2 Segmental mean square error (SegMSE) criterion evaluation

The SegMSE is evaluated for the TC-SM-NLMS, Proposed 1, and Proposed 2 algorithms, and the obtained results are reported in this section. The MSE evaluation is done during the silence periods of the speech signal after the denoising process; we focus only on the output \(u_{1} \left( k \right)\). The segmental MSE criterion is computed by (20). We keep all simulation parameters given in Sect. 4.1 unchanged. In each experiment, we change the adaptive filter length, i.e. N = 64, 128 and 256. We have carried out two experiments according to the source signals. First, we use the speech signal shown in Fig. 5 and WGN as the source signal and noise, respectively, in the mixing model. The obtained results for the different adaptive filter lengths are presented in Fig. 8. In the second experiment, the noise source is USASI noise, and the results are reported in Fig. 9. We confirm again that the output speech signal obtained by Proposed 2 is more intelligible than the other results.
Fig. 8

MSE evaluation of TC-SM-NLMS, Proposed 1 and Proposed 2 for two source signals: the speech signal and the white Gaussian noise (WGN). The parameters simulations are: \(\mu_{1} = \mu_{2} = 0.2\), \(\gamma_{1} = \gamma_{2} = 20\); the adaptive filter lengths are \(N = 64 , 128 , 256\) (from left to right)

Fig. 9

MSE evaluation of TC-SM-NLMS, Proposed 1 and Proposed 2 for two source signals: the speech signal and the USASI noise. The parameters simulations are: \(\mu_{1} = \mu_{2} = 0.4\), \(\gamma_{1} = \gamma_{2} = 80\); and adaptive filter lengths are: \(N = 64 , 128 , 256\) (from left to right)

Based on Figs. 8 and 9, we confirm that the three algorithms converge to the optimal solution. However, we clearly see the better performance of Proposed 1 and 2 in comparison with the TC-SM-NLMS algorithm in terms of convergence speed. We also confirm that the modification introduced in the Proposed 2 algorithm improves on the MSE values of Proposed 1, which means that Proposed 2 is more efficient than the other algorithms and converges quickly to the smallest MSE values in the steady-state regime. This yields low distortion of the output speech signal (a more intelligible speech signal) and more noise reduction at the output.

4.3 Segmental signal to noise ratio (SegSNR) criterion evaluation

We have evaluated the SegSNR criterion for the following algorithms: TC-SM-NLMS, Proposed 1, and Proposed 2. We have used relation (21) to evaluate the SegSNR criterion. All simulation parameters are summarized in Table 3. The results of three experiments, with three adaptive filter lengths, i.e. N = 64, 128 and 256, are presented in this section. In all experiments, we have used two point noise types, i.e. white Gaussian noise in the first experiment and USASI noise in the second one; in both experiments the source signal is a phonetically balanced speech signal, shown in Fig. 5. The results of both experiments are reported in Figs. 10 and 11, respectively.
Fig. 10

SegSNR criterion evaluation of TC-SM-NLMS, Proposed 1 and Proposed 2 for two source signals: the speech signal and the white Gaussian noise (WGN). The parameters simulations of each algorithm are: \(\mu_{1} = \mu_{2} = 0.2\), \(\gamma_{1} = \gamma_{2} = 20\). The adaptive filter lengths are \(N = 64 , 128 , 256\) (from left to right)

Fig. 11

SegSNR criterion evaluation of TC-SM-NLMS, Proposed 1 and Proposed 2 for two source signals: the speech signal and the USASI noise. The parameters simulations of each algorithm are: \(\mu_{1} = \mu_{2} = 0.4\), \(\gamma_{1} = \gamma_{2} = 80\). The adaptive filter lengths are: \(N = 64 , 128 , 256\) (from left to right)

From the obtained results, we clearly see the superiority of both proposed algorithms (Proposed 1 and 2) in the transient phase in comparison with the classical partial TC-SM-NLMS algorithm. We also observe the best performance of the second proposed algorithm (Proposed 2) in all cases (N = 64, 128 and 256, with both noise types, i.e. white Gaussian in Fig. 10 and USASI in Fig. 11). We also confirm that the processed speech signal obtained by Proposed 2 is more intelligible than the results obtained with the other algorithms.

4.4 Performance quantification of the proposed algorithms

In this section, we have evaluated the convergence speed performance (CSP) of the proposed algorithms (i.e. Proposed 1 and Proposed 2) in comparison with the classical TC-SM-NLMS algorithm. We have quantified the CSP in terms of the number of iterations needed to reach the steady-state regime. Two experiments were carried out with the three algorithms: the input signal is USASI noise in the first experiment and a speech signal in the second one. The input SNR in both experiments is 3 dB. The maximal step-size value of each algorithm is 0.6. We have evaluated the CSP for threshold values equal to 20, 30, 40 and 50 (the best threshold values, selected experimentally). We have evaluated the percentage of iterations performed by each algorithm before it converges toward the optimal solution. The obtained results are given in Figs. 12 and 13.
Fig. 12

Percentage of performed iteration evaluation of TC-SM-NLMS, Proposed 1 and Proposed 2. The source signal is USASI noise. The parameters simulations of each algorithm are: \(\mu_{1} = \mu_{2} = 0.4\), \(\gamma_{1} = \gamma_{2} = 20, 30, 40, 50\), and the adaptive filter length is \(N = 256\)

Fig. 13

Percentage of performed iterations for TC-SM-NLMS, Proposed 1, and Proposed 2. The source signal is speech. The simulation parameters of each algorithm are: \(\mu_{1} = \mu_{2} = 0.4\), \(\gamma_{1} = \gamma_{2} = 20, 30, 40, 50\), and the adaptive filter length is \(N = 256\)

According to the results of Figs. 12 and 13, Proposed 2 clearly performs best among the compared algorithms. The same observation holds whether a speech signal or USASI noise is used as input. We have also noted that the threshold parameter must be selected experimentally; when it is set around 50, the convergence speed performance of Proposed 2 reaches its best values in both experiments.
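The data-selective update that produces these iteration percentages can be sketched with a single-channel set-membership NLMS filter. This is a simplified illustration, not the paper's two-channel SAD structure; the filter length, step size, and error bound below are assumptions chosen for the toy example:

```python
import numpy as np

def sm_nlms(x, d, N=32, mu=0.6, gamma=0.05, eps=1e-8):
    """Set-membership NLMS: the filter updates only when the a priori error
    exceeds the bound gamma, so iterations are skipped once the estimate is
    good enough. Returns the weights and the fraction of updated iterations."""
    w = np.zeros(N)
    updates = 0
    for n in range(N, len(x)):
        u = x[n - N + 1:n + 1][::-1]            # regressor, newest sample first
        e = d[n] - w @ u                        # a priori error
        if abs(e) > gamma:                      # data-selective (membership) test
            alpha = mu * (1.0 - gamma / abs(e))  # variable step size
            w += alpha * e * u / (u @ u + eps)
            updates += 1
    return w, updates / (len(x) - N)

# Identify a toy FIR system; once converged, most iterations are skipped
rng = np.random.default_rng(1)
h = rng.standard_normal(32) * np.exp(-0.2 * np.arange(32))  # hypothetical system
x = rng.standard_normal(20000)
d = np.convolve(x, h)[:len(x)] + 0.01 * rng.standard_normal(len(x))
w, frac = sm_nlms(x, d)
print(f"fraction of iterations that updated: {frac:.2f}")
```

The percentage of performed iterations in Figs. 12 and 13 corresponds to `frac` here; note that the error bound in this sketch is on a different numeric scale than the threshold values 20-50 used in the paper's experiments.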

5 Conclusion

In this paper, two new algorithms based on the set-membership principle are proposed. Both combine the advantages of the SAD algorithm with regularization techniques based on the cross-correlation between the output signals of the SAD structure and the mixing signals. Three criteria were used to evaluate the performance of the proposed algorithms against the original TC-SM-SAD algorithm: the system mismatch (SM), the output SegSNR, and the SegMSE. All three criteria confirm the good behavior of the proposed algorithms (Proposed 1 and Proposed 2), even when non-stationary signals such as speech are used at the input. Intensive experiments on both algorithms show, in terms of these criteria, the efficiency of Proposed 2 in comparison with Proposed 1 and the original TC-SM-SAD algorithm. The proposed combination of the variances of the output and mixing signals, used as a regularization factor, has proved effective in improving the behavior of the Proposed 2 algorithm. In conclusion, less distortion is obtained at the output speech signal when small step-size values of the TC-SM-SAD algorithm are chosen; however, the convergence speed then degrades. The new Proposed 1 and Proposed 2 algorithms improve the convergence speed and the distortion at the same time, even with low step-size values. We have also noted that the intelligibility of the processed speech signal is improved in all cases, especially with the Proposed 2 algorithm. For these reasons, we recommend these two proposed algorithms for speech enhancement and acoustic noise reduction applications.
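The variance-based regularization idea summarized above can be illustrated with a single-channel NLMS update whose regularization term is built from running variance estimates of the mixing (reference) signal and of the filter output. The plain sum of the two variances used below is a hypothetical combination chosen for illustration, not the paper's exact regularization relation:

```python
import numpy as np

def var_reg_nlms(x, d, N=16, mu=0.4, beta=0.99, eps=1e-8):
    """NLMS with a variance-based regularization term: the denominator is
    augmented by running variance estimates of the desired (mixing) signal
    and of the filter output, instead of a fixed constant."""
    w = np.zeros(N)
    var_mix = var_out = 0.0
    for n in range(N, len(x)):
        u = x[n - N + 1:n + 1][::-1]
        y = w @ u
        e = d[n] - y
        # exponentially weighted variance estimates
        var_mix = beta * var_mix + (1 - beta) * d[n] ** 2
        var_out = beta * var_out + (1 - beta) * y ** 2
        delta = var_mix + var_out   # hypothetical combination (illustration only)
        w += mu * e * u / (u @ u + delta + eps)
    return w

rng = np.random.default_rng(2)
h = np.array([1.0, -0.5, 0.25, -0.125] + [0.0] * 12)  # toy 16-tap system
x = rng.standard_normal(10000)
d = np.convolve(x, h)[:len(x)]
w = var_reg_nlms(x, d)
print(np.allclose(w, h, atol=1e-3))  # True: the filter identifies h
```

Making the regularization track the signal variances keeps the effective step size small when the signals are energetic and larger when they are weak, which is the kind of behavior that helps with non-stationary inputs such as speech.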


Acknowledgements

This study was carried out without funding. The authors are grateful to Blida University, Algeria, and to Professor Abderreak Guessoum, Professor at Blida University, Algeria, and Director of the Signal Processing and Imaging Laboratory (LATSI), for providing the infrastructure to carry out this work.

Compliance with ethical standards

Conflict of interest

On behalf of all authors, the corresponding author states that they have no conflict of interest.


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

Laboratory of Signal Processing and Imaging (LATSI), University of Blida 1, Blida, Algeria
