Abstract
The feature least-mean-square (FLMS) algorithm has already been introduced to exploit hidden sparsity in lowpass and highpass systems. In this paper, by proposing the extended FLMS (EFLMS) algorithm, we extend the FLMS algorithm to exploit hidden sparsity in more general systems, namely those which are neither lowpass nor highpass. To this end, by means of the so-called feature matrix, we reveal the hidden sparsity in the coefficients and utilize the \(l_1\)-norm to exploit the exposed sparsity. As a result, the EFLMS algorithm improves the convergence rate and the steady-state mean-squared error (MSE) as compared to the traditional least-mean-square algorithm. Moreover, in this work, we analyze the convergence behavior of the coefficient vector and the steady-state MSE performance of the EFLMS algorithm. Synthetic and real-world experiments show that the EFLMS algorithm improves the convergence rate and the steady-state MSE whenever the hidden sparsity is revealed.
Introduction
Adaptive filtering is an active field in signal processing and has found applications in several areas such as communications [30], robotics [27], sonar and radar [19], biomedical engineering [5, 23], vehicles [16, 38], and noise control [1, 24], to name just a few. Among the various adaptive filtering algorithms proposed in the last six decades, the LMS algorithm [32] has been used extensively due to its low computational cost, simple implementation, and inexpensive hardware requirements [8, 13, 28]. The LMS algorithm and its variants are widely utilized in real problems such as system identification [4, 14], signal prediction [34], echo cancellation [11, 25], adaptive array beamforming [26], and active noise control [1, 24].
Related works
In practice, we may have some a priori information about the coefficients of an unknown system to be estimated; however, the LMS algorithm estimates the coefficients of a linear system without exploiting any a priori knowledge about the unknown system. One of the most important pieces of prior information about the unknown system is its sparsity level. Therefore, in the last two decades, various machine learning and adaptive filtering algorithms have been proposed in order to exploit sparsity in systems so as to improve their convergence rate or their mean-squared error (MSE) [2, 17, 18, 22, 31, 35,36,37]. All of these algorithms deal with plain sparsity; however, there are many examples where sparsity is not explicit and visible in a system's coefficients, and some appropriate technique is needed to reveal and exploit hidden sparsity. Indeed, sometimes the sparsity lies in linear combinations of neighboring coefficients of the unknown system. This kind of sparsity is called hidden sparsity [9].
A system with hidden sparsity does not necessarily contain coefficients equal or close to zero; rather, linear combinations of its coefficients are equal or close to zero. Therefore, first the hidden sparsity must be revealed using a linear transformation; then, we can exploit the revealed sparsity using conventional approaches. Recently, the feature least-mean-square (FLMS) algorithms have been proposed to exploit hidden sparsity in systems with lowpass, highpass, and bandpass spectral content [9, 10, 33]. These algorithms utilize the feature matrix to expose the hidden sparsity in lowpass, highpass, and bandpass systems; however, they are not capable of exploiting hidden sparsity in systems with unknown spectral content.
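As a toy numerical illustration of this idea (the values are ours, not from the paper), consider a vector with no small entries whose first-order differences are almost all zero; a simple difference transform, a special case of a feature matrix, exposes the sparsity:

```python
import numpy as np

# A hypothetical system with hidden sparsity: no coefficient is near zero
w = np.array([0.9, 0.9, 0.9, 0.9, -0.5, -0.5, -0.5, -0.5])

# First-order difference matrix D (rows compute w[i] - w[i+1])
n = len(w)
D = np.eye(n - 1, n) - np.eye(n - 1, n, k=1)

transformed = D @ w
print(np.count_nonzero(np.abs(w) > 1e-3))            # 8 -> w itself is not sparse
print(np.count_nonzero(np.abs(transformed) > 1e-3))  # 1 -> D @ w is sparse
```

Only the single block boundary survives the transform; the hidden sparsity becomes plain sparsity in the transformed domain.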
Innovative contributions
In this paper, we propose the extended FLMS (EFLMS) algorithm by extending the applicability of the feature matrix so that it can detect similarities between adjacent coefficients of a system, independently of the system's spectral content. The EFLMS algorithm extends the use of the FLMS algorithm not only to lowpass, highpass, and bandpass systems but also to systems with other types of spectral correlation. To design the EFLMS algorithm, we compare the values and the signs of neighboring coefficients. If two adjacent coefficients have similar values (as in lowpass systems), then they are considered similar, and the EFLMS algorithm forces them to be equal. Also, if two adjacent coefficients have similar absolute values with opposite signs (as in highpass systems), then they are assumed similar, and the EFLMS algorithm compels them to have identical absolute values but opposite signs.
The conventional FLMS algorithms require prior knowledge about the spectral content of the unknown system [6, 9, 10, 33]. The EFLMS algorithm, however, does not require a priori information about the spectral content of the unknown system, which permits this algorithm to be applied in real-life scenarios. In the experimental results, the EFLMS algorithm achieves the best performance among the tested algorithms. In particular, in real-life problems, it performs remarkably better than the other tested algorithms due to its independence from the spectral content of the unknown system to be identified.
Organization
This paper is organized as follows. In Sect. 2, we review the objective function of the FLMS algorithm and propose the EFLMS algorithm. In this section, we also address the computational complexity of the EFLMS algorithm as compared to the LMS algorithm. Section 3 discusses the convergence behavior of the coefficient vector and the steady-state MSE performance of the EFLMS algorithm. Section 4 provides some synthetic and real-world numerical examples utilizing the EFLMS algorithm. For the synthetic data, the EFLMS algorithm is applied to system identification problems. Furthermore, the algorithm is used to identify room impulse responses (RIRs) in a real-world application. Finally, conclusions are drawn in Sect. 5.
Notations and acronyms: Scalars are represented by lowercase letters. Vectors (matrices) are denoted by lowercase (uppercase) boldface letters. The ith entry of a given vector \({\mathbf {z}}\) is denoted by \(z_i\). For a given iteration k, the weight vector and the input vector are denoted by \({\mathbf {w}}(k),{\mathbf {x}}(k)\in {\mathbb {R}}^{N+1}\), respectively, where N is the adaptive filter order. The a priori error signal at the kth iteration is defined as \(e(k)\triangleq d(k)-{\mathbf {w}}^\text {T}(k){\mathbf {x}}(k)\), where \(d(k)\in {\mathbb {R}}\) is the desired signal, and the superscript \((\cdot )^\text {T}\) stands for vector transposition. The \(l_1\)-norm and the \(l_2\)-norm of a vector \({\mathbf {w}}\in {\mathbb {R}}^{N+1}\) are given by \(\Vert {\mathbf {w}}\Vert _1=\sum _{i=0}^N|w_i|\) and \(\Vert {\mathbf {w}}\Vert ^2={\mathbf {w}}^\text {T}{\mathbf {w}}=\sum _{i=0}^Nw_i^2\), respectively. The sign function is denoted by \(\mathrm{sgn}\), and it is defined as
\[ \mathrm{sgn}(t)=\begin{cases}1 & \text{if } t>0,\\ 0 & \text{if } t=0,\\ -1 & \text{if } t<0.\end{cases} \quad (1) \]
Also, the employed acronyms in this paper are as follows:
\(\bullet \) LMS: least-mean-square
\(\bullet \) FLMS: feature LMS
\(\bullet \) EFLMS: extended FLMS
\(\bullet \) MSE: mean-squared error
\(\bullet \) RIR: room impulse response
\(\bullet \) EMSE: excess MSE
\(\bullet \) SNR: signal-to-noise ratio
\(\bullet \) FIR: finite impulse response
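As a quick sanity check on the notation above, the error and norm definitions can be evaluated numerically (the values of w, x, and d below are arbitrary illustrations, not from the paper):

```python
import numpy as np

w = np.array([0.5, -1.0, 2.0])   # adaptive filter coefficients w(k)
x = np.array([1.0, 0.0, -1.0])   # input vector x(k)
d = 0.25                         # desired signal d(k)

e = d - w @ x                    # a priori error e(k) = d(k) - w^T(k) x(k)
l1 = np.sum(np.abs(w))           # l1-norm: sum of |w_i|
l2_sq = w @ w                    # squared l2-norm: w^T w

print(e, l1, l2_sq)              # 1.75 3.5 5.25
```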
The Extended FLMS algorithm
The FLMS is an LMS-type algorithm designed to exploit the hidden sparsity in unknown systems with either low- or high-frequency content [9]. This algorithm utilizes the following objective function to update the filter coefficients,
\[ \xi (k)=\frac{1}{2}e^2(k)+\alpha \Vert {\mathbf {F}}(k){\mathbf {w}}(k)\Vert _1, \quad (2) \]
where \(\alpha \in {\mathbb {R}}_+\) is the weight given to the \(l_1\)-norm, and \({\mathbf {F}}(k)\) is the feature matrix. The role of this matrix is to transform the adaptive filter coefficients \({\mathbf {w}}(k)\) (which do not need to be sparse in their original domain) to a different domain where the transformed coefficients are sparse. In other words, the feature matrix \({\mathbf {F}}(k)\) maps the non-sparse vector \({\mathbf {w}}(k)\) to the sparse vector \({\mathbf {F}}(k){\mathbf {w}}(k)\). Then the exposed sparsity in the transformed domain (i.e., in \({\mathbf {F}}(k){\mathbf {w}}(k)\)) is exploited using the \(l_1\)-norm. Note that other candidates, acting as sparsity-promoting penalty functions, can be used to replace the \(l_1\)-norm [3, 12, 18, 29, 35, 36].
The fundamental advantages of the FLMS algorithm are attained thanks to the feature matrix, which plays the primary role in generating the appropriate FLMS algorithm. However, the original FLMS algorithm, with the feature matrix introduced in [9], requires strong a priori information about the unknown system. Therefore, it cannot suitably exploit the hidden sparsity possibly present in general systems, those which are neither lowpass nor highpass. In this section, we propose a novel version of the FLMS algorithm that is capable of exploiting hidden sparsity in a broader class of unknown systems.
In general, there might exist hidden relations between adjacent coefficients of unknown systems. In this case, a few coefficients may exhibit abrupt changes relative to their neighbors. If we exploit the relation between adjacent coefficients appropriately, the hidden sparsity in the unknown system is revealed and benefits the learning process. Indeed, if the vector \({\mathbf {w}}_*\) denotes the unknown system, we should construct a feature matrix \({\mathbf {F}}\) such that \({\mathbf {F}}{\mathbf {w}}_*\) is a sparse vector; then we can exploit this sparsity with the help of sparsity-promoting penalty functions. In order to avoid the requirement of a priori knowledge about the spectral content of the unknown system, we should define a feature matrix that seeks the relations between adjacent coefficients independently of the spectral content of the system.
A strong characteristic of the unknown coefficients is their energy or absolute value, and the feature matrix can focus on recognizing the similarity between the absolute values of adjacent coefficients. However, two adjacent coefficients with similar absolute values can have the same or opposite signs. Therefore, at a given iteration k, we introduce the feature matrix \({\mathbf {F}}(k)\in {\mathbb {R}}^{N\times (N+1)}\) as
\[ {\mathbf {F}}(k)=\begin{bmatrix} r^+_{\epsilon ,0}(k) & r^-_{\epsilon ,0}(k) & 0 & \cdots & 0\\ 0 & r^+_{\epsilon ,1}(k) & r^-_{\epsilon ,1}(k) & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & 0\\ 0 & \cdots & 0 & r^+_{\epsilon ,N-1}(k) & r^-_{\epsilon ,N-1}(k) \end{bmatrix}, \quad (3) \]
where \(r^+_{\epsilon ,i}(k)\triangleq g_\epsilon (v_i(k))+g_\epsilon (z_i(k))\), \(r^-_{\epsilon ,i}(k)\triangleq g_\epsilon (v_i(k))-g_\epsilon (z_i(k))\), \(v_i(k)\triangleq w_i(k)+w_{i+1}(k)\), and \(z_i(k)\triangleq w_i(k)-w_{i+1}(k)\), for \(i=0,\ldots ,N-1\). Moreover, the function \(g_\epsilon :{\mathbb {R}}\rightarrow {\mathbb {R}}\) is defined by^{Footnote 1}
\[ g_\epsilon (t)=\begin{cases}1 & \text{if } |t|<\epsilon ,\\ 0 & \text{otherwise,}\end{cases} \quad (4) \]
\(\epsilon \in {\mathbb {R}}_+\) being a small positive constant utilized to detect sharp transitions between the absolute values of adjacent coefficients.
To understand \({\mathbf {F}}(k)\) better, for a given iteration k, we discuss different possible scenarios in the following:

When two adjacent coefficients of \({\mathbf {w}}(k)\) with indexes i and \(i+1\) have close values (i.e., the absolute value of their difference is less than \(\epsilon \)), then \(|z_i(k)|<\epsilon \) and \(|v_i(k)|\ge \epsilon \); this results in \(g_\epsilon (z_i(k))=1\) and \(g_\epsilon (v_i(k))=0\). Thus, the ith row of \({\mathbf {F}}(k)\) contains 1 and \(-1\) on its ith and \((i+1)\)th columns, respectively, and the ith element of \({\mathbf {F}}(k){\mathbf {w}}(k)\), \(z_i(k)\), is close to zero since its absolute value is less than \(\epsilon \).

When two adjacent coefficients of \({\mathbf {w}}(k)\) with indexes i and \(i+1\) have close absolute values with opposite signs (i.e., the absolute value of their sum is less than \(\epsilon \)), then \(|z_i(k)|\ge \epsilon \) and \(|v_i(k)|<\epsilon \); this leads to \(g_\epsilon (z_i(k))=0\) and \(g_\epsilon (v_i(k))=1\). Hence, the ith row of \({\mathbf {F}}(k)\) contains 1 on both its ith and \((i+1)\)th columns, and the ith element of \({\mathbf {F}}(k){\mathbf {w}}(k)\), \(v_i(k)\), is close to zero since its absolute value is less than \(\epsilon \).

If there is no similarity between two adjacent coefficients of \({\mathbf {w}}(k)\) with indexes i and \(i+1\) (i.e., neither the absolute value of their difference nor that of their sum is less than \(\epsilon \)), then both \(|z_i(k)|\) and \(|v_i(k)|\) are at least \(\epsilon \); hence \(g_\epsilon (z_i(k))=g_\epsilon (v_i(k))=0\). Therefore, the ith row of \({\mathbf {F}}(k)\) is the null vector, and the ith element of \({\mathbf {F}}(k){\mathbf {w}}(k)\) is null.

When two adjacent coefficients of \({\mathbf {w}}(k)\) with indexes i and \(i+1\) are similar in both aspects, i.e., both their difference and their sum have absolute values less than \(\epsilon \), then we can easily conclude that \(|w_i(k)|\) and \(|w_{i+1}(k)|\) are less than \(\epsilon \). In this case, \(g_\epsilon (z_i(k))=g_\epsilon (v_i(k))=1\); thus, the ith row of \({\mathbf {F}}(k)\) contains the value 2 on its ith column and zeros on the other columns. Since \(|w_i(k)|<\epsilon \), the absolute value of the ith element of \({\mathbf {F}}(k){\mathbf {w}}(k)\), \(2w_i(k)\), is less than \(2\epsilon \); i.e., for small \(\epsilon \), it is close to zero.
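The four cases above can be sketched directly in code. The helper names below (`g_eps`, `feature_matrix`) are ours, and the construction follows the placement of \(r^+_{\epsilon ,i}(k)\) and \(r^-_{\epsilon ,i}(k)\) described in the text:

```python
import numpy as np

def g_eps(t, eps):
    """g_eps(t): 1 if |t| < eps, else 0."""
    return 1.0 if abs(t) < eps else 0.0

def feature_matrix(w, eps):
    """Build the N x (N+1) feature matrix F(k) from the adjacent-pair tests."""
    n = len(w)                                    # n = N + 1 coefficients
    F = np.zeros((n - 1, n))
    for i in range(n - 1):
        v = w[i] + w[i + 1]                       # v_i(k)
        z = w[i] - w[i + 1]                       # z_i(k)
        F[i, i] = g_eps(v, eps) + g_eps(z, eps)       # r+_{eps,i}(k)
        F[i, i + 1] = g_eps(v, eps) - g_eps(z, eps)   # r-_{eps,i}(k)
    return F

# One similar pair, one sign-flipped pair, one dissimilar pair:
w = np.array([1.0, 1.001, -1.0, 0.3])
F = feature_matrix(w, eps=0.01)
print(F)       # rows: [1, -1, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]
print(F @ w)   # approximately [-0.001, 0.001, 0] -- a sparse vector
```

Each row of F reproduces one of the scenarios: the difference row, the sum row, and the null row for the dissimilar pair.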
We can observe that, in all possible scenarios for the adjacent coefficients of \({\mathbf {w}}(k)\), the elements of the vector \({\mathbf {F}}(k){\mathbf {w}}(k)\) have absolute values smaller than \(2\epsilon \). Therefore, for small enough \(\epsilon >0\), \({\mathbf {F}}(k){\mathbf {w}}(k)\) is a sparse vector. Hence, we may now introduce the extended FLMS (EFLMS) algorithm by considering the objective function in (2) with the feature matrix in (3). Using the stochastic gradient descent method on the objective function in (2), the recursion rule of the EFLMS algorithm can be characterized by
\[ {\mathbf {w}}(k+1)={\mathbf {w}}(k)+\mu e(k){\mathbf {x}}(k)-\mu \alpha {\mathbf {p}}(k), \quad (5) \]
where \(\mu \in {\mathbb {R}}_+\) is the step-size parameter and should be chosen small enough to guarantee convergence [8]. Moreover, \({\mathbf {p}}(k)\in {\mathbb {R}}^{N+1}\) is the gradient of \(\Vert {\mathbf {F}}(k){\mathbf {w}}(k)\Vert _1\) with respect to \({\mathbf {w}}(k)\), when \({\mathbf {F}}(k)\) is given by (3). In this case, the expression for \({\mathbf {p}}(k)\) is described by (6), and it is worth mentioning that the computation of \({\mathbf {p}}(k)\) is inexpensive since it involves no multiplication or division.
Finally, the EFLMS algorithm is summarized in Table 1. When there is a similarity between two adjacent coefficients of \({\mathbf {w}}(k)\), the EFLMS algorithm exploits the hidden sparsity and enforces these coefficients to have similar absolute values. However, when there is no similarity between adjacent coefficients, the corresponding row of \({\mathbf {F}}(k)\) is the null vector, and the EFLMS algorithm does not force these coefficients to be similar; the algorithm learns these coefficients in the same way as the LMS algorithm does.
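A single EFLMS-style iteration can then be sketched as follows (our own scaffolding, with the subgradient of \(\Vert {\mathbf {F}}(k){\mathbf {w}}(k)\Vert _1\) computed, for a fixed \({\mathbf {F}}(k)\), as \({\mathbf {F}}^\text {T}(k)\,\mathrm{sgn}({\mathbf {F}}(k){\mathbf {w}}(k))\)):

```python
import numpy as np

def feature_matrix(w, eps):
    """N x (N+1) feature matrix F(k), following the four cases discussed above."""
    n = len(w)
    F = np.zeros((n - 1, n))
    for i in range(n - 1):
        gv = 1.0 if abs(w[i] + w[i + 1]) < eps else 0.0   # g_eps(v_i(k))
        gz = 1.0 if abs(w[i] - w[i + 1]) < eps else 0.0   # g_eps(z_i(k))
        F[i, i], F[i, i + 1] = gv + gz, gv - gz
    return F

def eflms_step(w, x, d, mu, alpha, eps):
    """One update: w <- w + mu*e*x - mu*alpha*p, p = subgradient of ||F w||_1."""
    e = d - w @ x                    # a priori error e(k)
    F = feature_matrix(w, eps)
    p = F.T @ np.sign(F @ w)         # entries are sums of a few sign terms
    return w + mu * e * x - mu * alpha * p, e
```

When no adjacent pair is similar, F(k) is the zero matrix, p(k) vanishes, and the step reduces exactly to an LMS update, matching the discussion above.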
Remark 1
The value of \(\epsilon \) should be small enough to detect only reasonable similarities between neighboring coefficients and to result in a high sparsity level for \({\mathbf {F}}(k){\mathbf {w}}(k)\). In real-life examples, we have empirically found that \(0.001\le \epsilon \le 0.005\) is an appropriate range.
Remark 2
The computational complexity of the EFLMS algorithm is slightly higher than that of the LMS algorithm. In fact, at a given iteration, the EFLMS and LMS algorithms need a similar number of multiplications since the multiplications in Eq. (6) are trivial. The only exception is when adjacent coefficients are similar in both aspects, i.e., the absolute values of their difference and of their sum are both less than \(\epsilon \), which means that the system has plain sparsity. In this case, there are many other techniques in the literature to exploit plain sparsity [2, 18, 22, 35, 36], whereas in this work our target is to exploit hidden sparsity. Hence, in systems containing hidden sparsity, similarity in both aspects between adjacent coefficients is assumed unusual, and the number of multiplications required by the EFLMS algorithm is similar to, or slightly larger than, that of the LMS algorithm.
Moreover, as compared to the LMS algorithm, the EFLMS algorithm requires at most 6N more additions/subtractions per iteration. In practice, however, the difference in the number of additions/subtractions between the EFLMS and LMS algorithms is much smaller than 6N per iteration. Also, at each iteration, the EFLMS algorithm needs 6N comparison operations to implement Eq. (6).
Some Properties of the EFLMS algorithm
In this section, we analyze the convergence behavior of the coefficient vector and the steady-state mean-squared error performance of the EFLMS algorithm.
Convergence Behavior of the Coefficient Vector
Suppose that the coefficient vector of an unknown FIR filter is denoted by \({\mathbf {w}}_*\). Let us identify this unknown system by an adaptive filter of the same order using the EFLMS algorithm. Also, assume that a zero-mean white measurement noise n(k) with variance \(\sigma _n^2\), statistically independent of the input signal x(k), is added to the output of the filter. Moreover, let us denote by \({\widetilde{{\mathbf {w}}}}(k)\) the difference between the adaptive filter coefficients \({\mathbf {w}}(k)\) and the unknown system \({\mathbf {w}}_*\); i.e., \({\widetilde{{\mathbf {w}}}}(k)\triangleq {\mathbf {w}}_*-{\mathbf {w}}(k)\). Therefore, we have
\[ e(k)={\widetilde{{\mathbf {w}}}}^\text {T}(k){\mathbf {x}}(k)+n(k). \quad (7) \]
Subtracting both sides of the update rule in (5) from \({\mathbf {w}}_*\) and utilizing (7), we obtain
\[ {\widetilde{{\mathbf {w}}}}(k+1)=\left( {\mathbf {I}}_{N+1}-\mu {\mathbf {x}}(k){\mathbf {x}}^\text {T}(k)\right) {\widetilde{{\mathbf {w}}}}(k)-\mu n(k){\mathbf {x}}(k)+\mu \alpha {\mathbf {p}}(k), \quad (8) \]
where \({\mathbf {I}}_{N+1}\) is the \((N+1)\times (N+1)\) identity matrix.
Assume that n(k), \({\mathbf {x}}(k)\), and \({\widetilde{{\mathbf {w}}}}(k)\) are statistically independent; thus, by taking the expected value of both sides of the above equation, we obtain
\[ E[{\widetilde{{\mathbf {w}}}}(k+1)]=\left( {\mathbf {I}}_{N+1}-\mu {\mathbf {R}}\right) E[{\widetilde{{\mathbf {w}}}}(k)]+\mu \alpha E[{\mathbf {p}}(k)], \quad (9) \]
where \({\mathbf {R}}\triangleq E[{\mathbf {x}}(k){\mathbf {x}}^\text {T}(k)]\) denotes the input autocorrelation matrix. From Eq. (6), we observe that the elements of \({\mathbf {p}}(k)\) are sums of at most four sign functions. It implies that
\[ -4\cdot \mathbf{1}\preceq {\mathbf {p}}(k)\preceq 4\cdot \mathbf{1}, \quad (10) \]
where \(\mathbf{1}=[1~1~\ldots ~1]^\text {T}\), and \(\preceq \) stands for the componentwise inequality. Therefore, \(\mu \alpha E[{\mathbf {p}}(k)]\) is bounded and, by taking \(0<\mu <\frac{2}{\lambda _\text {max}}\) [7, 8], \(E[{\widetilde{{\mathbf {w}}}}(k+1)]\) converges, \(\lambda _\text {max}\) being the largest eigenvalue of \({\mathbf {R}}\). Hence, by choosing \(0<\mu <\frac{2}{\lambda _\text {max}}\) and letting \(k\rightarrow \infty \) in Eq. (9), we have
\[ {\mathbf {R}}\,E[{\widetilde{{\mathbf {w}}}}(\infty )]=\alpha E[{\mathbf {p}}(\infty )]. \quad (11) \]
Assuming \({\mathbf {R}}\) is invertible, we obtain
\[ E[{\mathbf {w}}(\infty )]={\mathbf {w}}_*-\alpha {\mathbf {R}}^{-1}E[{\mathbf {p}}(\infty )]. \quad (12) \]
This equation states that, by adopting a proper step-size parameter, the EFLMS algorithm will never diverge. Finally, the validity of Eq. (12) is verified by numerical simulations in Sect. 4.1.
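The bound \(0<\mu <\frac{2}{\lambda _\text {max}}\) is easy to check numerically. For a zero-mean, unit-variance white input, \({\mathbf {R}}\approx {\mathbf {I}}\), so \(\lambda _\text {max}\approx 1\) and the bound is close to 2 (a sketch with our own estimation scaffolding):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100000)          # zero-mean, unit-variance white input

# Delay-line input vectors for a short filter, and the sample autocorrelation
N1 = 4                                   # N + 1 = 4 taps
X = np.lib.stride_tricks.sliding_window_view(x, N1)
R = X.T @ X / X.shape[0]

lam_max = np.linalg.eigvalsh(R).max()    # close to 1 for this input
mu_bound = 2.0 / lam_max                 # step-size upper bound 2 / lambda_max
print(lam_max, mu_bound)
```

For colored inputs, \(\lambda _\text {max}\) grows with the input correlation, tightening the admissible step-size range.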
Steady-State MSE Performance
To analyze the steady-state MSE performance of the EFLMS algorithm, let us subtract both sides of the update equation in (5) from \({\mathbf {w}}_*\); then we get
\[ {\widetilde{{\mathbf {w}}}}(k+1)={\widetilde{{\mathbf {w}}}}(k)-\mu e(k){\mathbf {x}}(k)+\mu \alpha {\mathbf {p}}(k). \quad (13) \]
From Eq. (7), we have
\[ e(k)={\widetilde{e}}(k)+n(k), \quad (14) \]
where \({\widetilde{e}}(k)={\widetilde{{\mathbf {w}}}}^\text {T}(k){\mathbf {x}}(k)\) is the noiseless error. Thus, by multiplying each side of (13) by its transpose, we attain
Since \(\mu \) and \(\alpha \) are small constants, we can neglect the terms multiplied by \(\mu ^2\alpha \) and \(\mu ^2\alpha ^2\). Hence, using (14), we obtain
Moreover, in the steady state, we can assume that n(k), \({\widetilde{e}}(k)\), and x(k) are statistically independent. Thus, by taking the expected value of the above equation, we get
where \(\mathrm{tr}({\mathbf {R}})\) denotes the trace of \({\mathbf {R}}\). When the EFLMS algorithm reaches the steady state, we can assume that \(E[\Vert {\widetilde{{\mathbf {w}}}}(k+1)\Vert ^2]=E[\Vert {\widetilde{{\mathbf {w}}}}(k)\Vert ^2]\). Therefore, we get
Hence, the steady-state excess MSE (EMSE) of the EFLMS algorithm is given by
\[ \text {EMSE}\triangleq E[{\widetilde{e}}^2(\infty )]=\frac{\mu \sigma _n^2\,\mathrm{tr}({\mathbf {R}})+2\alpha E[{\mathbf {p}}^\text {T}(\infty ){\widetilde{{\mathbf {w}}}}(\infty )]}{2-\mu \,\mathrm{tr}({\mathbf {R}})}. \quad (19) \]
Experimental Results
In this section, we provide some experimental results to verify the performance of the EFLMS algorithm in comparison with a few other algorithms. To this end, we present simulation experiments in Sect. 4.1 and real-life experiments in Sects. 4.2 and 4.3.
Simulation Experiments
In this subsection, we utilize the LMS, the FLMS [9], and the proposed EFLMS algorithms in system identification problems. The FLMS algorithm requires a priori information about the spectral content of the unknown system; thus, we use the FLMS algorithm designed for lowpass systems [9]. The input signal is a zero-mean white Gaussian noise with unit variance, and all algorithms are initialized with the null vector. The parameter \(\alpha \) is set to 0.05. For each scenario, the step-size parameter is indicated in the figures. The signal-to-noise ratio (SNR) is set to 20 dB, and the MSE learning curves are computed by averaging the outcomes of 200 independent trials.
The first unknown system, \({\mathbf {w}}_*\), is of order 39, i.e., it contains 40 coefficients, and its impulse response is presented in Fig. 1a. It is comprised of eight blocks of equal length, where the adjacent coefficients inside four of the blocks have similar values, whereas in the other four blocks the adjacent coefficients have similar absolute values with opposite signs. The parameter \(\epsilon \) in the EFLMS algorithm is chosen as 0.03. Indeed, in this synthetic example, since we know the system in advance, we adopted the best value for \(\epsilon \). Figure 1b shows the MSE learning curves of the tested algorithms. As can be seen, choosing \(\mu =0.03\), the EFLMS algorithm has the highest convergence rate and the lowest misadjustment, followed by the LMS and the FLMS algorithms. We can observe that the EFLMS algorithm outperforms the FLMS algorithm since, differently from the FLMS algorithm, it can exploit the similarity between all adjacent coefficients independently of the spectral content of the system. Indeed, the FLMS algorithm has a poor performance since it uses a time-invariant feature matrix which enforces adjacent coefficients with no similarity to have identical values. Furthermore, when the LMS algorithm adopts a small step-size, 0.01, to attain the same MSE as that of the EFLMS algorithm, its convergence speed degrades significantly. It is worth mentioning that, in Fig. 1b, the learning curves start decreasing for \(k>N\) since, in the system identification scenarios, the input signal vector is composed of a delay line with a single input signal, i.e., \({\mathbf {x}}(k)=[x(k)~x(k-1)~\ldots ~x(k-N)]^\text {T}\) and, for \(-N\le i\le -1\), \(x(i)=0\).
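A compact, self-contained version of this kind of experiment is sketched below. The block-structured system, the step-size, and the run length are our own choices for a quick, stable run (in particular, a smaller step-size than in the figure), so the numbers only illustrate the setup rather than reproduce Fig. 1b:

```python
import numpy as np

def feature_matrix(w, eps):
    """N x (N+1) feature matrix detecting similar adjacent coefficients."""
    n = len(w)
    F = np.zeros((n - 1, n))
    for i in range(n - 1):
        gv = 1.0 if abs(w[i] + w[i + 1]) < eps else 0.0
        gz = 1.0 if abs(w[i] - w[i + 1]) < eps else 0.0
        F[i, i], F[i, i + 1] = gv + gz, gv - gz
    return F

rng = np.random.default_rng(1)

# Block-structured unknown system: 8 blocks of 5 taps each; even blocks hold
# equal values (lowpass-like), odd blocks alternate signs (highpass-like).
# The coefficient values are made up, not taken from the paper.
blocks = []
for b in range(8):
    level = rng.uniform(0.3, 1.0)
    if b % 2 == 0:
        blocks.append(np.full(5, level))
    else:
        blocks.append(level * (-1.0) ** np.arange(5))
w_star = np.concatenate(blocks)                  # 40 coefficients, order N = 39

mu, alpha, eps, sigma_n = 0.01, 0.05, 0.03, 0.1
n_iter = 3000
w_lms = np.zeros_like(w_star)
w_ef = np.zeros_like(w_star)
mse_lms = np.zeros(n_iter)
mse_ef = np.zeros(n_iter)

x_line = rng.standard_normal(n_iter + len(w_star))
for k in range(n_iter):
    x = x_line[k:k + len(w_star)][::-1]          # delay-line input vector
    d = w_star @ x + sigma_n * rng.standard_normal()
    e_l = d - w_lms @ x                          # LMS update
    w_lms = w_lms + mu * e_l * x
    e_f = d - w_ef @ x                           # EFLMS update
    F = feature_matrix(w_ef, eps)
    p = F.T @ np.sign(F @ w_ef)
    w_ef = w_ef + mu * e_f * x - mu * alpha * p
    mse_lms[k], mse_ef[k] = e_l ** 2, e_f ** 2

print(mse_lms[-500:].mean(), mse_ef[-500:].mean())
```

Both filters converge to the noise floor; the block structure of `w_star` is exactly the situation where the feature matrix rows are active.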
The second unknown system, \({\mathbf {w}}_*'\), is a bandpass system of order 199 (i.e., 200 coefficients) with central frequency \(\frac{\pi }{4}\), lower and upper cutoff frequencies at \(\frac{\pi }{4}-0.2\) and \(\frac{\pi }{4}+0.2\), respectively, and lower and upper transition frequencies at \(\frac{\pi }{4}-0.3\) and \(\frac{\pi }{4}+0.3\), respectively. The impulse response of \({\mathbf {w}}_*'\) is depicted in Fig. 2a. In this scenario, for the FLMS algorithm, we adopt the feature matrix for lowpass systems. The EFLMS algorithm, in contrast, utilizes the feature matrix proposed in Eq. (3), and it does not require a priori information about the spectral content of the unknown system. The step-size parameter is chosen as 0.007 for all tested algorithms, and \(\epsilon =0.005\) in the EFLMS algorithm. In this example, the MSE learning curves are the average of the outcomes of 1000 independent runs. Figure 2b illustrates the MSE learning curves of the LMS, the FLMS, and the EFLMS algorithms. As can be seen, the EFLMS algorithm has the highest convergence rate and the lowest misadjustment, followed by the LMS and the FLMS algorithms. Again, the learning curves start decreasing for \(k>N\) since the input signal vector is adopted as \({\mathbf {x}}(k)=[x(k)~x(k-1)~\ldots ~x(k-N)]^\text {T}\) and, for \(-N\le i\le -1\), \(x(i)=0\).
Moreover, Fig. 3 presents the MSE learning curves of the LMS, the FLMS, and the EFLMS algorithms for a time-varying system. In the first 1500 iterations, the bandpass system \({\mathbf {w}}'_*\) is considered as the unknown system and, at iteration 1501, the unknown system is changed to a system of order 199 containing four lowpass blocks and four highpass blocks of the same length. For all tested algorithms, the step-size parameter is selected as 0.007, and \(\epsilon =0.005\) in the EFLMS algorithm. As can be observed, the EFLMS algorithm tracks the abrupt change in the system and attains the lowest MSE among the tested algorithms.
The fourth unknown system, \({\mathbf {w}}_*''\), is of order 39, and its coefficients are drawn independently from a zero-mean Gaussian distribution with unit variance. In this case, there is no relation between adjacent coefficients. The step-size parameter is adopted as 0.03 for all employed algorithms and \(\epsilon =0.005\) for the EFLMS algorithm. Figure 4 shows the MSE learning curves of the LMS, the FLMS, and the EFLMS algorithms considering \({\mathbf {w}}_*''\). As can be seen, when there is no relation between adjacent coefficients, the performances of the LMS and the EFLMS algorithms are almost identical; however, the FLMS algorithm has a significantly higher steady-state MSE. This example shows that, when there is no relation between the adjacent coefficients of a system, by adopting a proper \(\epsilon \), the EFLMS algorithm matches the LMS algorithm. However, when there is some relation between the adjacent coefficients, as in the previous examples, the EFLMS algorithm outperforms the LMS algorithm.
Furthermore, to corroborate the validity of the analysis proposed in Sect. 3.1, the Euclidean norms of both sides of Eq. (12) are presented in Fig. 5, when \(\epsilon =0.03\) and \({\mathbf {w}}_*\), shown in Fig. 1a, is adopted as the unknown system. To obtain the results in Fig. 5, for different step-size parameters \(0.003\le \mu \le 0.035\), the EFLMS algorithm has been run for 6000 iterations, and the averages of \({\mathbf {w}}(k)-{\mathbf {w}}_*\) and \(\alpha {\mathbf {R}}^{-1}{\mathbf {p}}(k)\) over 1000 independent runs have been computed to produce the ensemble averages. Then the time averages over the last 4000 iterations have been calculated. Finally, for \(0.003\le \mu \le 0.035\), the Euclidean norms of these values are illustrated in Fig. 5. As can be seen, this figure substantiates the validity of Eq. (12). Moreover, we can observe that, for the different step-sizes, the blue curve (circles), \(\Vert E[{\mathbf {w}}(\infty )-{\mathbf {w}}_*]\Vert \), is extremely close to zero; this means that the EFLMS algorithm has converged to the optimal solution \({\mathbf {w}}_*\). For instance, when \(\mu =0.03\), the values of \(w_i(k)\), for \(i=0,5,10,15,20,25,30,35\), are presented in Fig. 6. We can see that all these coefficients have converged to the optimal coefficients after 600 iterations.
Also, Fig. 7 shows the theoretical and the experimental steady-state EMSE of the EFLMS algorithm, for \(0.003\le \mu \le 0.035\), when \(\epsilon =0.03\) and \({\mathbf {w}}_*\) is chosen as the unknown system. The theoretical steady-state EMSE is computed by Eq. (19). To acquire the experimental EMSE, for \(0.003\le \mu \le 0.035\), we have executed the EFLMS algorithm for 6000 iterations, and we have calculated the average of \({\widetilde{e}}^2(k)\) over 1000 independent trials to generate the ensemble averages. Finally, the time average over the last 4000 iterations has been computed. As can be observed in Fig. 7, there is an excellent match between the theoretical and experimental results.
Room Impulse Response Identification
As a real-life experiment, we have used the LMS and the EFLMS algorithms to identify the measured room impulse response (RIR) for three microphones in a room receiving the sound generated by a loudspeaker. The microphones are B&K condenser microphones type 4165, and the loudspeaker is a Yamaha MSP3. This experiment has been carried out in a room with dimensions and shape as depicted in Fig. 8, where we can observe the positions of the loudspeaker and of the microphones. The microphones were located: (1) in the middle of the room, (2) on one side of the room next to the wall, and (3) in the corner of the room. After being estimated offline, each RIR was utilized as the unknown system in a system identification problem using the LMS and the EFLMS algorithms. The sampling frequency is set to 8 kHz. The input signal is a zero-mean white Gaussian noise with unit variance, and the measurement noise has a zero-mean Gaussian distribution with variance 0.01. In this experiment, the value of \(\alpha \) is adopted as 0.02, and \(\epsilon =0.005\) for the EFLMS algorithm. The step-size parameter \(\mu \) for these algorithms is indicated in Fig. 10. Both algorithms are initialized with the null vector. For microphones 1 and 2, the order of the adaptive filter is 3800, while for microphone 3, it is 4000. The learning curves are the average of the outcomes of 100 independent runs.
The impulse responses for microphones 1, 2, and 3 are illustrated in Fig. 9a–c, respectively. Figure 10a–c illustrates the MSE learning curves of the LMS and the EFLMS algorithms when they are applied to identify the RIR captured by microphones 1, 2, and 3, respectively. As can be seen, for all microphones, the EFLMS algorithm has lower misadjustment and higher convergence speed as compared to the LMS algorithm. Thus, this experiment highlights the benefits provided by the EFLMS algorithm in a scenario where the system to be identified does not have any particular shape (lowpass or highpass, for instance). Moreover, the EFLMS algorithm did not require significant computational resources in comparison with the LMS algorithm. As an example, when identifying the RIR of order \(N=3800\) with microphone 1, the EFLMS algorithm executed 222 more multiplication operations and 9509 (\(\ll 6N\)) more addition/subtraction operations per iteration than the LMS algorithm. Furthermore, the EFLMS algorithm implemented \(6N=22{,}800\) comparison operations per iteration.
Underwater Channel Estimation
In this scenario, a measured channel impulse response for a shallow-water receiver is employed in a system identification problem. The details of the measured channel impulse response are available in [15]. We estimate the impulse response of an underwater communication channel using the LMS and the EFLMS algorithms. Four different input signals are adopted: (i) a zero-mean white Gaussian noise (WGN) with unit variance; (ii) a first-order moving average (MA) process generated by \(x(k)=m(k)+0.75m(k-1)\), where m(k) is a zero-mean WGN with unit variance; (iii) a first-order autoregressive (AR) process produced by \(x(k)=0.75x(k-1)+m(k)\); (iv) a first-order autoregressive moving average (ARMA) process generated by \(x(k)=0.75x(k-1)+0.75m(k-1)+m(k)\). Moreover, all input signals are normalized so that they have zero mean and unit variance. The measurement noise signal has a zero-mean white Gaussian distribution with variance 0.1; i.e., the SNR was set to 10 dB. In both algorithms, the adaptive filter order is 2880, \({\mathbf {w}}(0)\) is the null vector, and the step-size parameter is chosen as 0.0005. In the EFLMS algorithm, the values of \(\alpha \) and \(\epsilon \) were set to 0.02 and 0.005, respectively. Moreover, the learning curves are obtained by averaging the outcomes of 100 independent trials.
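The four input processes can be generated as follows (a sketch; the recursions match the definitions above, with zero initial conditions assumed):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000
m = rng.standard_normal(n)           # zero-mean WGN driving noise, unit variance

x_wgn = m.copy()                     # (i) white Gaussian noise

x_ma = np.zeros(n)                   # (ii) MA: x(k) = m(k) + 0.75 m(k-1)
x_ma[0] = m[0]
x_ma[1:] = m[1:] + 0.75 * m[:-1]

x_ar = np.zeros(n)                   # (iii) AR: x(k) = 0.75 x(k-1) + m(k)
for k in range(n):
    x_ar[k] = 0.75 * (x_ar[k - 1] if k > 0 else 0.0) + m[k]

x_arma = np.zeros(n)                 # (iv) ARMA: x(k) = 0.75 x(k-1) + 0.75 m(k-1) + m(k)
for k in range(n):
    prev_x = x_arma[k - 1] if k > 0 else 0.0
    prev_m = m[k - 1] if k > 0 else 0.0
    x_arma[k] = 0.75 * prev_x + 0.75 * prev_m + m[k]

# Normalize each input to zero mean and unit variance, as stated in the text
inputs = {}
for name, sig in [("WGN", x_wgn), ("MA", x_ma), ("AR", x_ar), ("ARMA", x_arma)]:
    inputs[name] = (sig - sig.mean()) / sig.std()
```

The AR and ARMA loops could also be written with `scipy.signal.lfilter`; the explicit recursions are kept here to mirror the equations.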
Figure 11 depicts the MSE learning curves of the LMS and the EFLMS algorithms when they are employed to identify the underwater channel impulse response. The MSE learning curves of these algorithms are illustrated in Fig. 11a–d for the WGN, MA, AR, and ARMA input signals, respectively. As can be observed in Fig. 11a–c, the EFLMS algorithm outperforms the LMS algorithm, reaching a significantly lower steady-state MSE with a slightly higher convergence rate. Also, even for the ARMA input signal, as observed in Fig. 11d, the EFLMS algorithm performs somewhat better than the LMS algorithm. Thus, for all adopted input signals, the EFLMS algorithm is capable of exposing and exploiting the sparsity hidden in the underwater channel impulse response without any a priori knowledge about the spectral content of the system.
Conclusion
In this paper, we have proposed the EFLMS algorithm in order to exploit the hidden sparsity in a broader class of unknown systems. For this purpose, we have introduced a proper feature matrix so that, by multiplying the system coefficients by the feature matrix, we obtain a sparse vector. Once the hidden sparsity is revealed, it is exploited using the \(l_1\)-norm. Indeed, at each iteration, the difference and the sum of each pair of adjacent coefficients are computed; if the magnitude of either value is less than a predetermined threshold \(\epsilon >0\), the algorithm deems these coefficients similar and exploits this property in the learning process. Moreover, we have analyzed the convergence behavior of the coefficient vector of the EFLMS algorithm and have introduced an upper bound for the step-size parameter. The synthetic numerical results and the real-life simulations have highlighted the advantages provided by the EFLMS algorithm.
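The mechanism described above can be illustrated with a short sketch: an \(l_1\)-type subgradient is applied to the difference and the sum of each pair of adjacent coefficients, but only where the magnitude falls below \(\epsilon \). The update form, the names, and the default parameter values below are illustrative assumptions; the paper's exact feature matrix and discard function are not reproduced here.

```python
import numpy as np

def feature_penalty_gradient(w, eps):
    """Subgradient of the l1-norm applied to the difference and the sum of
    adjacent coefficients, active only where |difference| or |sum| is below
    eps, i.e., where hidden sparsity is deemed revealed (illustrative)."""
    g = np.zeros_like(w)
    for i in range(len(w) - 1):
        diff = w[i] - w[i + 1]
        if 0 < abs(diff) < eps:            # adjacent coefficients are similar
            g[i]     += np.sign(diff)      # d|w_i - w_{i+1}| / dw_i
            g[i + 1] -= np.sign(diff)
        s = w[i] + w[i + 1]
        if 0 < abs(s) < eps:               # adjacent coefficients nearly cancel
            g[i]     += np.sign(s)
            g[i + 1] += np.sign(s)
    return g

def eflms_update(w, x_vec, d, mu=5e-4, alpha=0.02, eps=0.005):
    """One EFLMS-style iteration: an LMS gradient step plus the
    sparsity-promoting feature term (illustrative form)."""
    e = d - w @ x_vec                      # a priori error
    w_new = w + mu * e * x_vec - mu * alpha * feature_penalty_gradient(w, eps)
    return w_new, e
```

Note that the penalty is active only inside the \(\epsilon \) band, echoing the discard-style behavior; when no pair of adjacent coefficients qualifies, the update reduces exactly to the plain LMS recursion.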
The EFLMS algorithm imposes a slightly higher computational burden than the LMS algorithm; however, it can attain lower misadjustment and a higher convergence rate for a wider class of systems than those where the FLMS algorithm is applicable. In comparison with the FLMS algorithm, the EFLMS does not need a priori information about the spectral content of the system to exploit hidden sparsity, whereas for FLMS algorithms this prior knowledge is indispensable. Also, note that the parameter \(\epsilon \) in the EFLMS algorithm is a key constant that should be chosen carefully to avoid poor performance; thus, we recommend underestimating this parameter to prevent a bad selection. In future works, data-driven approaches can be utilized to learn this parameter from data. Moreover, as a potential future research direction, hidden sparsity can be exploited in state-of-the-art artificial intelligence applications beyond the classical problems considered here. As an example, the idea of hidden sparsity can be extended to nonlinear and Hammerstein adaptive filters [20, 21].
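One simple data-driven way to pick \(\epsilon \), in the spirit of the future work mentioned above, is a hold-out grid search: train once per candidate value and keep the one with the lowest validation MSE. The helper below is generic over any trainer; all names and the candidate grid are our own illustrative choices, not part of the EFLMS algorithm itself.

```python
import numpy as np

def pick_epsilon(run_filter, candidates, x_train, d_train, x_val, d_val):
    """Grid-search over candidate eps values.

    `run_filter(x_train, d_train, eps)` is any training routine returning a
    weight vector -- here a stand-in for an EFLMS training loop. The eps
    yielding the lowest validation mean-squared error is returned."""
    best_eps, best_mse = None, np.inf
    for eps in candidates:
        w = run_filter(x_train, d_train, eps)
        e = d_val - x_val @ w              # validation a priori errors
        mse = float(np.mean(e ** 2))
        if mse < best_mse:
            best_eps, best_mse = eps, mse
    return best_eps, best_mse
```

A logarithmic grid (e.g., 0.001 to 0.05) is a reasonable starting point given the recommendation above to err toward small values of \(\epsilon \).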
Notes
 1.
This function was inspired by the discard function proposed in [36].
References
 1.
M.S. Aslam, M.A.Z. Raja, A new adaptive strategy to improve online secondary path modeling in active noise control systems using fractional signal processing approach. Signal Process. 107, 433–443 (2015)
 2.
J. Benesty, S.L. Gay, An improved PNLMS algorithm, in Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (Dallas, USA, May, 2002), pp. 1881–1884
 3.
Z.A. Bhotto, A. Antoniou, A family of shrinkage adaptive-filtering algorithms. IEEE Trans. Signal Process. 61, 1689–1697 (2013)
 4.
N.I. Chaudhary, S. Zubair, M.S. Aslam et al., Design of momentum fractional LMS for Hammerstein nonlinear system identification with application to electrically stimulated muscle model. Eur. Phys. J. Plus 134, 407 (2019)
 5.
N.I. Chaudhary, S. Zubair, M.A.Z. Raja et al., Normalized fractional adaptive methods for nonlinear control autoregressive systems. Appl. Math. Model. 66, 457–471 (2019)
 6.
G.S. Chaves, M.V.S. Lima, H. Yazdanpanah, P.S.R. Diniz, T.N. Ferreira, A simple sparsity-aware feature LMS algorithm, in Proceedings 27th European Signal Processing Conference (EUSIPCO) (A Coruna, Spain, Sept., 2019), pp. 1–5
 7.
Y. Chen, Y. Gu, A.O. Hero, Sparse LMS for system identification, in Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (Taipei, Taiwan, Apr., 2009), pp. 3125–3128
 8.
P.S.R. Diniz, Adaptive Filtering: Algorithms and Practical Implementation, 4th edn. (Springer, New York, 2013)
 9.
P.S.R. Diniz, H. Yazdanpanah, M.V.S. Lima, Feature LMS algorithms, in Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (Calgary, Alberta, Canada, Apr., 2018), pp. 4144–4148
 10.
P.S.R. Diniz, H. Yazdanpanah, M.V.S. Lima, Feature LMS algorithm for bandpass system models, in Proceedings 27th European Signal Processing Conference (EUSIPCO) (A Coruna, Spain, Sept., 2019), pp. 1–5
 11.
B. Farhang-Boroujeny, Fast LMS/Newton algorithms based on autoregressive modeling and their application to acoustic echo cancellation. IEEE Trans. Signal Process. 45, 1987–2000 (1997)
 12.
Y. Gu, J. Jin, S. Mei, \(l_0\) norm constraint LMS algorithm for sparse system identification. IEEE Signal Process. Lett. 16, 774–777 (2009)
 13.
S. Haykin, Adaptive Filter Theory, 4th edn. (Prentice Hall, Englewood Cliffs, NJ, 2002)
 14.
S. He, Y. Lin, Cauchy distribution function-penalized LMS for sparse system identification. Circuits Syst. Signal Process. 38, 470–480 (2019)
 15.
J. Huang, R. Diamant, Channel impulse responses from Mar. 2019 long range experiment (Mediterranean Sea), IEEE Dataport (2019). https://doi.org/10.21227/nzgr-ds72
 16.
K. Jiang, H. Zhang, H.R. Karimi, J. Lin, L. Song, Simultaneous input and state estimation for integrated motor-transmission systems in a controller area network environment via an adaptive unscented Kalman filter, IEEE Trans. Syst. Man Cybern. Syst. (2018). https://doi.org/10.1109/TSMC.2018.2795340
 17.
Y. Li, C. Zhang, S. Wang, Low-complexity non-uniform penalized affine projection algorithm for sparse system identification. Circuits Syst. Signal Process. 35, 1611–1624 (2016)
 18.
M.V.S. Lima, T.N. Ferreira, W.A. Martins, P.S.R. Diniz, Sparsity-aware data-selective adaptive filters. IEEE Trans. Signal Process. 62, 4557–4572 (2014)
 19.
A. Mandal, R. Mishra, Digital equalization for cancellation of noise-like interferences in adaptive spatial filtering. Circuits Syst. Signal Process. 36, 675–702 (2017)
 20.
A. Mehmood, N.I. Chaudhary, A. Zameer et al., Novel computing paradigms for parameter estimation in Hammerstein controlled auto regressive auto regressive moving average systems. Appl. Soft Comput. 80, 263–284 (2019)
 21.
A. Mehmood, A. Zameer, N.I. Chaudhary et al., Backtracking search heuristics for identification of electrical muscle stimulation models using Hammerstein structure. Appl. Soft Comput. 84, 105705 (2019)
 22.
R. Meng, R.C. de Lamare, V.H. Nascimento, Sparsity-aware affine projection adaptive algorithms for system identification, in Proceedings Sensor Signal Processing for Defence (SSPD) (London, UK, Sept., 2011), pp. 1–5
 23.
L. Murali, D. Chitra, T. Manigandan, B. Sharanya, An efficient adaptive filter architecture for improving the seizure detection in EEG signal. Circuits Syst. Signal Process. 35, 2914–2931 (2016)
 24.
T. Padhi, M. Chandra, A. Kar, M.N.S. Swamy, A new hybrid active noise control system with convex combination of time and frequency domain filtered-x LMS algorithms. Circuits Syst. Signal Process. 37, 3275–3294 (2018)
 25.
K. Pu, J. Zhang, L. Min, A signal decorrelation PNLMS algorithm for double-talk acoustic echo cancellation. Circuits Syst. Signal Process. 35, 669–684 (2016)
 26.
M.A.Z. Raja, R. Akhtar, N.I. Chaudhary et al., A new computing paradigm for the optimization of parameters in adaptive beamforming using fractional processing. Eur. Phys. J. Plus 134, 275 (2019)
 27.
G. Reina, A. Vargas, K. Nagatani, K. Yoshida, Adaptive Kalman filtering for GPS-based mobile robot localization, in Proceedings IEEE International Workshop on Safety, Security and Rescue Robotics (Rome, Italy, Sept., 2007), pp. 1–6
 28.
A.H. Sayed, Adaptive Filters (Wiley-IEEE, New York, USA, 2008)
 29.
K. Shi, P. Shi, Adaptive sparse Volterra system identification with \(l_0\)-norm penalty. Signal Process. 91, 2432–2436 (2011)
 30.
A. Tarighat, A.H. Sayed, Least mean-phase adaptive filters with application to communications systems. IEEE Signal Process. Lett. 11, 220–223 (2004)
 31.
S. Werner, J.A. Apolinário Jr., P.S.R. Diniz, Set-membership proportionate affine projection algorithms. EURASIP J. Audio Speech Music Process (2007). https://doi.org/10.1155/2007/34242
 32.
B. Widrow, M.E. Hoff, Adaptive switching circuits. IRE WESCOM Conv. Rec. 4, 96–104 (1960)
 33.
H. Yazdanpanah, J.A. Apolinário Jr., P.S.R. Diniz, M.V.S. Lima, \(l_0\)-norm feature LMS algorithms, in Proceedings 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP) (Anaheim, CA, USA, Nov., 2018), pp. 311–315
 34.
H. Yazdanpanah, P.S.R. Diniz, New trinion and quaternion set-membership affine projection algorithms. IEEE Trans. Circuits Syst. II Express Briefs 64, 216–220 (2017)
 35.
H. Yazdanpanah, P.S.R. Diniz, Recursive Least-Squares algorithms for sparse system modeling, in Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (New Orleans, LA, USA, Mar., 2017), pp. 3879–3883
 36.
H. Yazdanpanah, P.S.R. Diniz, M.V.S. Lima, A simple set-membership affine projection algorithm for sparse system modeling, in Proceedings 24th European Signal Processing Conference (EUSIPCO) (Budapest, Hungary, Aug., 2016), pp. 1798–1802
 37.
Y. Yu, H. Zhao, B. Chen, Sparseness-controlled proportionate affine projection sign algorithms for acoustic echo cancellation. Circuits Syst. Signal Process. 34, 3933–3948 (2015)
 38.
H. Zhang, J. Wang, Active steering actuator fault detection for an automatically-steered electric ground vehicle. IEEE Trans. Veh. Technol. 66, 3685–3702 (2017)
Acknowledgements
The authors would like to thank the São Paulo Research Foundation (FAPESP) Grants #2015/22308-2 and #2019/06280-1 and the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Finance Codes 23038.018065/2018-17 and 88881.371305/2019-01.
Cite this article
Yazdanpanah, H., Apolinário, J.A. The Extended Feature LMS Algorithm: Exploiting Hidden Sparsity for Systems with Unknown Spectrum. Circuits Syst. Signal Process. 40, 174–192 (2021). https://doi.org/10.1007/s00034-020-01461-3
Keywords
 Adaptive filtering
 LMS algorithm
 Feature
 Sparsity
 \(l_1\)-norm
 Computational complexity