
Incorporating Nonparametric Knowledge to the Least Mean Square Adaptive Filter


Abstract

In the framework of maximum a posteriori estimation, the present study proposes the nonparametric probabilistic least mean square (NPLMS) adaptive filter for estimating an unknown parameter vector from noisy data. The NPLMS combines parameter space and signal space by merging prior knowledge of the probability distribution of the process with the evidence present in the signal. Taking advantage of kernel density estimation to estimate the prior distribution, the NPLMS is robust against both Gaussian and non-Gaussian noise. To achieve this, some of the intermediate estimates are buffered and then used to estimate the prior distribution. Unlike bias-compensated algorithms, there is no need to estimate the input noise variance. A theoretical analysis of the NPLMS is derived. In addition, a variable step-size version of the NPLMS is provided to reduce the steady-state error. Simulation results on system identification and prediction show the acceptable performance of the NPLMS in noisy stationary and non-stationary environments compared with bias-compensated and normalized LMS algorithms.
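For a concrete picture of the update described above, the following minimal Python sketch illustrates an NPLMS-style iteration assembled from the form of Eq. (51) in the appendix: an LMS gradient step combined with a pull toward the kernel-weighted mean of buffered past estimates. The function name, buffer policy, and unit kernel bandwidth are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def nplms_step(w, x, d, buffer_w, alpha=0.01):
    """One NPLMS-style update (illustrative sketch, not the authors' code).

    w        : current weight estimate, shape (m,)
    x        : current input vector, shape (m,)
    d        : desired (noisy) output sample
    buffer_w : list of K buffered intermediate estimates, each shape (m,)
    alpha    : step size
    """
    e = d - x @ w                       # a priori error (signal-space evidence)
    # Gaussian-kernel weights mu_i over buffered estimates: the KDE prior
    # pulls w toward regions where past estimates cluster.
    dists = np.array([np.sum((w - wi) ** 2) for wi in buffer_w])
    mu = np.exp(-0.5 * dists)
    mu /= mu.sum()                      # weights sum to one, as in the appendix
    prior_mean = sum(m_i * wi for m_i, wi in zip(mu, buffer_w))
    # LMS gradient step plus shrinkage toward the kernel-weighted prior mean,
    # matching the structure of Eq. (51).
    return w + alpha * e * x - alpha * w + alpha * prior_mean
```

Here the kernel weights play the role of the nonparametric prior: the update shrinks the estimate toward regions of parameter space supported by the buffered estimates.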


Notes

  1. Subscripts denote the time indices of vector variables, while parentheses denote the time indices of scalar variables.

References

  1. J. Arenas-Garcia, A.R. Figueiras-Vidal, A.H. Sayed, Mean-square performance of a convex combination of two adaptive filters. IEEE Trans. Signal Process. 54(3), 1078–1090 (2006). https://doi.org/10.1109/TSP.2005.863126

  2. B.C. Arnold, p-norm bounds on the expectation of the maximum of a possibly dependent sample. J. Multivar. Anal. 17(3), 316–332 (1985)

  3. B. Babadi, N. Kalouptsidis, V. Tarokh, SPARLS: the sparse RLS algorithm. IEEE Trans. Signal Process. 58(8), 4013–4025 (2010). https://doi.org/10.1109/TSP.2010.2048103

  4. J. Benesty, H. Rey, L.R. Vega, S. Tressens, A nonparametric VSS NLMS algorithm. IEEE Signal Process. Lett. 13(10), 581–584 (2006)

  5. N.J. Bershad, J.C.M. Bermudez, J.Y. Tourneret, Stochastic analysis of the LMS algorithm for system identification with subspace inputs. IEEE Trans. Signal Process. 56(3), 1018–1027 (2008). https://doi.org/10.1109/TSP.2007.908967

  6. J.V. Candy, Bayesian Signal Processing: Classical, Modern, and Particle Filtering Methods, vol. 54 (Wiley, Hoboken, 2016)

  7. K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, Y. Singer, Online passive-aggressive algorithms. J. Mach. Learn. Res. 7(Mar), 551–585 (2006)

  8. R.C. De Lamare, R. Sampaio-Neto, Adaptive reduced-rank equalization algorithms based on alternating optimization design techniques for MIMO systems. IEEE Trans. Veh. Technol. 60(6), 2482–2494 (2011)

  9. J. Fernandez-Bes, V. Elvira, S. Van Vaerenbergh, A probabilistic least-mean-squares filter, in 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2199–2203. IEEE (2015)

  10. J. Gao, H. Sultan, J. Hu, W.W. Tung, Denoising nonlinear time series by adaptive filtering and wavelet shrinkage: a comparison. IEEE Signal Process. Lett. 17(3), 237–240 (2010)

  11. A. Gilloire, M. Vetterli, Adaptive filtering in subbands with critical sampling: analysis, experiments, and application to acoustic echo cancellation. IEEE Trans. Signal Process. 40(8), 1862–1875 (1992)

  12. S. Haykin, A.H. Sayed, J.R. Zeidler, P. Yee, P.C. Wei, Adaptive tracking of linear time-variant systems by extended RLS algorithms. IEEE Trans. Signal Process. 45(5), 1118–1128 (1997)

  13. S.S. Haykin, Adaptive Filter Theory (Pearson Education India, London, 2008)

  14. C. Huemmer, R. Maas, W. Kellermann, The NLMS algorithm with time-variant optimum stepsize derived from a Bayesian network perspective. IEEE Signal Process. Lett. 22(11), 1874–1878 (2015)

  15. A. Ilin, T. Raiko, Practical approaches to principal component analysis in the presence of missing values. J. Mach. Learn. Res. 11(Jul), 1957–2000 (2010)

  16. S.M. Jung, P. Park, Normalised least-mean-square algorithm for adaptive filtering of impulsive measurement noises and noisy inputs. Electron. Lett. 49(20), 1270–1272 (2013)

  17. S.M. Jung, P. Park, Stabilization of a bias-compensated normalized least-mean-square algorithm for noisy inputs. IEEE Trans. Signal Process. 65(11), 2949–2961 (2017)

  18. J. Liu, X. Yu, H. Li, A nonparametric variable step-size NLMS algorithm for transversal filters. Appl. Math. Comput. 217(17), 7365–7371 (2011)

  19. W. Liu, J.C. Principe, S. Haykin, Kernel Adaptive Filtering: A Comprehensive Introduction, vol. 57 (Wiley, Hoboken, 2011)

  20. I.M. Park, S. Seth, S. Van Vaerenbergh, Probabilistic kernel least mean squares algorithms, in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8272–8276. IEEE (2014)

  21. A.H. Sayed, Adaptive Filters (Wiley, Hoboken, 2008)

  22. A.H. Sayed, T. Kailath, A state-space approach to adaptive RLS filtering. IEEE Signal Process. Mag. 11(3), 18–60 (1994)

  23. S. Van Vaerenbergh, J. Fernandez-Bes, V. Elvira, On the relationship between online Gaussian process regression and kernel least mean squares algorithms, in 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6. IEEE (2016)

  24. S. Van Vaerenbergh, M. Lázaro-Gredilla, I. Santamaría, Kernel recursive least-squares tracker for time-varying regression. IEEE Trans. Neural Netw. Learn. Syst. 23(8), 1313–1326 (2012)

  25. S.V. Vaseghi, Advanced Digital Signal Processing and Noise Reduction (Wiley, Hoboken, 2008)

  26. P. Wen, J. Zhang, A novel variable step-size normalized subband adaptive filter based on mixed error cost function. Signal Process. 138, 48–52 (2017)

  27. H. Zhao, Z. Zheng, Bias-compensated affine-projection-like algorithms with noisy input. Electron. Lett. 52(9), 712–714 (2016)

  28. J. Zhao, X. Liao, S. Wang, K.T. Chi, Kernel least mean square with single feedback. IEEE Signal Process. Lett. 22(7), 953–957 (2015)

  29. Z. Zheng, H. Zhao, Bias-compensated normalized subband adaptive filter algorithm. IEEE Signal Process. Lett. 23(6), 809–813 (2016)


Author information


Correspondence to Hadi Sadoghi-Yazdi.

Appendix

1.1 A

Considering Eqs. (1) and (3), one can rewrite (20) as

$$\begin{aligned} \varvec{w}_{n+1} = \varvec{w}_{n} +\alpha R_{n} \left( w_{o} -\varvec{w}_{n} \right) +\alpha \varvec{x}_{n}^\mathrm{T} \varvec{v}(n)-\alpha \varvec{w}_{n} +\alpha \sum _{i=1}^{K}\mu _{i,\varvec{w}_{n} } \varvec{w}_{i} \end{aligned}$$
(51)

where \(R_{n} =\varvec{x}_{n}^\mathrm{T} \varvec{x}_{n} \) is the correlation matrix. Subtracting both sides of (51) from \(w_{o} \), we have

$$\begin{aligned} \begin{aligned} w_{o} -\varvec{w}_{n+1}&=w_{o} -\varvec{w}_{n} -\alpha R_{n} \left( w_{o} -\varvec{w}_{n} \right) -\alpha \varvec{x}_{n}^\mathrm{T} \varvec{v}(n)\\&\quad +\alpha \varvec{w}_{n} -\alpha \sum _{i=1}^{K}\mu _{i,\varvec{w}_{n} } \varvec{w}_{i} \pm \alpha w_{o} \pm \alpha \sum _{i=1}^{K}\mu _{i,\varvec{w}_{n} } w_{o}. \end{aligned} \end{aligned}$$
(52)

Defining \(\tilde{\varvec{w}}_{n} =w_{o} -\varvec{w}_{n} \), (52) can be rewritten as

$$\begin{aligned} \begin{aligned} \tilde{\varvec{w}}_{n+1}&=\tilde{\varvec{w}}_{n} -\alpha R_{n} \tilde{\varvec{w}}_{n} -\alpha \varvec{x}_{n}^\mathrm{T} \varvec{v}(n)-\alpha \tilde{\varvec{w}}_{n} \\&\quad +\alpha \sum _{i=1}^{K}\mu _{i,\varvec{w}_{n} } \tilde{\varvec{w}}_{i}+\alpha w_{o} -\alpha w_{o} \sum _{i=1}^{K}\mu _{i,\varvec{w}_{n} } \end{aligned} \end{aligned}$$
(53)

where \(\tilde{\varvec{w}}_{i} =w_{o} -\varvec{w}_{i} \). Since \(\sum _{i=1}^{K}\mu _{i,\varvec{w}_{n} } =1\), (53) reduces to

$$\begin{aligned} \tilde{\varvec{w}}_{n+1} =\left( I-\alpha R_{n} -\alpha I\right) \tilde{\varvec{w}}_{n} -\alpha \varvec{x}_{n}^\mathrm{T} \varvec{v}(n)+\alpha \sum _{i=1}^{K}\mu _{i,\varvec{w}_{n} } \tilde{\varvec{w}}_{i} . \end{aligned}$$
(54)

On the other hand, it is clear that \(\left\| \varvec{w}_{n} -\varvec{w}_{i} \right\| =\left\| w_{o} -w_{o} +\varvec{w}_{n} -\varvec{w}_{i} \right\| =\left\| \tilde{\varvec{w}}_{i} -\tilde{\varvec{w}}_{n} \right\| \). Using the reverse triangle inequality, we have

$$\begin{aligned} \left\| \tilde{\varvec{w}}_{i} \right\| -\left\| \tilde{\varvec{w}}_{n} \right\| \le \left\| \tilde{\varvec{w}}_{i} -\tilde{\varvec{w}}_{n} \right\| \end{aligned}$$
(55)

Therefore,

$$\begin{aligned} \begin{array}{l} {\left( \left\| \tilde{\varvec{w}}_{i} \right\| -\left\| \tilde{\varvec{w}}_{n} \right\| \right) ^{2} \le \left\| \tilde{\varvec{w}}_{i} -\tilde{\varvec{w}}_{n} \right\| ^{2} } \\ {\left\| \tilde{\varvec{w}}_{i} \right\| ^{2} +\left\| \tilde{\varvec{w}}_{n} \right\| ^{2} -2\left\| \tilde{\varvec{w}}_{i} \right\| \left\| \tilde{\varvec{w}}_{n} \right\| \le \left\| \tilde{\varvec{w}}_{i} -\tilde{\varvec{w}}_{n} \right\| ^{2} } \end{array} \end{aligned}$$
(56)

and

$$\begin{aligned} \begin{aligned}&\underbrace{\exp \left( -\frac{1}{2} \left\| \tilde{\varvec{w}}_{i} -\tilde{\varvec{w}}_{n} \right\| ^{2} \right) }_{a_1}\\&\quad \le \underbrace{\exp \left( -\frac{1}{2} \left\| \tilde{\varvec{w}}_{i} \right\| ^{2} \right) \exp \left( -\frac{1}{2} \left\| \tilde{\varvec{w}}_{n} \right\| ^{2} \right) \exp \left( \left\| \tilde{\varvec{w}}_{i} \right\| \left\| \tilde{\varvec{w}}_{n} \right\| \right) }_{a_2}. \end{aligned} \end{aligned}$$
(57)

The summation over (57) is

$$\begin{aligned} \begin{aligned}&\underbrace{\sum _{i=1}^{K}\exp \left( -\frac{1}{2} \left\| \tilde{\varvec{w}}_{i} -\tilde{\varvec{w}}_{n} \right\| ^{2} \right) }_{a_3}\\&\quad \le \underbrace{\exp \left( -\frac{1}{2} \left\| \tilde{\varvec{w}}_{n} \right\| ^{2} \right) \sum _{i=1}^{K}\exp \left( -\frac{1}{2} \left\| \tilde{\varvec{w}}_{i} \right\| ^{2} \right) \exp \left( \left\| \tilde{\varvec{w}}_{i} \right\| \left\| \tilde{\varvec{w}}_{n} \right\| \right) }_{a_4}. \end{aligned} \end{aligned}$$
(58)

Since \(a_1 \le a_3\) and \(a_2 \le a_4\), one can write

$$\begin{aligned} \begin{aligned} a_1a_4 \le a_3a_4\\ a_2a_3 \le a_3a_4. \end{aligned} \end{aligned}$$
(59)

Subtracting the first inequality of (59) from the second one, and vice versa, leads to the conclusion that \(a_1a_4=a_2a_3\); in other words, \(\frac{a_1}{a_3}=\frac{a_2}{a_4}\). Therefore, considering (57), (58), and the inequalities in (59), \(\mu _{i,\varvec{w}_{n} } \) can be rewritten as

$$\begin{aligned} \begin{aligned} \mu _{i,\tilde{\varvec{w}}_{i} }&=\frac{\exp \left( -\frac{1}{2} \left\| \varvec{w}_{n} -\varvec{w}_{i} \right\| ^{2} \right) }{\sum _{j=1}^{K}\exp \left( -\frac{1}{2} \left\| \varvec{w}_{n} -\varvec{w}_{j} \right\| ^{2} \right) } \\&=\frac{\exp \left( -\frac{1}{2} \left\| \tilde{\varvec{w}}_{i} \right\| ^{2} \right) \exp \left( -\frac{1}{2} \left\| \tilde{\varvec{w}}_{n} \right\| ^{2} \right) \exp \left( \left\| \tilde{\varvec{w}}_{i} \right\| \left\| \tilde{\varvec{w}}_{n} \right\| \right) }{\exp \left( -\frac{1}{2} \left\| \tilde{\varvec{w}}_{n} \right\| ^{2} \right) \sum _{j=1}^{K}\exp \left( -\frac{1}{2} \left\| \tilde{\varvec{w}}_{j} \right\| ^{2} \right) \exp \left( \left\| \tilde{\varvec{w}}_{j} \right\| \left\| \tilde{\varvec{w}}_{n} \right\| \right) } \\&=\frac{\exp \left( -\frac{1}{2} \left\| \tilde{\varvec{w}}_{i} \right\| ^{2} \right) \exp \left( \left\| \tilde{\varvec{w}}_{i} \right\| \left\| \tilde{\varvec{w}}_{n} \right\| \right) }{\sum _{j=1}^{K}\exp \left( -\frac{1}{2} \left\| \tilde{\varvec{w}}_{j} \right\| ^{2} \right) \exp \left( \left\| \tilde{\varvec{w}}_{j} \right\| \left\| \tilde{\varvec{w}}_{n} \right\| \right) }. \end{aligned} \end{aligned}$$
(60)

Therefore, one can write (54) as

$$\begin{aligned} \tilde{\varvec{w}}_{n+1} =\left( I-\alpha R_{n} -\alpha I\right) \tilde{\varvec{w}}_{n} -\alpha \varvec{x}_{n}^\mathrm{T} \varvec{v}(n)+\alpha \sum _{i=1}^{K}\mu _{i,\tilde{\varvec{w}}_{i} } \tilde{\varvec{w}}_{i} . \end{aligned}$$
(61)

Remark 2

If we assume \(\tilde{\varvec{w}}_{i} \simeq \tilde{\varvec{w}}_{n} \), then \(\tilde{\varvec{w}}_{n+1} \) reduces to \(\tilde{\varvec{w}}_{n+1} =\left( I-\alpha R_{n} \right) \tilde{\varvec{w}}_{n} -\alpha \varvec{x}_{n}^\mathrm{T} \varvec{v}(n)\).
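As a numerical sanity check of the algebra leading from (51) to (54), the following numpy sketch (with illustrative random values; all variable names are assumptions) propagates the weight update directly and through the error recursion and confirms that the two agree:

```python
import numpy as np

rng = np.random.default_rng(0)
m, K, alpha = 4, 5, 0.05
w_o = rng.normal(size=m)                      # true parameter vector
w_n = rng.normal(size=m)                      # current estimate
x_n = rng.normal(size=m)
v_n = rng.normal() * 0.1                      # noise sample v(n)
W = [rng.normal(size=m) for _ in range(K)]    # buffered estimates w_i

R_n = np.outer(x_n, x_n)                      # instantaneous correlation matrix
t_n = w_o - w_n                               # error \tilde{w}_n
T = [w_o - wi for wi in W]                    # errors \tilde{w}_i

mu = np.exp(-0.5 * np.array([np.sum((w_n - wi) ** 2) for wi in W]))
mu /= mu.sum()                                # weights mu_{i,w_n}, sum to one

# Direct weight update, Eq. (51), then form the error w_o - w_{n+1}:
w_next = w_n + alpha * R_n @ t_n + alpha * x_n * v_n - alpha * w_n \
         + alpha * sum(m_i * wi for m_i, wi in zip(mu, W))
err_direct = w_o - w_next

# Error recursion, Eq. (54):
I = np.eye(m)
err_rec = (I - alpha * R_n - alpha * I) @ t_n - alpha * x_n * v_n \
          + alpha * sum(m_i * ti for m_i, ti in zip(mu, T))
assert np.allclose(err_direct, err_rec)       # the two forms agree
```

Equation (61) is the same recursion with the weights re-expressed in error coordinates via (60).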

1.2 B

Lemma 1

[2]: Assume \(x_{i} ,i=1,\ldots ,m\) are possibly dependent identically distributed random variables with zero mean and finite pth absolute moment, assumed without loss of generality to be equal to 1. Then

$$\begin{aligned} \begin{aligned} \mathbb E\left\{ \left\| x\right\| _{p} \right\}&=\mathbb E\left\{ \left[ \sum _{i=1}^{m}\left| x_{i} \right| ^{p} \right] ^{1/p} \right\} \\&\le \left\{ \mathbb E\left[ \sum _{i=1}^{m}\left| x_{i} \right| ^{p} \right] \right\} ^{1/p}=m^{1/p}. \end{aligned} \end{aligned}$$
(62)
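A minimal Monte Carlo sketch of Lemma 1 for \(p=2\): standard normal entries have unit second absolute moment, so the expected Euclidean norm should not exceed \(m^{1/2}\) (the dimensions and sample size below are arbitrary choices):

```python
import numpy as np

# Monte Carlo check of Lemma 1 for p = 2: if E|x_i|^2 = 1 for all i,
# then E{ ||x||_2 } <= sqrt(m). Standard normal entries satisfy E|x_i|^2 = 1.
rng = np.random.default_rng(1)
m, trials = 8, 200_000
samples = rng.normal(size=(trials, m))            # unit second absolute moment
mean_norm = np.linalg.norm(samples, axis=1).mean()
print(mean_norm, np.sqrt(m))                      # roughly 2.74 <= 2.83
assert mean_norm <= np.sqrt(m)
```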

Considering (60), one can write

$$\begin{aligned} \begin{aligned} {\mathbb E}\left\{ \sum _{i=1}^{K}\mu _{i,\tilde{\varvec{w}}_{i} } \tilde{\varvec{w}}_{i} \right\} = {{\mathbb E}}\left\{ \sum _{i=1}^{K}\frac{\exp \left( -\frac{1}{2} \left\| \tilde{\varvec{w}}_{i} \right\| ^{2} \right) \exp \left( \left\| \tilde{\varvec{w}}_{i} \right\| \left\| \tilde{\varvec{w}}_{n} \right\| \right) }{\sum _{j=1}^{K}\exp \left( -\frac{1}{2} \left\| \tilde{\varvec{w}}_{j} \right\| ^{2} \right) \exp \left( \left\| \tilde{\varvec{w}}_{j} \right\| \left\| \tilde{\varvec{w}}_{n} \right\| \right) } \tilde{\varvec{w}}_{i} \right\} . \end{aligned} \end{aligned}$$
(63)

Using the first-order Maclaurin approximation \(\exp (x)\approx 1+x\) for \(\exp \left( -\frac{1}{2} \left\| \tilde{\varvec{w}}_{i} \right\| ^{2} \right) \) and \(\exp \left( \left\| \tilde{\varvec{w}}_{i} \right\| \left\| \tilde{\varvec{w}}_{n} \right\| \right) \), one can approximate \(\mu _{i,\tilde{\varvec{w}}_{i} } \) as

$$\begin{aligned} \mu _{i,\tilde{\varvec{w}}_{i} } \approx \frac{\left( 1-\frac{1}{2} \left\| \tilde{\varvec{w}}_{i} \right\| ^{2} \right) \left( 1+\left\| \tilde{\varvec{w}}_{i} \right\| \left\| \tilde{\varvec{w}}_{n} \right\| \right) }{\sum _{j=1}^{K}\left( 1-\frac{1}{2} \left\| \tilde{\varvec{w}}_{j} \right\| ^{2} \right) \left( 1+\left\| \tilde{\varvec{w}}_{j} \right\| \left\| \tilde{\varvec{w}}_{n} \right\| \right) } . \end{aligned}$$
(64)
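The accuracy of this first-order approximation can be probed numerically. The following sketch (with illustrative small error norms) compares the exact weights of (60) with the linearized weights of (64):

```python
import numpy as np

rng = np.random.default_rng(2)
K, m = 5, 4
T = 0.1 * rng.normal(size=(K, m))          # small error vectors \tilde{w}_i
t_n = 0.1 * rng.normal(size=m)             # \tilde{w}_n

r = np.linalg.norm(T, axis=1)              # ||\tilde{w}_i||
rn = np.linalg.norm(t_n)                   # ||\tilde{w}_n||

exact = np.exp(-0.5 * r**2) * np.exp(r * rn)
exact /= exact.sum()                        # Eq. (60)

approx = (1 - 0.5 * r**2) * (1 + r * rn)
approx /= approx.sum()                      # Eq. (64), first-order Maclaurin

print(np.max(np.abs(exact - approx)))       # small when the norms are small
```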

Using Lemma 1,

$$\begin{aligned} \begin{aligned}&{\mathbb E}\left\{ \frac{\left( 1-\frac{1}{2} \left\| \tilde{\varvec{w}}_{i} \right\| ^{2} \right) \left( 1+\left\| \tilde{\varvec{w}}_{i} \right\| \left\| \tilde{\varvec{w}}_{n} \right\| \right) }{\sum _{j=1}^{K}\left( 1-\frac{1}{2} \left\| \tilde{\varvec{w}}_{j} \right\| ^{2} \right) \left( 1+\left\| \tilde{\varvec{w}}_{j} \right\| \left\| \tilde{\varvec{w}}_{n} \right\| \right) } \right\} \\&\quad \le \frac{\left( 1-\frac{1}{2} m\right) \left( 1+m\right) }{\sum _{j=1}^{K}\left( 1-\frac{1}{2} m\right) \left( 1+m\right) } =\frac{1}{K} \end{aligned} \end{aligned}$$
(65)

Therefore, (63) becomes

$$\begin{aligned} {\mathbb E}\left\{ \sum _{i=1}^{K}\mu _{i,\tilde{\varvec{w}}_{i} } \tilde{\varvec{w}}_{i} \right\} \le \frac{1}{K} \sum _{i=1}^{K}\left\{ {{\mathbb E}}\left\{ \tilde{\varvec{w}}_{i} \right\} \right\} . \end{aligned}$$
(66)

Taking the expectation of both sides of (61), considering (66) and \({\mathbb E}\left( \varvec{x}_{n}^\mathrm{T} \varvec{v}(n)\right) =0\) (since the input \(\varvec{x}_{n}\) and the noise \(\varvec{v}(n)\) are independent), one has

$$\begin{aligned} \begin{aligned} {\mathbb E}\left( \tilde{\varvec{w}}_{n+1} \right)&={{\mathbb E}}\left( I-\alpha R_{n} -\alpha I\right) {{\mathbb E}}\left( \tilde{\varvec{w}}_{n} \right) +\alpha {{\mathbb E}}\left( \sum _{i=1}^{K}\mu _{i,\tilde{\varvec{w}}_{i} } \tilde{\varvec{w}}_{i} \right) \\&\le \left( I-\alpha R-\alpha I\right) {{\mathbb E}}\left( \tilde{\varvec{w}}_{n} \right) +\alpha \frac{1}{K} \sum _{i=1}^{K}{\mathbb E}\left( \tilde{\varvec{w}}_{i} \right) . \end{aligned} \end{aligned}$$
(67)

where \(R={\mathbb E}\left( R_{n} \right) \). If it is assumed that \({\mathbb E}\left( \tilde{\varvec{w}}_{i} \right) =\beta _{i} {{\mathbb E}}\left( \tilde{\varvec{w}}_{n} \right) \), then (67) can be rewritten as

$$\begin{aligned} {\mathbb E}\left( \tilde{\varvec{w}}_{n+1} \right) \le \left( I-\alpha R-\alpha I+\alpha \frac{1}{K} \sum _{i=1}^{K}\beta _{i} I\right) {{\mathbb E}}\left( \tilde{\varvec{w}}_{n} \right) . \end{aligned}$$
(68)

Without loss of generality, it is assumed that

$$\begin{aligned} {\mathbb E}\left( \tilde{\varvec{w}}_{n+1} \right) \approx \left( I-\alpha R-\alpha I+\alpha \frac{1}{K} \sum _{i=1}^{K}\beta _{i} I\right) {{\mathbb E}}\left( \tilde{\varvec{w}}_{n} \right) . \end{aligned}$$
(69)
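The recursion (69) implies that the mean error decays to zero whenever the spectral radius of the iteration matrix is below one. A short numpy sketch (with an illustrative \(R\) and assumed \(\beta _{i} \) values) makes this concrete:

```python
import numpy as np

# Iterate the mean-error recursion of Eq. (69) for an illustrative R and
# assumed beta_i, and relate the decay to the spectral radius of the
# iteration matrix; the mean error vanishes when that radius is below one.
rng = np.random.default_rng(3)
m, K, alpha = 4, 5, 0.05
A = rng.normal(size=(m, m))
R = A @ A.T / m                              # a positive semi-definite R
beta_bar = np.mean(rng.uniform(0.5, 1.0, K)) # (1/K) * sum(beta_i), assumed
M = np.eye(m) - alpha * R - alpha * np.eye(m) + alpha * beta_bar * np.eye(m)

print(np.max(np.abs(np.linalg.eigvals(M))))  # spectral radius < 1 here
e = rng.normal(size=m)                       # E(w_tilde) at n = 0
for _ in range(500):
    e = M @ e
print(np.linalg.norm(e))                     # ~0: the mean error converges
```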


Cite this article

Ashkezari-Toussi, S., Sadoghi-Yazdi, H. Incorporating Nonparametric Knowledge to the Least Mean Square Adaptive Filter. Circuits Syst Signal Process 38, 2114–2137 (2019). https://doi.org/10.1007/s00034-018-0954-x
