Part of the book series: Mathematics for Industry (MFI, volume 22)

Abstract

In this chapter, two classical adaptation algorithms, the least-mean-squares (LMS) algorithm and the normalized least-mean-squares (NLMS) algorithm, are reviewed. The LMS algorithm, also known as the Widrow-Hoff algorithm, is motivated by the steepest-descent method for minimizing the expected error. The LMS algorithm has a parameter called the step-size, which appears when the differential is replaced with a finite difference. By adjusting the step-size, the convergence rate and the convergence error at steady state are controlled. The NLMS algorithm is an improved version of the LMS algorithm in which the correction term used to update the filter coefficients is normalized by the squared norm of the current regressor. In the LMS algorithm, the effective value of the step-size varies with the volume of the input signal; in the NLMS algorithm, on the other hand, the step-size has a definite meaning that is independent of the volume of the input signal. The NLMS algorithm also admits a geometrical interpretation: when the step-size equals unity, the updated coefficient vector is the orthogonal projection of the current coefficient vector onto the hyperplane defined by the current regressor. This geometrical interpretation of the NLMS algorithm leads to the affine projection algorithm (APA), the main theme of this book.
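To make the two update rules concrete, the following is a minimal NumPy sketch. It is illustrative code added here, not taken from the chapter; the function names, the choice of variables (coefficient vector w, regressor x, desired response d), and the small regularization constant eps in the NLMS denominator are assumptions for the example.

```python
import numpy as np

def lms_update(w, x, d, mu):
    """One LMS (Widrow-Hoff) step: w <- w + mu * e * x, with a-priori error e = d - x^T w.
    The effective step-size scales with the energy of the regressor x."""
    e = d - x @ w
    return w + mu * e * x, e

def nlms_update(w, x, d, mu, eps=1e-8):
    """One NLMS step: the correction term is normalized by the squared norm of x,
    so mu keeps the same meaning regardless of the volume of the input signal.
    With mu = 1 (and eps = 0), the new w is the orthogonal projection of the
    current w onto the hyperplane {w : x^T w = d} defined by the regressor."""
    e = d - x @ w
    return w + mu * e * x / (x @ x + eps), e

# Tiny usage example (hypothetical setup): identify a 4-tap FIR system from noisy data.
rng = np.random.default_rng(0)
h = np.array([0.5, -0.3, 0.2, 0.1])        # unknown system coefficients
w = np.zeros(4)                            # adaptive filter coefficients
u = rng.standard_normal(1000)              # input signal
for k in range(4, len(u)):
    x = u[k-4:k][::-1]                     # current regressor (most recent sample first)
    d = h @ x + 0.01 * rng.standard_normal()
    w, _ = nlms_update(w, x, d, mu=0.5)
```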

Notes

  1. This means \(v(k)=0\) for \(k \ge 0\).

References

  1. Widrow, B., Hoff, M.E., Jr.: Adaptive switching circuits. IRE WESCON Conv. Rec., Pt. 4, 96–104 (1960)

  2. Ozeki, K., Umeda, T.: An adaptive filtering algorithm using an orthogonal projection to an affine subspace and its properties. IEICE Trans. J67-A(2), 126–132 (1984) (Also in Electron. Commun. Jpn. 67-A(5), 19–27 (1984))

  3. Nagumo, J., Noda, A.: A learning method for system identification. IEEE Trans. Autom. Control AC-12(3), 282–287 (1967)

Author information

Correspondence to Kazuhiko Ozeki.

Appendix 1: Steepest-Descent Method

Let f be a real differentiable function defined on \(\mathbb {R}^{n}\). The steepest-descent method searches for \(\mathrm{argmin}_{x}f(x)\) by iterating the following computation starting from an initial point \(x_{0}\):

$$\begin{aligned} x_{k+1}= x_{k}- \mu \nabla _{x}f(x_{k}), \end{aligned}$$
(2.9)

where \(\mu >0\) is the step-size. The step-size may depend on the iteration index k as \(\mu (k)\).

This algorithm is motivated by the following fact. Let \(x(t) \in \mathbb {R}^{n}\) be a function of t, and consider a differential equation

$$\begin{aligned} \frac{\mathrm{d} x(t)}{\mathrm{d} t}= - \nabla _{x}f(x(t)). \end{aligned}$$
(2.10)

Using the chain rule for differentiation and (2.10), we have

$$\begin{aligned} \frac{\mathrm{d} f(x(t))}{\mathrm{d} t}&= (\nabla _{x}f(x(t)))^{t} \frac{\mathrm{d}x(t)}{\mathrm{d} t} \\&= - (\nabla _{x}f(x(t)))^{t} \nabla _{x}f(x(t)) \\&= - \Vert \nabla _{x}f(x(t))\Vert ^{2} \\&\le 0, \end{aligned}$$

with equality only if \(\nabla _{x}f(x(t)) = 0\). This shows that if the curve x(t) is a solution of (2.10), f(x(t)) is a decreasing function of t.
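As a quick sanity check (an illustrative scalar example added here, not part of the original text), consider \(f(x)= x^{2}/2\). Then (2.10) reads \(\mathrm{d}x(t)/\mathrm{d}t = -x(t)\), whose solution through \(x_{0}\) gives

$$\begin{aligned} x(t) = x_{0}e^{-t}, \qquad f(x(t)) = \tfrac{1}{2}x_{0}^{2}e^{-2t}, \end{aligned}$$

which indeed decreases monotonically toward the minimum value \(f(0)=0\).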

The differential equation (2.10) can be approximated by a difference equation as

$$\begin{aligned} \frac{x(t+ \Delta t) - x(t)}{\Delta t}\approx - \nabla _{x}f(x(t)). \end{aligned}$$

If we let \(x_{k}\mathop {=}\limits ^{\triangle }x(t)\), \(x_{k+1}\mathop {=}\limits ^{\triangle }x(t + \Delta t)\), and \(\mu \mathop {=}\limits ^{\triangle }\Delta t\), we obtain (2.9). Under a certain condition, the vector sequence \(x_{0}, x_{1}, x_{2}, \ldots \) converges to a local minimum point. If the initial point \(x_{0}\) and the step-size \(\mu \) are appropriately chosen, the sequence converges to \(\mathrm{argmin}_{x}f(x)\).
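A minimal sketch of iteration (2.9) follows. It is illustrative code added here, not from the text; the function name, the quadratic test function, and the chosen step-size are assumptions for the example.

```python
import numpy as np

def steepest_descent(grad_f, x0, mu=0.1, n_iter=200):
    """Iterate x_{k+1} = x_k - mu * grad_f(x_k), as in (2.9)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = x - mu * grad_f(x)
    return x

# Example: f(x) = ||x - a||^2 has gradient 2*(x - a); the iterates approach a
# provided the step-size is small enough (here |1 - 2*mu| < 1).
a = np.array([1.0, -2.0])
x_min = steepest_descent(lambda x: 2.0 * (x - a), x0=np.zeros(2), mu=0.1)
```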

Copyright information

© 2016 Springer Japan

About this chapter

Cite this chapter

Ozeki, K. (2016). Classical Adaptation Algorithms. In: Theory of Affine Projection Algorithms for Adaptive Filtering. Mathematics for Industry, vol 22. Springer, Tokyo. https://doi.org/10.1007/978-4-431-55738-8_2
