Least Squares and Chebyshev Systems

  • Chapter
Thinking in Problems

Abstract

As readers know, a nonzero polynomial of degree n, in other words a linear combination of the n + 1 monomials \( 1,\ldots,t^n \), has at most n real zeros. A far-reaching generalization of this fact gives rise to the fundamental concept of Chebyshev systems, or T-systems for short. These systems are defined as follows. For a set (or system) of functions \( F=\{f_0,\ldots\} \), a linear combination of finitely many of its elements, \( f=\sum c_i f_i \), is called a polynomial on F (it is considered nonzero when \( \exists i: c_i\ne 0 \)). A system of n + 1 functions \( F=\{f_0,\ldots,f_n\} \) on an interval (or a half-interval, or a non-one-point segment) \( I\subseteq\mathbb{R} \) is referred to as a T-system when nonzero polynomials on F have at most n distinct zeros in I (zeros of extensions outside I are not counted). The basic ideas of the theory of T-systems may be understood by readers with relatively limited experience. In this chapter we focus both on these ideas and on some nice applications of this theory, such as estimating the numbers of zeros and critical points of functions (in analysis and geometry), a real-life problem in tomography, interpolation theory, approximation theory, and least squares.
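One hedged way to experiment with this definition: a nonzero polynomial on F vanishing at n + 1 distinct points \( t_0,\ldots,t_n \) exists exactly when the matrix \( (f_i(t_j)) \) is singular, so a zero determinant at some choice of distinct points refutes the T-system property. The sketch below is an illustration only and is not taken from the chapter; the helper names `det` and `violates_haar` and the sample grids are assumptions. A finite grid can refute the property but can never prove it.

```python
from fractions import Fraction
from itertools import combinations

def det(matrix):
    """Determinant by exact Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n = len(m)
    sign = 1
    result = Fraction(1)
    for col in range(n):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return sign * result

def violates_haar(functions, points):
    """Return distinct sample points where det(f_i(t_j)) = 0, or None.

    Hypothetical helper: it only inspects the given sample points.
    """
    k = len(functions)
    for nodes in combinations(points, k):
        matrix = [[Fraction(f(t)) for t in nodes] for f in functions]
        if det(matrix) == 0:
            return nodes
    return None

monomials = [lambda t: 1, lambda t: t, lambda t: t * t]   # 1, t, t^2
even_pair = [lambda t: 1, lambda t: t * t]                # 1, t^2

symmetric_grid = [Fraction(k, 4) for k in range(-4, 5)]   # sample of [-1, 1]
positive_grid = [Fraction(k, 4) for k in range(0, 5)]     # sample of [0, 1]

print(violates_haar(monomials, symmetric_grid))  # no violation found
print(violates_haar(even_pair, symmetric_grid))  # a symmetric pair such as -1, 1
print(violates_haar(even_pair, positive_grid))   # no violation found
```

The pair \( \{1,t^2\} \) illustrates why the interval I is part of the definition: on a sample of [-1, 1] the polynomial \( t^2-1 \) has two zeros, more than n = 1 allows, while no violation shows up on the sample of [0, 1].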

We provide a comprehensive discussion of the linear least-squares technique and present its analytic, algebraic, and geometric versions. We examine its links to linear algebra and combinatorial analysis (in this connection we discuss, from a different point of view, some problems previously considered in the chapters “A Combinatorial Algorithm in Multiexponential Analysis” and “Convexity and Related Classic Inequalities”). We also discuss some features of least-squares solutions (e.g., passing through determinate points, asymptotic properties). In addition, we outline the connections between least squares and probability theory and statistics. Finally, we discuss real-life applications to polynomial interpolation (such as finding the best polynomial fit for two-dimensional surfaces) and to signal processing in nuclear magnetic resonance (NMR) technology (such as finding covariance matrices for maximum-likelihood parameter estimates).
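As a minimal sketch of the algebraic version of linear least squares (an illustration under stated assumptions, not the chapter's own presentation), the following code forms and solves the normal equations \( A^{T}Ac=A^{T}y \) for a model \( \sum c_i f_i(t) \). The helper names `solve` and `least_squares` and the sample data are hypothetical; exact rational arithmetic is used so that the result can be checked by hand. The code assumes the Gram matrix \( A^{T}A \) is nonsingular, which holds, for example, when the basis is a T-system and there are at least n + 1 distinct nodes.

```python
from fractions import Fraction

def solve(a, b):
    """Solve the square system a x = b by exact Gauss-Jordan elimination.

    Assumes a is nonsingular (otherwise the pivot search raises StopIteration).
    """
    n = len(a)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def least_squares(basis, ts, ys):
    """Coefficients c minimizing sum_j (sum_i c_i f_i(t_j) - y_j)^2."""
    design = [[Fraction(f(t)) for f in basis] for t in ts]  # one row per sample
    k = len(basis)
    gram = [[sum(row[p] * row[q] for row in design) for q in range(k)]
            for p in range(k)]
    rhs = [sum(row[p] * y for row, y in zip(design, ys)) for p in range(k)]
    return solve(gram, rhs)

# Hypothetical data: fit the line a + b t to four points.
ts = [Fraction(v) for v in (0, 1, 2, 3)]
ys = [Fraction(v) for v in (1, 3, 4, 8)]
a, b = least_squares([lambda t: 1, lambda t: t], ts, ys)

mean_t = sum(ts) / len(ts)
mean_y = sum(ys) / len(ys)
print(a, b)                       # 7/10 11/5
print(a + b * mean_t == mean_y)   # True: the line passes through the centroid
```

The last line checks one instance of a least-squares solution passing through a determinate point: when the model contains a free constant term, the normal equation for that term forces the residuals to sum to zero, so the fitted line passes through the centroid of the data.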

Notes

  1. The problem of polynomial interpolation (or polynomial fitting) appears in numerous applications, both theoretical and practical, including high-order numerical methods and signal and image processing. As a very small sample, see Wiener and Yomdin (2000) and Yomdin and Zahavi (2006) for an analysis of the stability of polynomial fitting and applications in high-order numerical solutions of linear and nonlinear partial differential equations (PDEs); Elichai and Yomdin (1993), Briskin et al. (2000), and Haviv and Yomdin (pers. communication: Model based representation of surfaces) for high-order polynomial fitting in image processing; and Yomdin (submitted) and Brudnyi and Yomdin (pers. communication: Remez Sets) for a geometric analysis of “fitting sets,” as they appear in Problem P12.39***. (We thank Yosef Yomdin for kindly giving us the details.)

  2. For n > 0, this sign is unchanged regardless of the continuity of the nth derivatives. Readers may prove this with the help of the lemma from section E12.11 below, taking into account the continuity of differentiable functions.

  3. This is a multivariate version of Rolle's theorem. Warning: a single inequality \( W_{n+1}(f_0,\ldots,f_n)\ne 0 \), considered separately from the whole family of these inequalities for 0 ≤ m ≤ n, does not guarantee the Chebyshev property of \( f_0,\ldots,f_n \); provide counterexamples.

  4. For n = 0, the continuity of \( f_0 \) also needs to be assumed, because zero-order differentiability has no commonly accepted meaning. Of course, it might be defined as continuity, since representability of f(x + h) by a polynomial of degree n in h with accuracy \( o(|h|^n) \) is equivalent to continuity at x for n = 0 and to nth-order differentiability at x for n > 0. However, the definition of continuity as zero-order differentiability is not commonly accepted.

  5. Similarly, setting \( K(s)=e^{irs} \) yields an expression of the two-dimensional Fourier image of f(x) via the one-dimensional Fourier image of p(θ,s) with respect to the variable s, known in tomography as the slice-projection theorem. All these formulas have obvious generalizations for multivariable distributions.

  6. The δ-basis \( p_0,\ldots,p_n \) is defined by \( p_i(t_j)=\delta_{ij}=\begin{cases}1 & \text{for } i=j,\\ 0 & \text{otherwise.}\end{cases} \) These functions are also the restrictions to the set \( \{t_0,\ldots,t_n\} \) of certain Lagrange polynomials; we leave it to the reader to find out which polynomials are restricted in this way.

  7. Therefore, a reasonable way of defining periodic M- (EM-) systems \( \{f_0,\ldots,f_{2n}\} \) is as follows: the subsystems \( \{f_0,\ldots,f_{2m}\} \) must be periodic T- (ET-) systems on I for 0 ≤ m ≤ n.

  8. For nonlinear models, see Press et al. (1992) and references therein, paying special attention to the brilliant Levenberg-Marquardt method.

  9. Experienced readers familiar with projective geometry may eliminate the restriction \( \sum u_j\ne 0 \), as the point with \( \sum u_j=0 \) belongs (at infinity) to the projective closure of this line in \( \mathbb{R}P^2 \).

  10. Experienced readers familiar with projective geometry may eliminate this restriction by considering points at infinity of the projective closure as well.

  11. In the case where the \( Y_j \) have distinct variances \( \sigma_j^2 \), attaining the maximum probability over x for a fixed given y leads to the weighted least-squares formula \( \sum_j (A_j(x)-y_j)^2/\sigma_j^2=\min \).

  12. Researchers using the least-squares and related principles in practice should never forget that their validity is conditional. Those principles will yield practical results as long as the observations are free of systematic errors, independent, and normally distributed.

  13. Readers may prove this on their own or refer to Lemma 1 in E12.39 below.

  14. In this representation, the space of homogeneous polynomials \( \sum_{i+j=n} c_{ij}(\cos\theta)^i(\sin\theta)^j \) corresponds to the space of Fourier polynomials \( \sum_{0\le k\le n,\; k\equiv n \bmod 2} \left(a_k\cos k\theta+b_k\sin k\theta\right) \). (Why? Compare the behavior of the trigonometric polynomials to that of the Fourier polynomials with respect to the involution caused by the substitution \( \theta\mapsto\theta+\pi \).)

  15. Readers familiar with the considerably more general Stone-Weierstrass theorem can provide multiple further examples using many other infinite sets F.

  16. Readers may verify, for the distance function defined by a norm, that any line segments contained in the ball lie in its interior (perhaps except the ends) if and only if the triangle inequality is strict: dist(x,y) < dist(x,z) + dist(y,z) for noncollinear x − z and x − y.

  17. More advanced readers may find a different proof based on the theory of spherical functions.

  18. This means the nondegeneracy of the matrix \( \left( s_j^k t_j^l \right)_{k+l=0,\ldots,d;\; j=1,\ldots,m(d)} \) \( [(s_j,t_j)\in S_d] \). But we must draw the reader’s attention to the distinctive row scaling, which causes serious numerical trouble for large degrees (in practice, seven and larger). Commonly, least-squares polynomial fitting using routine matrix computations becomes impractical very quickly as both the number of independent variables and the degrees of the polynomials grow. One way to improve the situation consists in rescaling the rows by factors \( \rho^{-k-l} \) (however, this may bring the matrix close to a matrix of smaller rank). Also, shifting the arguments to a neighborhood of the origin can be useful. A different way is to use the Tikhonov-Phillips regularization. Finer numerical procedures require a sophisticated analysis using methods of algebraic geometry.

  19. Correlation coefficients \( \rho_{ij} \) form a dimensionless covariance matrix: \( \rho_{ij}:=\operatorname{cov}_{ij}\big/\sqrt{\operatorname{cov}_{ii}\cdot\operatorname{cov}_{jj}} \).

  20. Readers may give examples of differentiable functions f = x + o(x) with derivatives discontinuous at x = 0 for which the equation y = f(x) has several solutions for arbitrarily small y. The applicability of this lemma to our purposes is provided by the following condition: the number of derivatives is ≥ W > 1; therefore, the first derivative is continuous.

  21. Readers may prove that actually \( \operatorname{res}(f,g)=v_0^n w_0^m\prod_{i,j}(t_i-u_j) \), where \( t_i \), \( u_j \) are the roots of f(x), g(x), respectively (in an algebraic closure of the field of coefficients) (Van der Waerden 1971, 1967; Lang 1965).

  22. Readers may easily prove the implication “a straight line l(s,t) = 0 is contained in an algebraic curve p(s,t) = 0” ⇒ “p(s,t) is divisible by l(s,t)” by making an affine change of the independent variables so that this line becomes the coordinate axis t = 0. (Similar arguments allow an immediate multivariate generalization, replacing the straight line with an affine hyperplane.)

  23. Iterating this calculation provides values for all central moments of a normal (Gaussian) distribution: \( \left(\sqrt{2\pi}\,\sigma\right)^{-1}\int_{-\infty}^{\infty} y^{2p}e^{-y^2/2\sigma^2}\,dy=\sigma^{2p}(2p-1)!! \); also, \( \left(\sqrt{2\pi}\,\sigma\right)^{-1}\int_{0}^{\infty} y^{2p+1}e^{-y^2/2\sigma^2}\,dy=\sigma^{2p+1}(2\pi)^{-1/2}(2p)!! \) (p = 0,1,…; (−1)!! = 0!! = 1).

  24. The left (right) kernel of an m × n matrix A is the space of all row (resp. column) vectors v of dimension m (resp. n) such that vA = 0 (resp. Av = 0).

  25. Also, the left kernel may be calculated as the kernel of the adjoint operator \( A^* \) (which acts on row vectors as \( A^*v=vA \)), using the orthogonality of \( \ker A^* \) with \( \operatorname{im} A \). (Work out the details.)

  26. Readers not yet familiar with the compactness of bounded closed sets in finite-dimensional spaces may verify it using quite elementary means. Advanced readers probably know a converse theorem proved by the famous twentieth-century Hungarian mathematician Frigyes (Frederic) Riesz: a locally compact normed space is finite-dimensional. For a proof and further discussions, see Riesz and Sz.-Nagy (1972), Dieudonné (1960), Rudin (1973), and Banach (1932).
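The Gaussian moment formulas of note 23 can be sanity-checked numerically. The following is a minimal sketch, assuming a composite Simpson rule on a truncated interval; the function names, the value of `sigma`, and the cutoff of 12 standard deviations are illustrative choices, not taken from the chapter.

```python
import math

def double_factorial(k):
    """k!! with the convention (-1)!! = 0!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def simpson(func, lo, hi, steps=20000):
    """Composite Simpson rule; steps must be even."""
    h = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3

def gaussian_moment(power, sigma, lo, hi):
    """Integral of y^power times the N(0, sigma^2) density over [lo, hi]."""
    norm = math.sqrt(2 * math.pi) * sigma
    return simpson(
        lambda y: y ** power * math.exp(-y * y / (2 * sigma ** 2)) / norm,
        lo, hi)

sigma = 1.5            # assumed sample value
cutoff = 12 * sigma    # the tails beyond this point are negligible

rows = []
for p in range(4):
    even_numeric = gaussian_moment(2 * p, sigma, -cutoff, cutoff)
    even_formula = sigma ** (2 * p) * double_factorial(2 * p - 1)
    odd_numeric = gaussian_moment(2 * p + 1, sigma, 0.0, cutoff)
    odd_formula = (sigma ** (2 * p + 1) * double_factorial(2 * p)
                   / math.sqrt(2 * math.pi))
    rows.append((even_numeric, even_formula, odd_numeric, odd_formula))
    print(p, even_numeric, even_formula, odd_numeric, odd_formula)
```

For each p the numerical and closed-form values should agree to many digits; a mismatch would point to a slip in the double-factorial convention or in the normalization factor.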

References

  • Akhiezer, N.I.: Классическая проблема моментов и некоторые вопросы анализа, связанные с ней. “Физматгиз” Press, Moscow (1961). [English transl. The Classical Moment Problem and Some Related Questions in Analysis. Oliver and Boyd Press, Edinburgh/London (1965)]

  • Arnol’d, V.I.: Теорема Штурма и симплектическая геометрия. Функц. анализ и его прилож. 19(4), 1–10 (1985). [English transl. The Sturm theorem and symplectic geometry. Functional Anal. App. 19(4)]

  • Arnol’d, V.I.: Сто задач (One Hundred Problems). МФТИ Press, Moscow (1989). Russian

  • Arnol’d, V.I.: Topological Invariants of Plane Curves and Caustics. University Lecture Series, vol. 5. American Mathematical Society, Providence (1994)

  • Arnol’d, V.I.: Лекции об уравнениях с частными производными (Lectures on Partial Differential Equations). “Фазис” Press, Moscow (1997). [English transl. Lectures on Partial Differential Equations (Universitext). 1st edn. Springer (2004)]

  • Arnol’d, V.I.: Что такое математика (What is Mathematics)? МЦНМО Press, Moscow (2002). Russian

  • Banach, S.: Théorie des opérations linéaires. Paris (1932). [English transl. Theory of Linear Operations (Dover Books on Mathematics). Dover (2009)]

  • Berezin, I.S., Zhidkov, N.P.: Методы вычислений, ТТ. 1–2. “Физматгиз” Press, Moscow (1959–1960). [English transl. Computing Methods. Franklin Book Company (1965)]

  • Bernshtein, D.N.: Число корней системы уравнений. Функц. анализ и его прилож. 9(3), 1–4 (1975). [English transl. The number of roots of a system of equations. Functional Anal. App. 9(3), 183–185]

  • Briskin, M., Elichai, Y., Yomdin, Y.: How can singularity theory help in image processing. In: Gromov, M., Carbone, A. (eds.) Pattern Formation in Biology, Vision and Dynamics, pp. 392–423. World Scientific, Singapore (2000)

  • Brudnyi, A., Yomdin, Y.: Remez Sets (preprint)

  • Courant, R., Hilbert, D.: Methods of Mathematical Physics. Wiley, New York (1953–1962)

  • Cramér, H.: Mathematical Methods of Statistics. Princeton University Press, Princeton (1946)

  • Dieudonné, J.: Foundations of Modern Analysis. Academic, New York/London (1960)

  • Elichai, Y., Yomdin, Y.: Normal forms representation: A technology for image compression. SPIE. 1903, Image and Video Processing, 204–214 (1993)

  • Erdelyi, A. (ed.): Higher Transcendental Functions, vol. 1–3. McGraw-Hill, New York/Toronto/London (1953)

  • Gantmacher, F.R., Krein, M.G.: Осцилляционные матрицы и ядра, и малые колебания механических систем. “Гостехиздат” Press, Moscow-Leningrad (1950). [English transl. Oscillation Matrices and Kernels and Small Vibrations of Mechanical Systems. US Atomic Energy Commission, Washington (1961)]

  • Gelfand, I.M., Shilov, G.E., Vilenkin, N.Ya., Graev, N.I.: Обобщённые функции, ТТ. 1–5. “Наука” Press, Moscow (1959–1962). [English transl. Generalized Functions, vols. 1–5. Academic Press (1964)]

  • Haviv, D., Yomdin, Y.: Model based representation of surfaces (preprint)

  • Helgason, S.: The Radon Transform. Progress in Mathematics, vol. 5. Birkhäuser, Boston/Basel/Stuttgart (1980)

  • Helgason, S.: Groups and Geometric Analysis. Integral Geometry, Invariant Differential Operators, and Spherical Functions. Academic (Harcourt Brace Jovanovich), Orlando/San Diego/San Francisco/New York/London/Toronto/Montreal/Tokyo/São Paulo (1984)

  • Herman, G.T.: Image Reconstruction from Projections. The Fundamentals of Computerized Tomography. Academic, New York/London/Toronto/Sydney/San Francisco (1980)

  • Karlin, S., Studden, W.J.: Tchebycheff Systems: With Applications in Analysis and Statistics. Interscience Publishers, a division of Wiley, New York/London/Sydney (1966)

  • Khovanskii, A.G.: Малочлены. “Фазис” Press, Moscow (1997). [English transl. Fewnomials. Translations of Mathematical Monographs 88, AMS, Providence/Rhode Island (1991)]

  • Klein, F.: Vorlesungen über die Entwicklung der Mathematik im 19. Jahrhundert. Teil 1. Für den Druck bearbeitet von Courant, R., Neugebauer, O. Springer, Berlin (1926). [English transl. Development of Mathematics in the Nineteenth Century (Lie Groups Series, No 9). Applied Mathematics Group publishing (1979)]

  • Krein, M.G., Nudelman, A.A.: Проблема моментов Маркова и экстремальные задачи. “Наука” Press, Moscow (1973). [English transl. The Markov Moment Problem and Extremal Problems, Translations of Math. Monographs, V.50. American Mathematical Society, Providence/Rhode Island (1977)]

  • Lang, S.: Algebra. Addison-Wesley, Reading/London/Amsterdam/Don Mills/Sydney/Tokyo (1965)

  • Marcus, M., Minc, H.: A Survey of Matrix Theory and Matrix Inequalities. Allyn and Bacon, Boston (1964)

  • McLachlan, N.W.: Theory and Application of Mathieu Functions. Dover, New York (1964)

  • Polya, G., Szegö, G.: Aufgaben und Lehrsätze aus der Analysis. Springer, Göttingen/Heidelberg/New York (1964). [English transl. Problems and Theorems in Analysis I, II. Springer, Reprint edition (1998)]

  • Prasolov, V.V., Soloviev, Y.P.: Эллиптические функции и алгебраические уравнения (Elliptic Functions and Algebraic Equations). “Факториал” Press, Moscow (1997). Russian

  • Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes in Fortran 77. The Art of Scientific Computing, 2nd edn. Cambridge University Press, Cambridge/New York/Melbourne (1992)

  • Riesz, F., Sz.-Nagy, B.: Leçons d’analyse fonctionnelle. Sixième édition, Budapest (1972). [English transl. Functional Analysis. Dover (1990)]

  • Rudin, W.: Functional Analysis. McGraw-Hill, New York/St. Louis/San Francisco/Düsseldorf/Johannesburg/London/Mexico/Montreal/New Delhi/Panama/Rio de Janeiro/Singapore/Sydney/Toronto (1973)

  • Smith, K.T., Solmon, D.C., Wagner, S.L.: Practical and mathematical aspects of the problem of reconstructing objects from radiographs. Bull. AMS 83(6), 1227–1270 (1977)

  • Stoker, J.J.: Nonlinear Vibrations in Mechanical and Electrical Systems. Interscience, New York (1950)

  • Szegö, G.: Orthogonal Polynomials, 4th edn. American Mathematical Society, Providence/Rhode Island (1981)

  • Van der Waerden, B.L.: Mathematische Statistik. Springer, Berlin/Göttingen/Heidelberg (1957). [English transl. Mathematical Statistics. Springer (1969)]

  • Van der Waerden, B.L.: Algebra II. Springer, Berlin/Heidelberg/New York (1967)

  • Van der Waerden, B.L.: Algebra I. Springer, Berlin/Heidelberg/New York (1971)

  • Vilenkin, N.Ya.: Специальные функции и теория представлений групп. “Наука” Press, Moscow (1965). [English transl. Special Functions and the Theory of Group Representations. Translations of Mathematical Monographs 22, American Mathematical Society, Providence/Rhode Island (1968)]

  • Walker, R.J.: Algebraic Curves. Princeton University Press, Princeton/New Jersey (1950)

  • Wiener, Z., Yomdin, Y.: From formal numerical solutions of elliptic PDE's to the true ones. Math. Comput. 69(229), 197–235 (2000)

  • Yomdin, Y.: Discrete Remez inequality. Isr. J. of Math. (submitted)

Copyright information

© 2013 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Roytvarf, A.A. (2013). Least Squares and Chebyshev Systems. In: Thinking in Problems. Birkhäuser, Boston. https://doi.org/10.1007/978-0-8176-8406-8_12
