Least Squares and Chebyshev Systems

Roytvarf, Alexander A.

doi:10.1007/978-0-8176-8406-8_12

Abstract

As readers know, nonzero polynomials of degree at most n, in other words nonzero linear combinations of the n + 1 monomials 1,…, tⁿ, may have at most n real zeros. A far-reaching generalization of this fact gives rise to the fundamental concept of Chebyshev systems, or T-systems for short. These systems are defined as follows. For a set (or system) of functions F = {f₀,…}, a linear combination of a finite number of elements, f = ∑c_i f_i, is called a polynomial on F (considered nonzero when ∃i: c_i ≠ 0). A system of n + 1 functions F = {f₀,…,f_n} on an interval (or a half-interval, or a non-one-point segment) I ⊆ ℝ is referred to as a T-system when nonzero polynomials on F may have at most n distinct zeros in I (zeros of extensions outside I are not counted). The basic ideas of the theory of T-systems may be understood by readers with relatively limited experience. In this chapter we focus both on these ideas and on some nice applications of this theory, such as estimation of numbers of zeros and critical points of functions (in analysis and geometry), a real-life problem in tomography, interpolation theory, approximation theory, and least squares.
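The definition above can be probed numerically. The following is a minimal sketch, not taken from the chapter, that relies on a standard reformulation (an assumption here, since the abstract does not state it): a nonzero polynomial on F = {f₀,…,f_n} vanishes at n + 1 distinct points t₀,…,t_n exactly when the matrix (f_i(t_j)) is singular. The helper names `det` and `singular_node_sets` and the sample grid are hypothetical. A finite grid can only refute the T-system property, never prove it.

```python
from fractions import Fraction
from itertools import combinations

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def singular_node_sets(functions, grid):
    """Return the node tuples from `grid` at which det(f_i(t_j)) vanishes."""
    k = len(functions)
    return [nodes for nodes in combinations(grid, k)
            if det([[f(t) for t in nodes] for f in functions]) == 0]

# Sample points in [-1, 1]; the grid itself is an arbitrary choice.
grid = [Fraction(k, 4) for k in range(-4, 5)]
monomials = [lambda t: 1, lambda t: t, lambda t: t * t]  # {1, t, t^2}
even_pair = [lambda t: 1, lambda t: t * t]               # {1, t^2}

bad_monomials = singular_node_sets(monomials, grid)
bad_even = singular_node_sets(even_pair, grid)
print(bad_monomials)  # no singular node sets found for the monomials
print(bad_even)       # symmetric pairs (-a, a) expose the failure of {1, t^2}
```

On this grid the monomials produce no singular node set, which is consistent with their being a T-system, while {1, t²} fails on [−1, 1] because t² − a² has the two zeros ±a although n = 1.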

We provide a comprehensive discussion of the linear least-squares technique and present its analytic, algebraic, and geometric versions. We examine its links to linear algebra and combinatorial analysis (in this connection we discuss, from a different point of view, some problems previously considered in the chapters “A Combinatorial Algorithm in Multiexponential Analysis” and “Convexity and Related Classic Inequalities”). We also discuss some features of the least-squares solutions (e.g., passing through determinate points, asymptotic properties). In addition, we outline the connections between least squares and probability theory and statistics. Finally, we discuss real-life applications to polynomial interpolation (such as finding the best polynomial fit for two-dimensional surfaces) and to signal processing in nuclear magnetic resonance (NMR) technology (such as finding covariance matrices for maximum-likelihood estimation of parameters).
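As a small illustration of the algebraic version of linear least squares, here is a sketch under stated assumptions rather than the chapter's own procedure: it fits a polynomial by solving the normal equations (AᵀA)c = Aᵀy exactly. The function names `solve` and `least_squares_poly` and the sample data are hypothetical, and the Gram matrix is assumed nonsingular (more distinct nodes than the degree).

```python
from fractions import Fraction

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with exact arithmetic.

    Assumes the matrix `a` is square and nonsingular.
    """
    n = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)]
           for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def least_squares_poly(ts, ys, degree):
    """Coefficients c_0..c_degree minimising sum_j (p(t_j) - y_j)^2."""
    m = degree + 1
    # Design matrix A with rows (1, t_j, t_j^2, ...).
    design = [[Fraction(t) ** k for k in range(m)] for t in ts]
    # Normal equations: (A^T A) c = A^T y.
    gram = [[sum(row[i] * row[j] for row in design) for j in range(m)]
            for i in range(m)]
    rhs = [sum(row[i] * y for row, y in zip(design, ys)) for i in range(m)]
    return solve(gram, rhs)

ts = [0, 1, 2, 3]   # hypothetical sample nodes
ys = [1, 3, 2, 5]   # hypothetical observations
coeffs = least_squares_poly(ts, ys, 1)
print(coeffs)  # [Fraction(11, 10), Fraction(11, 10)]
```

For these data the fitted line is y = 11/10 + (11/10)t. It passes through the centroid (3/2, 11/4) of the data, a well-known property of a least-squares line with an intercept term.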

Notes

1.
The problem of polynomial interpolation (or polynomial fitting) appears in numerous applications, both theoretical and practical, including high-order numerical methods, signal and image processing, and a variety of other applications. As a very small sample, see Wiener and Yomdin (2000) and Yomdin and Zahavi (2006) for an analysis of the stability of polynomial fitting and applications in high-order numerical solutions of linear and nonlinear partial differential equations (PDEs), Elichai and Yomdin (1993), Briskin et al. (2000), and Haviv and Yomdin (pers. communication: Model based representation of surfaces) for high-order polynomial fitting in image processing, and Yomdin (submitted) and Brudnyi and Yomdin (pers. communication: Remez Sets) for a geometric analysis of “fitting sets,” as appear in Problem P12.39***. (We thank Yosef Yomdin for kindly giving us the details.)
2.
For n > 0, this sign is unchanged regardless of the continuity of the nth derivatives. Readers may prove this with the help of the lemma from section E12.11 below, taking into account the continuity of differentiable functions.
3.
This is a multivariate version of the Rolle theorem. Warning: a single inequality W_{n+1}(f₀,…,f_n) ≠ 0, considered separately from the whole family of inequalities W_{m+1}(f₀,…,f_m) ≠ 0 for 0 ≤ m ≤ n, does not guarantee the Chebyshev property of f₀,…,f_n; provide counterexamples.
4.
For n = 0, the continuity of f ₀ needs to be assumed also because differentiability of the zero order commonly means nothing. Of course, it might be defined as continuity since representability of f(x + h) by a polynomial of degree n in h with accuracy o(|h|ⁿ) is equivalent to continuity at x for n = 0 and the nth-order differentiability at x for n > 0. However, the definition of continuity as the zero-order differentiability is not commonly accepted.
5.
Similarly, setting K(s) = e^{−irs} yields an expression of the two-dimensional Fourier image of f(x) via the one-dimensional Fourier image of p(θ,s) with respect to the variable s, known in tomography as the slice-projection theorem. All these formulas have obvious generalizations for multivariable distributions.
6.
The δ-basis p₀,…,p_n is defined by \( p_i(t_j)=\delta_{ij}=\begin{cases}1 & \text{for } i=j,\\ 0 & \text{otherwise.}\end{cases} \) These functions also provide restrictions to the set {t₀,…,t_n} of certain Lagrange polynomials; we leave it to the reader to find out which polynomials are restricted in this way.
7.
Therefore, a reasonable way of defining periodic M- (EM-) systems {f₀,…,f_{2n}} is as follows: the subsystems {f₀,…,f_{2m}} must be periodic T- (ET-) systems on I for 0 ≤ m ≤ n.
8.
For nonlinear models, see Press et al. (1992) and references therein, paying special attention to the brilliant Levenberg-Marquardt method.
9.
Experienced readers familiar with projective geometry may eliminate the restriction ∑u_j ≠ 0, as the point with ∑u_j = 0 belongs (at infinity) to the projective closure of this line in ℝP².
10.
Experienced readers familiar with projective geometry may eliminate this restriction considering points at infinity of the projective closure as well.
11.
In the case where the Y_j have distinct variances σ_j², attaining the maximum probability over x for a fixed given y corresponds to the weighted least-squares formula ∑(A_j(x) − y_j)²/σ_j² = min.
12.
Researchers using the least-squares and related principles in practice should never forget about their conditionality. Those principles will yield practical results as long as the observations are free of systematic errors, independent, and normally distributed.
13.
Readers may prove this on their own or refer to Lemma 1 in E12.39 below.
14.
In this representation, the space of homogeneous polynomials \( \sum_{i+j=n} c_{ij}(\cos\theta)^i(\sin\theta)^j \) corresponds to the space of Fourier polynomials \( \sum_{0\leqslant k\leqslant n,\; k\equiv n \bmod 2} \left( a_k\cos k\theta + b_k\sin k\theta \right) \). (Why? Compare the behavior of the trigonometric polynomials to that of the Fourier polynomials with respect to the involution caused by the substitution θ \( \mapsto \) θ + π.)
15.
Readers familiar with a considerably more general Stone-Weierstrass theorem can provide multiple further examples using many other infinite sets F.
16.
Readers may verify, for the distance function defined by a norm, that any line segment contained in the ball lies in its interior (except perhaps for its ends) if and only if the triangle inequality is strict: dist(x,y) < dist(x,z) + dist(y,z) for noncollinear x − z and x − y.
17.
More advanced readers may find a different proof based on the theory of spherical functions.
18.
This means the nondegeneracy of the matrix \( \left( s_j^k t_j^l \right)_{\substack{k+l=0,\ldots,d \\ j=1,\ldots,m(d)}} \), where (s_j,t_j) ∈ S_d. But we must draw the reader’s attention to the distinctive row scaling, which causes, for large degrees (practically, seven and larger), serious trouble for numerical computer processing. Commonly, least-squares polynomial fitting using routine matrix computations becomes impractical very quickly as both the number of independent variables and the degrees of the polynomials grow. One way to improve the situation consists in rescaling the rows by factors ρ^{−k−l} (however, this may bring the matrix close to a matrix of smaller rank). Also, shifting the arguments to a neighborhood of the origin can be useful. A different way is to use the Tikhonov-Phillips regularization. Finer numerical procedures require a sophisticated analysis using methods of algebraic geometry.
19.
Correlation coefficients ρ_ij form a dimensionless covariance matrix: \( \rho_{ij} := \operatorname{cov}_{ij} \big/ \sqrt{\operatorname{cov}_{ii}\cdot\operatorname{cov}_{jj}} \).
20.
Readers may give examples of differentiable functions f = x + o(x) with derivatives discontinuous at x = 0, when the equation y = f(x) has a multiple solution for arbitrarily small y. The applicability of this lemma to our purposes is provided by the following condition: the number of derivatives is ≥ W > 1; therefore, the first derivative is continuous.
21.
Readers may prove that actually \( \operatorname{res}(f,g)=v_0^n w_0^m\prod_{i,j}(t_i-u_j) \), where t_i, u_j are the roots of f(x), g(x), respectively (in an algebraic closure of the field of coefficients) (Van der Waerden 1971, 1967; Lang 1965).
22.
Readers may easily prove the implication “a straight line l(s,t) = 0 is contained in an algebraic curve p(s,t) = 0” ⇒ “p(s,t) is divisible by l(s,t),” by making an affine change of the independent variables so that this line becomes the coordinate axis t = 0. (Similar arguments allow an immediate multivariate generalization, replacing the straight line with an affine hyperplane.)
23.
Iterating this calculation provides values for all central moments of a normal (Gaussian) distribution: \( \left(\sqrt{2\pi}\,\sigma\right)^{-1}\int_{-\infty}^{\infty} y^{2p} e^{-y^2/(2\sigma^2)}\,dy = \sigma^{2p}(2p-1)!! \); also, \( \left(\sqrt{2\pi}\,\sigma\right)^{-1}\int_0^{\infty} y^{2p+1} e^{-y^2/(2\sigma^2)}\,dy = \sigma^{2p+1}(2\pi)^{-1/2}(2p)!! \) (p = 0,1,…; (−1)!! = 0!! = 1).
24.
The left (right) kernel of an m × n matrix A is the space of all row (resp. column) vectors v of dimension m (resp. n) such that vA = 0 (resp. Av = 0).
25.
Also, the left kernel may be calculated as the kernel of the adjoint operator A^* (which acts on row vectors as A^*v = vA), using the orthogonality of ker A^* with im A. (Work out the details.)
26.
Readers not yet familiar with the compactness of bounded closed sets in finite-dimensional spaces may verify it using quite elementary means. Advanced readers probably know an inverse theorem proved by the famous twentieth-century Hungarian mathematician Frigyes (Frederic) Riesz: a locally compact normed space is finite-dimensional. For a proof and further discussions, see Riesz and Sz.-Nagy (1972), Dieudonné (1960), Rudin (1973), and Banach (1932).
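Notes 11 and 18 mention weighted least squares and regularization. The following sketch combines the two under explicit assumptions: the weights are w_j = 1/σ_j² as in Note 11, and the regularizer is the simplest Tikhonov-type penalty λ∑c_k², which is only one possible reading of the Tikhonov-Phillips regularization named in Note 18. The function names, the parameter `lam`, and the sample data are hypothetical.

```python
from fractions import Fraction

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with exact arithmetic.

    Assumes the matrix `a` is square and nonsingular.
    """
    n = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)]
           for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def weighted_ridge_poly(ts, ys, sigmas, degree, lam=0):
    """Minimise sum_j ((p(t_j) - y_j) / sigma_j)^2 + lam * sum_k c_k^2."""
    m = degree + 1
    weights = [1 / Fraction(s) ** 2 for s in sigmas]   # w_j = 1 / sigma_j^2
    design = [[Fraction(t) ** k for k in range(m)] for t in ts]
    # Regularised weighted normal equations: (A^T W A + lam I) c = A^T W y.
    gram = [[sum(w * row[i] * row[j] for w, row in zip(weights, design))
             + (Fraction(lam) if i == j else 0)
             for j in range(m)] for i in range(m)]
    rhs = [sum(w * row[i] * y for w, row, y in zip(weights, design, ys))
           for i in range(m)]
    return solve(gram, rhs)

# A constant fit with unequal variances gives the variance-weighted mean.
weighted_mean = weighted_ridge_poly([0, 1], [1, 3], [1, 2], 0)
# A positive penalty shrinks the plain mean 2 toward 0.
shrunk_mean = weighted_ridge_poly([0, 1], [1, 3], [1, 1], 0, lam=2)
print(weighted_mean, shrunk_mean)  # [Fraction(7, 5)] [Fraction(1, 1)]
```

With equal variances and `lam=0` the function reduces to ordinary least squares. Exact rational arithmetic sidesteps the rounding trouble described in Note 18 for small examples, but it does not scale to large problems.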

References

Akhiezer, N.I.: Классическая проблема моментов и некоторые вопросы анализа, связанные с ней. “Физматгиз” Press, Moscow (1961). [English transl. The Classical Moment Problem and Some Related Questions in Analysis. Oliver and Boyd Press, Edinburgh/London (1965)]
Arnol’d, V.I.: Теорема Штурма и симплектическая геометрия. Функц. анализ и его прилож. 19(4), 1–10 (1985). [English transl. The Sturm theorem and symplectic geometry. Functional Anal. App. 19(4)]
Arnol’d, V.I.: Сто задач (One Hundred Problems). МФТИ Press, Moscow (1989). Russian
Arnol’d, V.I.: Topological Invariants of Plane Curves and Caustics. University Lecture Series, vol. 5. American Mathematical Society, Providence (1994)
Arnol’d, V.I.: Лекции об уравнениях с частными производными (Lectures on Partial Differential Equations). “Фазис” Press, Moscow (1997). [English transl. Lectures on Partial Differential Equations (Universitext). 1st edn. Springer (2004)]
Arnol’d, V.I.: Что такое математика (What is Mathematics)? МЦНМО Press, Moscow (2002). Russian
Banach, S.: Théorie des opérations linéaires. Paris (1932). [English transl. Theory of Linear Operations (Dover Books on Mathematics). Dover (2009)]
Berezin, I.S., Zhidkov, N.P.: Методы вычислений, ТТ. 1–2. “Физматгиз” Press, Moscow (1959–1960). [English transl. Computing Methods. Franklin Book Company (1965)]
Bernshtein, D.N.: Число корней системы уравнений. Функц. анализ и его прилож. 9(3), 1–4 (1975). [English transl. The number of roots of a system of equations. Functional Anal. App. 9(3), 183–185]
Briskin, M., Elichai, Y., Yomdin, Y.: How can singularity theory help in image processing. In: Gromov, M., Carbone, A. (eds.) Pattern Formation in Biology, Vision and Dynamics, pp. 392–423. World Scientific, Singapore (2000)
Brudnyi, A., Yomdin, Y.: Remez Sets (preprint)
Courant, R., Hilbert, D.: Methods of Mathematical Physics. Wiley, New York (1953–1962)
Cramér, H.: Mathematical Methods of Statistics. Princeton University Press, Princeton (1946)
Dieudonné, J.: Foundations of Modern Analysis. Academic, New York/London (1960)
Elichai, Y., Yomdin, Y.: Normal forms representation: A technology for image compression. SPIE. 1903, Image and Video Processing, 204–214 (1993)
Erdelyi, A. (ed.): Higher Transcendental Functions, vol. 1–3. McGraw-Hill, New York/Toronto/London (1953)
Gantmacher, F.R., Krein, M.G.: Осцилляционные матрицы и ядра, и малые колебания механических систем. “Гостехиздат” Press, Moscow-Leningrad (1950). [English transl. Oscillation Matrices and Kernels and Small Vibrations of Mechanical Systems. US Atomic Energy Commission, Washington (1961)]
Gelfand, I.M., Shilov, G.E., Vilenkin, N.Ya., Graev, N.I.: Обобщённые функции, ТТ. 1–5. “Наука” Press, Moscow (1959–1962). [English transl. Generalized Functions. V.’s 1–5. Academic Press (1964)]
Haviv, D., Yomdin, Y.: Model based representation of surfaces (preprint)
Helgason, S.: The Radon Transform. Progress in Mathematics, vol. 5. Birkhäuser, Boston/Basel/Stuttgart (1980)
Helgason, S.: Groups and Geometric Analysis. Integral Geometry, Invariant Differential Operators, and Spherical Functions. Academic (Harcourt Brace Jovanovich), Orlando/San Diego/San Francisco/New York/London/Toronto/Montreal/Tokyo/São Paulo (1984)
Herman, G.T.: Image Reconstruction from Projections. The Fundamentals of Computerized Tomography. Academic, New York/London/Toronto/Sydney/San Francisco (1980)
Karlin, S., Studden, W.J.: Tchebycheff Systems: With Applications in Analysis and Statistics. Interscience Publishers, a division of Wiley, New York/London/Sydney (1966)
Khovanskii, A.G.: Малочлены. “Фазис” Press, Moscow (1997). [English transl. Fewnomials. Translations of Mathematical Monographs 88, AMS, Providence/Rhode Island (1991)]
Klein, F.: Vorlesungen über die Entwicklung der Mathematik im 19. Jahrhundert. Teil 1. Für den Druck bearbeitet von Courant, R., Neugebauer, O. Springer, Berlin (1926). [English transl. Development of Mathematics in the Nineteenth Century (Lie Groups Series, No 9). Applied Mathematics Group publishing (1979)]
Krein, M.G., Nudelman, A.A.: Проблема моментов Маркова и экстремальные задачи. “Наука” Press, Moscow (1973). [English transl. The Markov Moment Problem and Extremal Problems, Translations of Math. Monographs, V.50. American Mathematical Society, Providence/Rhode Island (1977)]
Lang, S.: Algebra. Addison-Wesley, Reading/London/Amsterdam/Don Mills/Sydney/Tokyo (1965)
Marcus, M., Minc, H.: A Survey of Matrix Theory and Matrix Inequalities. Allyn and Bacon, Boston (1964)
McLachlan, N.W.: Theory and Application of Mathieu Functions. Dover, New York (1964)
Polya, G., Szegö, G.: Aufgaben und Lehrsätze aus der Analysis. Springer, Göttingen/Heidelberg/New York (1964). [English transl. Problems and Theorems in Analysis I, II. Springer, Reprint edition (1998)]
Prasolov, V.V., Soloviev, Y.P.: Эллиптические функции и алгебраические уравнения (Elliptic Functions and Algebraic Equations). “Факториал” Press, Moscow (1997). Russian
Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes in Fortran 77. The Art of Scientific Computing, 2nd edn. Cambridge University Press, Cambridge/New York/Melbourne (1992)
Riesz, F., Sz.-Nagy, B.: Leçons d’analyse fonctionnelle. Sixième édition, Budapest (1972). [English transl. Functional Analysis. Dover (1990)]
Rudin, W.: Functional Analysis. McGraw-Hill, New York/St. Louis/San Francisco/Düsseldorf/Johannesburg/London/Mexico/Montreal/New Delhi/Panama/Rio de Janeiro/Singapore/Sydney/Toronto (1973)
Smith, K.T., Solmon, D.C., Wagner, S.L.: Practical and mathematical aspects of the problem of reconstructing objects from radiographs. Bull. AMS 83(6), 1227–1270 (1977)
Stoker, J.J.: Nonlinear Vibrations in Mechanical and Electrical Systems. Interscience, New York (1950)
Szegö, G.: Orthogonal Polynomials, 4th edn. American Mathematical Society, Providence/Rhode Island (1981)
Van der Waerden, B.L.: Mathematische Statistik. Springer, Berlin/Göttingen/Heidelberg (1957). [English transl. Mathematical Statistics. Springer (1969)]
Van der Waerden, B.L.: Algebra II. Springer, Berlin/Heidelberg/New York (1967)
Van der Waerden, B.L.: Algebra I. Springer, Berlin/Heidelberg/New York (1971)
Vilenkin, N.Ya.: Специальные функции и теория представлений групп. “Наука” Press, Moscow (1965). [English transl. Special Functions and the Theory of Group Representations. Translations of Mathematical Monographs 22, American Mathematical Society, Providence/Rhode Island (1968)]
Walker, R.J.: Algebraic Curves. Princeton University Press, Princeton/New Jersey (1950)
Wiener, Z., Yomdin, Y.: From formal numerical solutions of elliptic PDE's to the true ones. Math. Comput. 69(229), 197–235 (2000)
Yomdin, Y.: Discrete Remez inequality. Isr. J. of Math. (submitted)

Author information

Authors and Affiliations

Rishon LeZion, Israel
Alexander A. Roytvarf

About this chapter

Cite this chapter

Roytvarf, A.A. (2013). Least Squares and Chebyshev Systems. In: Thinking in Problems. Birkhäuser, Boston. https://doi.org/10.1007/978-0-8176-8406-8_12

DOI: https://doi.org/10.1007/978-0-8176-8406-8_12
Published: 15 November 2012
Publisher Name: Birkhäuser, Boston
Print ISBN: 978-0-8176-8405-1
Online ISBN: 978-0-8176-8406-8
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
