## Abstract

This chapter contains descriptions of 14 great theorems published in the Annals of Mathematics in 2009.

## 9.1 On De Giorgi’s Conjecture in Dimension at Most 8

**The Derivative and Differential Equations**

One of the central concepts in the whole of mathematics is the *derivative*, which for a function \(u:{\mathbb R}\rightarrow {\mathbb R}\) is defined as \(u'(x)=\lim \limits _{\varepsilon \rightarrow 0} (u(x+\varepsilon )-u(x))/\varepsilon \), provided that the limit exists. There are numerous useful rules which help us to calculate derivatives for various functions, for example \((u+v)'(x)=u'(x)+v'(x)\), \((uv)'(x)=u'(x)v(x)+u(x)v'(x)\), \(u'(x)=f'(g(x))g'(x)\) if \(u=f(g(x))\), etc. We also have tables of derivatives for various functions, e.g. \((x^n)'=nx^{n-1}\) for any \(n\ne 0\), \((\ln x)'=\frac{1}{x}, \, x>0\), etc.

If we know the derivative of a function, we can recover it (up to an additive constant) using integration: if \(u'(x)=f(x)\) then \(u(x)=\int f(x)dx + C\) for some \(C\in {\mathbb R}\). For example, \(u'(x)=2x\) implies that \(u(x)=\int 2x dx + C = x^2+C\). An equation of the form \(u'(x)=f(x)\) is the simplest example of a *differential equation*, that is, an equation involving derivatives. Differential equations arise in numerous applications of mathematics, such as physics, biology, and economics, and they are extremely important in pure mathematics as well.

**A Slightly More Complicated Differential Equation**

Now consider the differential equation \(u'(x)=1-u^2(x)\). It can be solved by *separation of variables*: writing \(\frac{du}{1-u^2}=dx\) and integrating both sides gives \(\frac{1}{2}\ln \frac{1+u}{1-u}=x+C\) (for \(|u|<1\)), and this equation can be solved for *u* in terms of the exponential function \(e^x\), where *e* is the base of the natural logarithm. This results in \(u(x)= \tanh (x+C)\), \(C\in {\mathbb R}\), where \(\tanh \) is the function defined by \(\tanh (x):=\frac{e^x-e^{-x}}{e^x+e^{-x}}\).
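The hyperbolic tangent satisfies \(\tanh '=1-\tanh ^2\), so the solution \(u(x)=\tanh (x+C)\) is easy to sanity-check numerically. A minimal sketch in plain Python (the constant \(C=0.7\) is an arbitrary choice of ours), comparing a central-difference approximation of \(u'\) with \(1-u^2\):

```python
import math

def u(x, C=0.7):
    # candidate solution u(x) = tanh(x + C)
    return math.tanh(x + C)

h = 1e-6
for x in [-2.0, -0.5, 0.0, 1.3, 3.0]:
    # central-difference approximation of u'(x)
    du = (u(x + h) - u(x - h)) / (2 * h)
    assert abs(du - (1 - u(x) ** 2)) < 1e-8, (x, du)
print("u(x) = tanh(x + C) satisfies u' = 1 - u^2")
```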

**Differential Equations Involving the Second Derivative**

A differential equation may also include the second derivative \(u''(x)=(u'(x))'\). For example, if \(u(x)=x^3\) then \(u'(x)=3x^2\), and \(u''(x)=(3x^2)'=6x\). Another notation for the second derivative is \(\frac{d^2u}{dx^2}\), so the differential equation \(\frac{d^2u}{dx^2}=6x\) has a solution \(u(x)=x^3\) (and many other solutions as well).

Now consider the differential equation

\(u''=u^3-u. \qquad (9.2)\)

Because the right-hand side of (9.2) is a polynomial in *u*, one could naturally guess that the first derivative \(u'\) might also be written as \(P(u)\) for some polynomial *P*. If it were a linear polynomial, that is, \(u'=Au+B\) for some \(A, B \in {\mathbb R}\), then \(u''=(Au+B)'=Au'=A(Au+B)\), again a linear polynomial, not a cubic one as in (9.2).

Next, let us try a quadratic polynomial *P* , that is, assume that \(u'=Au^2+Bu+C\), where \(A,B, C \in {\mathbb R}\) are some unknown coefficients to be found from (9.2)—this is called the *method of undetermined coefficients*. The second derivative is \(u''=(Au^2+Bu+C)'=A(u^2)'+Bu'=A(2uu')+Bu'=(2Au+B)u'=(2Au+B)(Au^2+Bu+C)=(2A^2)u^3+(3AB)u^2+(2AC+B^2)u+BC\). By (9.2), this expression should be equal to \(u^3-u\), which leads us to the system of equations \(2A^2=1\), \(3AB=0\), \(2AC+B^2=-1\), \(BC=0\), which has 2 solutions, one of which is \(A=-1/\sqrt{2}\), \(B=0\), \(C=1/\sqrt{2}\), in which case \(u'=Au^2+Bu+C=(1/\sqrt{2})(1-u^2)\).
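Integrating \(u'=(1/\sqrt{2})(1-u^2)\) by separation of variables, as before, gives \(u(x)=\tanh \left( \frac{x+C}{\sqrt{2}}\right) \). A quick finite-difference check (the constant *C* is an arbitrary choice of ours) that this function satisfies both the first-order equation and \(u''=u^3-u\):

```python
import math

SQRT2 = math.sqrt(2)

def u(x, C=0.3):
    # candidate solution found by the method of undetermined coefficients
    return math.tanh((x + C) / SQRT2)

h = 1e-4
for x in [-3.0, -1.0, 0.0, 0.5, 2.0]:
    du  = (u(x + h) - u(x - h)) / (2 * h)            # ~ u'(x)
    d2u = (u(x + h) - 2 * u(x) + u(x - h)) / h ** 2  # ~ u''(x)
    assert abs(du - (1 - u(x) ** 2) / SQRT2) < 1e-6
    assert abs(d2u - (u(x) ** 3 - u(x))) < 1e-4
print("tanh((x + C)/sqrt(2)) solves u'' = u^3 - u")
```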

**Differential Equations with Functions of Two Variables**

So far we have considered functions of one variable, but a function may also depend on *several* variables, such as \(u=u(x, y)\). The *partial* derivative \(\frac{\partial u}{\partial x}\) for such a function is the derivative with respect to *x* if *y* is treated as a constant. The derivative \(\frac{\partial u}{\partial y}\) with respect to *y* is defined similarly, the second-order derivative \(\frac{\partial ^2 u}{\partial x^2}\) is \(\frac{\partial }{\partial x}\left( \frac{\partial u}{\partial x}\right) \), etc. For example, if \(u=x^3y\), then \(\frac{\partial u}{\partial y}=x^3\), \(\frac{\partial u}{\partial x}=3x^2y\), \(\frac{\partial ^2 u}{\partial x^2}=6xy\), and so on. Would you be able to find at least one non-trivial solution of an equation involving second partial derivatives, like

\(\frac{\partial ^2 u}{\partial x^2}+\frac{\partial ^2 u}{\partial y^2}=u^3-u? \qquad (9.4)\)

One natural idea is to look for a solution *u* which depends on *x* and *y* in a linear way, so that \(u(x, y)=g(Ax+By+C)\) for some \(A,B, C \in {\mathbb R}\) and function \(g:{\mathbb R} \rightarrow {\mathbb R}\). In this case, \(\frac{\partial u}{\partial x} = g'(Ax+By+C)\cdot \frac{\partial (Ax+By+C)}{\partial x}=Ag'(Ax+By+C)\), and therefore \(\frac{\partial ^2 u}{\partial x^2} = A g''(Ax+By+C)\cdot \frac{\partial (Ax+By+C)}{\partial x}=A^2g''(Ax+By+C)\). Similarly, \(\frac{\partial ^2 u}{\partial y^2} =B^2g''(Ax+By+C)\). Substituting this back into the equation and defining \(z=Ax+By+C\), we get \((A^2+B^2)g''(z)=g^3(z)-g(z)\). If \(A^2+B^2=1\), this is exactly the Eq. (9.2) above, which has solutions of the form \(g(z)= \tanh \left( \frac{z+C}{\sqrt{2}}\right) \). Hence, for every \(C\in {\mathbb R}\), and every *A* and *B* such that \(A^2+B^2=1\), the function \( u(x, y) = \tanh \left( \frac{Ax+By+C}{\sqrt{2}}\right) \) is a solution to (9.4). Note that if we choose \(B>0\), this solution satisfies the conditions (i) \(|u(x, y)|<1\) and (ii) \(\frac{\partial u}{\partial y}>0\) for every \(x, y\in {\mathbb R}\), and also has a special geometric structure: if we fix any \(\lambda \in (-1,1)\), the set of points (*x*, *y*) such that \(u(x, y)=\lambda \) (such sets are called the *level sets*) forms a line \(Ax+By+C=z\), where *z* is such that \(\tanh \left( \frac{z}{\sqrt{2}}\right) =\lambda \).
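The two-variable solution can be verified the same way. A sketch with an arbitrarily chosen \(A=0.6\), \(B=0.8\) (so that \(A^2+B^2=1\)), approximating the second partial derivatives by finite differences:

```python
import math

A, B, C = 0.6, 0.8, 0.25   # arbitrary choice with A^2 + B^2 = 1

def u(x, y):
    return math.tanh((A * x + B * y + C) / math.sqrt(2))

h = 1e-4
for (x, y) in [(-1.0, 2.0), (0.0, 0.0), (1.5, -0.7)]:
    # second-order central differences for u_xx and u_yy
    uxx = (u(x + h, y) - 2 * u(x, y) + u(x - h, y)) / h ** 2
    uyy = (u(x, y + h) - 2 * u(x, y) + u(x, y - h)) / h ** 2
    assert abs(uxx + uyy - (u(x, y) ** 3 - u(x, y))) < 1e-4
print("u(x, y) solves u_xx + u_yy = u^3 - u")
```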

**Differential Equations with Functions of** *n* **Variables**

*any*number of variables, that is,

*hyperplane*. However, guessing one family of solutions to (9.5) does not mean finding

*all*the solutions.

**The De Giorgi Conjecture**

In 1978, De Giorgi [111] conjectured that, for \(n\le 8\), *every* solution to (9.5) satisfying (i) and (ii) on the whole of \({\mathbb R}^n\) has the property that all level sets are hyperplanes, and therefore is given by (9.6). By 2009, the conjecture had been proved only for \(n=2,3\). The following theorem, proved in [333], is a big advance for \(4 \le n \le 8\).

### Theorem 9.1

Suppose *u* is a solution to (9.5) such that (i) \(|u|<1\); (ii) \(\frac{\partial u}{\partial x_n}>0\) for every \(x=(x_1, \dots , x_n)\in {\mathbb R}^n\); and (iii) \(\lim \limits _{x_n\rightarrow \pm \infty } u(x_1, \dots , x_{n-1}, x_n)=\pm 1\) for every fixed \(x_1, \dots , x_{n-1}\). If \(n \le 8\), then all level sets of *u* are hyperplanes.

In other words, Theorem 9.1 proves the De Giorgi conjecture for all \(n\le 8\), under the additional condition (iii). In a later work, Manuel del Pino, Michal Kowalczyk and Juncheng Wei [113] found a counterexample to the De Giorgi conjecture in all dimensions \(n \ge 9\), hence the condition \(n \le 8\) in Theorem 9.1 cannot be removed.

### Reference

O. Savin, Regularity of flat level sets in phase transitions, *Annals of Mathematics* **169**-1, (2009), 41–78.

## 9.2 An Efficient Algorithm for Fitting a Smooth Function to Data

**Looking for a “Nice” Function Which Fits the Given Data Set**

In Sect. 5.2, we discussed the following question. Assume that we have performed some measurements, like the radiation level along a street, at a finite number of points. Can we then “guess” the result of the measurement at any other point?

Mathematically, let \(F(\cdot )\) be the (unknown) function such that *F*(*x*) represents the level of radiation at any point *x*. Let \(x_1, x_2, \dots , x_N\) be the points at which we have performed the measurements, and \(y_1, y_2, \dots , y_N\) be the corresponding results. Then we need to find a function *F*(*x*) such that \(F(x_i)=y_i, \, i=1,\dots , N\), see Fig. 9.2a.

Of course, there are infinitely many ways of doing this, for example, we can put \(F(x_i)=y_i, \, i=1,\dots , N\) and \(F(x)=0\) for all other *x*. However, it is unlikely that the true function *F* is anything like this. At the very least, we would expect it to be continuous and “smooth”. So, the true question is to find *F* such that (a) \(F(x_i)=y_i, \, i=1,\dots , N\), and (b) *F* is a function which is as “nice” as possible.

**Which Functions are “Nice”?**

- (i) The function *F* does not take too large or too small values;
- (ii) The function *F* is smooth and does not increase or decrease too fast. Mathematically, this means that the derivative \(F'\) exists and does not take too large or too small values;
- (iii) In turn, the rate of increase/decrease of *F* does not change too suddenly. This means that \(F'\) does not change its values too fast, or, equivalently, that the second derivative of *F*, denoted \(F^{(2)}\), does not take too large or too small values;
- (iv) And so on.

In general, let \(F^{(m)}\) denote the *m*-th derivative of *F*, and define the \(C^m\) norm \(\Vert F\Vert _{C^m({\mathbb R})}\) as the maximal absolute value attained by *F* and its derivatives \(F^{(1)}, F^{(2)}, \dots , F^{(m)}\). Our problem can then be formalized as: minimize \(\Vert F\Vert _{C^m({\mathbb R})}\) subject to \(F(x_i)=y_i, \, i=1,\dots , N\).

**“Nice” Functions of Several Variables**

If we measure the radiation in a city, not along a street, then \(x_i \in {\mathbb R}^2\), \(i=1,\dots , N\), are points in the plane, and the aim is to find the “nicest” function \(F:{\mathbb R}^2 \rightarrow {\mathbb R}\) such that \(F(x_i)=y_i\), \(i=1,\dots , N\). For measurements in space, \(x_i \in {\mathbb R}^3\), \(i=1,\dots , N\), and \(F:{\mathbb R}^3 \rightarrow {\mathbb R}\). More generally, the result of a measurement can depend on *n* parameters, and, in this case, each \(x_i, \, i=1,2,\dots , N\) is a point in \({\mathbb R}^n\), and the aim is to find the “nicest” function \(F:{\mathbb R}^n \rightarrow {\mathbb R}\) such that \(F(x_i)=y_i\), \(i=1,\dots , N\).

The “niceness” can be defined similarly as above, because the definition of the \(C^m\) norm can be extended to functions \(F:{\mathbb R}^n \rightarrow {\mathbb R}\). If \(F(z_1, z_2, \dots , z_n)\) is such a function, we can assume that the variables \(z_1, \dots , z_{i-1}, z_{i+1}, \dots , z_n\) are fixed, and treat *F* as a function *g* of one variable \(z_i\). The derivative of *g* is called the *partial derivative* of *F* with respect to \(z_i\). This derivative can then be differentiated again with respect to some other variable, and so on. Let us assume that we are allowed to perform *m* such differentiations, and then evaluate the resulting derivative at any point we wish. For example, let \(n=m=3\), and the function \(F(x,y, z)=xy^2z^3\). We can first differentiate it with respect to, say, *z*, to get \(3xy^2z^2\), then with respect to *x* to get \(3y^2z^2\), and then with respect to *z* again to get \(6y^2z\). Finally, we can substitute any values, say, \(x=1\), \(y=2\), \(z=3\), to get a numerical value \(6\cdot 2^2 \cdot 3 = 72\). For any function \(F:{\mathbb R}^n \rightarrow {\mathbb R}\), its \(C^m\) norm \(\Vert F\Vert _{C^m({\mathbb R}^n)}\) is the maximal possible absolute value of the number which we can get in this way after up to *m* differentiations. We can then minimize such a norm subject to \(F(x_i)=y_i\), \(i=1,\dots , N\).
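The repeated-differentiation example above can be reproduced with nested central differences (an illustrative sketch; the helper `d` is our own construction, not notation from the text):

```python
def F(x, y, z):
    return x * y ** 2 * z ** 3

h = 1e-2

def d(f, var, h=h):
    # central-difference partial derivative with respect to
    # var = 0 (x), 1 (y) or 2 (z)
    def df(x, y, z):
        p = [x, y, z]; m = [x, y, z]
        p[var] += h; m[var] -= h
        return (f(*p) - f(*m)) / (2 * h)
    return df

# differentiate with respect to z, then x, then z again
Fz   = d(F, 2)
Fzx  = d(Fz, 0)
Fzxz = d(Fzx, 2)

value = Fzxz(1.0, 2.0, 3.0)   # exact answer is 6 * y^2 * z = 72
assert abs(value - 72) < 0.1
print(value)  # ≈ 72
```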

**Approximate Fitting to the Dataset**

*M*is a variable which should be made as small as possible—the smaller

*M*, the better the function

*F*approximates our data \(y_i\).

**Can We Solve Problem (9.7) Efficiently?**

In applications, the dimension *n* and parameter *m* are fixed, but *N* can be very large. Can we solve the optimization problem (9.7) efficiently? A theorem of Fefferman, which we discussed in Sect. 5.2, implies the existence of an algorithm solving (9.7) in time proportional to \(N^k\), where *k* is a large constant, which depends on *n* and *m*. Of course, even for \(k=10\) or \(k=15\), the running time \(N^k\) becomes impractical already for \(N=100\), while in applications *N* can be measured in millions.
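A back-of-the-envelope comparison makes the gap between these running times concrete (the numbers are purely illustrative):

```python
import math

N = 10 ** 6          # a million data points
k = 10               # exponent in the N^k bound

poly = N ** k                      # ~ 10^60 elementary operations
near_linear = N * math.log(N)      # ~ 1.4 * 10^7 operations

assert near_linear < 2e7
assert poly / near_linear > 1e50
print(f"N^k / (N ln N) ≈ {poly / near_linear:.2e}")
```

Even at a billion operations per second, \(N^{10}\) operations would take longer than the age of the universe, while \(N\ln N\) finishes in a fraction of a second.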

It is very difficult, and most probably impossible, to solve problem (9.7) both efficiently and exactly. For practical purposes, an approximate solution is often sufficient. We say that an algorithm computes *the order of magnitude* of the solution to (9.7) if it returns output \(M'\) such that \(aM' \le M^* \le bM'\), where \(M^*\) is the optimal solution of (9.7), and *a* and *b* are some constants, depending only on *n* and *m*.

**An Efficient Algorithm for the Order of Magnitude**

The following theorem of Fefferman and Klartag [146] states that the order of magnitude of the solution to (9.7) can be computed very efficiently.

### Theorem 9.2

There exists an algorithm which computes the order of magnitude of the optimal value in (9.7) using at most \(CN \ln N\) operations and at most *CN* memory, where *C* is a constant which depends only on *m* and *n*.

In Theorem 9.2, “operations” means the usual operations with real numbers, such as addition, subtraction, multiplication, division, or comparison. It is assumed that all these operations are performed with perfect accuracy. “*CN* memory” means *CN* memory cells, each of which can store a real number, again with perfect accuracy. Of course, in reality an irrational real number has infinitely many digits, hence we need to round such numbers to store them, and all operations are subject to rounding errors. However, these issues are minor and were addressed in subsequent publications by the authors.

Also, Theorem 9.2 just computes (the order of magnitude of) *the optimal value* in the optimization problem (9.7). Of course, what we really need is to construct a function *F* for which this optimal value is achieved. In subsequent work [147], the authors developed an algorithm for this task as well.

At first sight, it is not clear how an algorithm can return a function. After all, a function is defined by its values at every point, and there are infinitely many points. In fact, Fefferman and Klartag’s algorithm works in two stages. At the first stage, it takes the input data (\(m,n,x_i, y_i,\sigma _i\)) and does some preprocessing. At the second stage, it takes any point \(x_0 \in {\mathbb R}^n\) as an input, and returns a polynomial which approximates *F* in a neighbourhood of the point \(x_0\).

### Reference

C. Fefferman and B. Klartag, Fitting a \(C^m\)-smooth function to data I, *Annals of Mathematics* **169**-1, (2009), 315–346.

## 9.3 A Helicoid-Like Surface with a Hole

**How “Curved” is a Curve?**

Given a curve, how do we measure how “curved” it is? For this, the concept of *curvature* is used. Intuitively, the curvature of any curve at any point is just the “speed of rotation” at this point, while you are travelling along the curve at unit speed. As a simple example, imagine you are travelling along a circle of radius *R* with unit speed. Then it is clear that your “speed of rotation” is the same throughout your journey. Because it takes you time \(2\pi R\) to rotate around the full angle of \(360^{\circ }\), or \(2\pi \) radians, your “speed of rotation” per unit of time is \(\frac{2\pi }{2\pi R}=\frac{1}{R}\). In other words, the curvature of a circle is the same at every point and is equal to \(\frac{1}{R}\). As another simple example, when you are travelling along a straight line, there is no rotation at all, hence the curvature is 0.

**The Curvature of a Parabola**

In general, of course, a curve may be straight, or almost straight, at some places, but be “curved a lot” elsewhere, so its curvature may vary from point to point. For example, let us estimate the curvature of the parabola \(y=x^2\) near some point \(x=x_0\). After a short time, you would travel from the point \((x_0,x_0^2)\) to the point \((x, x^2)\), where \(x=x_0+\varepsilon \) for some small \(\varepsilon \). Then \(x^2=(x_0+\varepsilon )^2 =x_0^2+2x_0\varepsilon +\varepsilon ^2 \approx x_0^2+2x_0\varepsilon = 2x_0 x - x_0^2\). Hence, the direction of your movement is along the line \(y=2x_0 x - x_0^2\), which is parallel to \(y=2x_0 x\). In other words, you move at an angle \(\alpha \) with respect to the *x*-axis, such that \(\tan \alpha = 2x_0\).

By the same logic, at the final point \((x_0+\varepsilon ,(x_0+\varepsilon )^2)\) your angle of movement \(\beta \) is such that \(\tan \beta = 2(x_0+\varepsilon )\). By the trigonometric formula, \(\tan (\beta - \alpha ) = \frac{\tan \beta - \tan \alpha }{1 + \tan \beta \tan \alpha } = \frac{2\varepsilon }{1+2(x_0+\varepsilon )\cdot 2x_0} \approx \frac{2\varepsilon }{1+4x_0^2}\). Because \(\beta -\alpha \) is small, \(\beta - \alpha \approx \tan (\beta - \alpha ) \approx \frac{2\varepsilon }{1+4x_0^2}\). This is how much you have rotated yourself. Meanwhile, the distance you have travelled is approximately \(\sqrt{\varepsilon ^2+(2x_0\varepsilon )^2}=\varepsilon \sqrt{1+4x_0^2}\), hence your “speed of rotation” per unit of distance is \(\frac{2\varepsilon /(1+4x_0^2)}{\varepsilon \sqrt{1+4x_0^2}}=\frac{2}{(1+4x_0^2)^{3/2}}\). This is the curvature of the parabola \(y=x^2\) at \(x=x_0\).

If we travel along the curve \(y=x^2\) from left to right, the direction of travel rotates counter-clockwise. If we travel along the curve \(y=-x^2\), the rotation at \(x=x_0\) has the same magnitude \(\left( \frac{2}{(1+4x_0^2)^{3/2}}\right) \) but opposite direction, clockwise, and, to emphasize this fact, we can say that in this case the curvature is negative and is equal to \(-\frac{2}{(1+4x_0^2)^{3/2}}\).
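The heuristic “rotation per unit of distance travelled” can be tested numerically against the closed-form expression (a small Python sketch):

```python
import math

def curvature_parabola(x0, h=1e-6):
    # angle of the tangent direction to y = x^2: tan(theta) = 2x
    theta = lambda x: math.atan(2 * x)
    # central-difference rotation rate, divided by the arc-length
    # element ds/dx = sqrt(1 + (2x)^2)
    dtheta = (theta(x0 + h) - theta(x0 - h)) / (2 * h)
    ds = math.sqrt(1 + 4 * x0 ** 2)
    return dtheta / ds

for x0 in [-1.0, 0.0, 0.5, 2.0]:
    exact = 2 / (1 + 4 * x0 ** 2) ** 1.5
    assert abs(curvature_parabola(x0) - exact) < 1e-7
print("curvature of y = x^2 matches 2 / (1 + 4 x^2)^(3/2)")
```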

**The Curvature of Any Curve**

In general, consider a curve which is the graph of a function \(y=f(x)\). At the point with \(x=x_0\), the direction of movement forms an angle \(\alpha \) with the *x*-axis such that \(\tan \alpha =f'(x_0)\), where \(f'(x_0)\) is the *derivative* of *f* at \(x_0\), see Sect. 3.1. A calculation very similar to the one above suggests that the “speed of rotation” is given by the formula

\(k=\frac{f''(x_0)}{\left( 1+(f'(x_0))^2\right) ^{3/2}}, \qquad (9.8)\)

where \(f''\) is the second derivative of *f* (that is, the derivative of the function \(f'(x)\)). In particular, for \(f(x)=x^2\) we have \(f'(x)=2x\) and \(f''(x)=2\), so that \(k=\frac{2}{(1+4x_0^2)^{3/2}}\), confirming the calculation above. In fact, we can now forget about the initial semi-formal discussion and just use Eq. (9.8) as the definition of the curvature of any curve which is the graph of a twice differentiable function *f*. For example, the *catenary curve* is defined by the equation \(f(x)=a \cosh \left( \frac{x}{a}\right) \), where \(\cosh (t):=\frac{e^t+e^{-t}}{2}\) is the *hyperbolic cosine* of *t*, and \(a>0\) is the parameter, see Fig. 9.3b for some examples of graphs of catenary curves. The catenary has a physical interpretation as “the curve that an idealized cable assumes under its own weight when supported only at its ends”. Substitution of \(f(x)=a \cosh \left( \frac{x}{a}\right) \) into (9.8) yields the curvature \(k=\frac{1}{a \cosh ^2(x/a)} = \frac{a}{f(x)^2}\).
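The curvature formula for the graph of a function is easy to check by finite differences. A sketch that evaluates it for the catenary (with an arbitrary parameter \(a=1.5\)), confirming that the curvature equals \(a/f(x)^2\):

```python
import math

def curvature(f, x, h=1e-4):
    # k = f''(x) / (1 + f'(x)^2)^(3/2), via central differences
    fp  = (f(x + h) - f(x - h)) / (2 * h)
    fpp = (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2
    return fpp / (1 + fp ** 2) ** 1.5

a = 1.5
catenary = lambda x: a * math.cosh(x / a)

for x in [-2.0, 0.0, 0.7, 3.0]:
    k = curvature(catenary, x)
    assert abs(k - 1 / (a * math.cosh(x / a) ** 2)) < 1e-5
print("catenary curvature equals 1 / (a cosh^2(x/a)) = a / f(x)^2")
```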

**How “Curved” Is a Surface?**

How do we measure “how curved” a two-dimensional surface *S* is in \({\mathbb R}^3\)? Well, at any point *X* of the surface *S*, we can build a vector perpendicular to it, choose a plane containing this vector (called a “normal plane”), and measure the curvature at *X* of the curve which is the intersection of the surface and the plane. For example, if *S* is a sphere with radius *R*, then the intersection of any normal plane with *S* is just a circle of radius *R*, and its curvature is 1 / *R*.

In general, however, the answer depends on the choice of normal plane. For example, if *S* is an infinite cylinder with base circle having radius *R*, and *X* is any point on *S*, then one normal plane intersects *S* in a circle with radius *R* and curvature \(1/R\), while another one intersects *S* in two parallel lines with curvature 0. We can also construct “intermediate” normal planes intersecting the cylinder in an ellipse, see Fig. 9.3c, and the curvature at *X* would be between 0 and \(1/R\).

**Principal, Mean, Gaussian, and Total Curvatures**

In general, the minimal and maximal curvatures at *X* over all choices of normal planes are denoted \(k_1\) and \(k_2\) and called the *principal curvatures* of *S* at *X*. Their mean \(H=(k_1+k_2)/2\) is called the *mean curvature*, while their product \(K=k_1k_2\) is called the *Gaussian curvature* of *S* at *X*. The integral of the Gaussian curvature over the whole surface is called the *total curvature* of *S*. For example, if *S* is a sphere of radius *R*, the principal curvatures are \(k_1=k_2=1/R\), the mean curvature is also \(1/R\), the Gaussian curvature is \(1/R^2\), and the total curvature is (Gaussian curvature)\(\cdot \)(surface area)\(\,=(1/R^2)(4\pi R^2)=4\pi \). If *S* is a cylinder, the principal curvatures are 0 and \(1/R\), the mean curvature is \(\frac{1}{2R}\), the Gaussian curvature is 0, and hence the total curvature is 0 as well.

To obtain a cylinder, we can take the line \(y=R\) in the *x*-*y* coordinate plane, and rotate it in three-dimensional space around the *x*-axis. If, instead of a line, we rotate a catenary curve \(y = a \cosh \left( \frac{x}{a}\right) \), the corresponding surface is called a *catenoid*, see Fig. 9.3d. For a point \(X=(x,a \cosh (x/a), 0)\) on the catenoid *S*, one normal plane is the *x*-*y* coordinate plane, which intersects *S* in a catenary curve, whose curvature at *X* is \(\frac{1}{a \cosh ^2(x/a)}\). One can show that this is the maximal possible, and the minimal possible is \(-\frac{1}{a \cosh ^2(x/a)}\). Hence, in this case, the principal curvatures are \(\pm \frac{1}{a \cosh ^2(x/a)}\), so the mean curvature is identically 0. A surface with mean curvature identically 0 is called a *minimal surface*, see Sect. 5.3 for an alternative definition of this concept and a detailed discussion. A trivial example of a minimal surface is the plane, while the catenoid is the first non-trivial example, found by Euler in 1744. The Gaussian curvature of the catenoid is \(-\frac{1}{a^2 \cosh ^4(x/a)}\), and its total curvature turns out to be \(-4\pi \).
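The total curvature of the catenoid can be cross-checked numerically. The sketch below combines the Gaussian curvature \(K=-1/(a^2\cosh ^4(x/a))\) with the standard area element of a surface of revolution, \(dA=2\pi f(x)\sqrt{1+f'(x)^2}\,dx=2\pi a\cosh ^2(x/a)\,dx\) (a textbook fact, not derived in this section), and approximates the total curvature by a Riemann sum:

```python
import math

a = 1.5   # catenary parameter (arbitrary choice)

# Gaussian curvature of the catenoid and area of a thin strip of the
# surface of revolution y = a*cosh(x/a) rotated around the x-axis
K  = lambda x: -1 / (a * math.cosh(x / a) ** 2) ** 2
dA = lambda x: 2 * math.pi * a * math.cosh(x / a) ** 2

# midpoint-rule approximation of the total curvature over |x| <= L
L, n = 20 * a, 200_000
dx = 2 * L / n
total = 0.0
for i in range(n):
    x = -L + (i + 0.5) * dx
    total += K(x) * dA(x) * dx

assert abs(total - (-4 * math.pi)) < 1e-3
print(total)  # ≈ -4*pi ≈ -12.566
```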

**Properly Embedded Curves and Surfaces**

A plane curve is a set of points (*x*, *y*) in the Euclidean plane \({\mathbb R}^2\) such that \(x=x(t), y=y(t), t\in I\), where *x*(*t*) and *y*(*t*) are continuous functions, and *I* is some (finite or infinite) interval of the real numbers. An example of a curve is the parabola \(x=t,\, y=t^2, \, t \in {\mathbb R}\). A curve is called *simple* if it has no self-intersection, that is, for all \(t_1,t_2 \in I\), if \(x(t_1)=x(t_2)\) and \(y(t_1)=y(t_2)\) then \(t_1=t_2\). For example, the parabola is a simple curve, while the curve \(x(t)=t^3-t,\, y(t)=t^2, \, t \in {\mathbb R}\) is not simple, because \((x(-1), y(-1))=(x(1), y(1))=(0,1)\). Another example of a simple curve is \(x(t)=\frac{\sin t}{t}, \, y(t)=\frac{\cos t}{t}, \, 0<t<+\infty \), known as a *hyperbolic spiral*. If \(t\rightarrow +\infty \), this curve winds around (0, 0), approaches it, but never reaches it. The part of the curve corresponding to the infinite interval \(1\le t < +\infty \) is contained in the bounded closed unit disk \(\{(x, y):x^2+y^2\le 1\}\).

A curve \(C \subset {\mathbb R}^2\) (or a surface \(S \subset {\mathbb R}^3\)) is called *properly embedded* in \({\mathbb R}^2\) (respectively, \({\mathbb R}^3\)), if it has no self-intersections, and its intersection with any compact subset of \({\mathbb R}^2\) (respectively, \({\mathbb R}^3\)) is compact. Intuitively, this means that no “infinite” part of the curve or surface is contained in any finite region. For example, the parabola is properly embedded in \({\mathbb R}^2\), while the hyperbolic spiral is not, because the “infinitely long” part of the spiral is contained in a small region around (0, 0). Planes and catenoids are examples of properly embedded surfaces in \({\mathbb R}^3\).

**The Helicoid and Its “Generalizations”**

Another important example of a minimal surface is the *helicoid*. This is the surface given by \(x=s\cos t, \; y=s\sin t, \; z=t\), where *s*, *t* are real parameters, ranging from \(-\infty \) to \(\infty \), see Fig. 9.3e, and also Sect. 5.3. Unlike planes and catenoids, the helicoid has *infinite* total curvature.

A surface *S* is said to have *finite topology* if it is homeomorphic to a compact surface with a finite number of points removed (that is, it can be obtained from such a surface via a continuous transformation, see Sect. 8.1 for more details). Since the helicoid was proved to be a minimal surface in 1776, a lot of minimal surfaces have been discovered, but none of them had finite topology and infinite total curvature, and it was an important open question whether such a surface exists, besides the helicoid. This question was resolved positively in 2009.

### Theorem 9.3

([398]) There exists a properly embedded minimal surface in \(\mathbb {R}^3\) with finite topology and infinite total curvature, which is not a helicoid.

An example of a surface satisfying the conditions of Theorem 9.3 has a name: the “embedded genus-one helicoid”. It looks like a helicoid with a hole, see Fig. 9.3f.

### Reference

M. Weber, D. Hoffman and M. Wolf, An embedded genus-one helicoid, *Annals of Mathematics* **169**-2, (2009), 347–448.

## 9.4 Bounding the Condition Number of Random Discrete Matrices

**Linear Functions of One and Two Variables**

A function \(f:{\mathbb R} \rightarrow {\mathbb R}\) is called linear if \(f(x+y)=f(x)+f(y)\) for all \(x, y \in \mathbb R\). With \(y=0\), this implies \(f(x+0)=f(x)+f(0)\), hence \(f(0)=0\). With \(y=-x\), we get \(0=f(0)=f(x+(-x))=f(x)+f(-x)\), hence \(f(-x)=-f(x)\) for all *x*. Such functions are called *odd* functions.

With \(y=x\), we get \(f(2x)=f(x)+f(x)=2f(x)\). Then \(f(3x)=f(2x)+f(x)=3f(x)\), and, by induction, \(f(nx)=nf(x)\) for all *x* and all non-negative integers *n*. Because *f* is an odd function, this implies that in fact \(f(nx)=nf(x), \, \forall x \in {\mathbb R}\), for all integers *n*.

If \(f(1)=a\), and \(m, n\ne 0\) are any integers, then \(f(m)=f(m \cdot 1)=m \cdot f(1)=ma\), hence \(ma = f(m) = f\left( n\cdot \frac{m}{n}\right) = n f\left( \frac{m}{n}\right) \), hence \(f\left( \frac{m}{n}\right) = \frac{m}{n} a\). In other words, \(f(x)=ax\) for all rational numbers *x*. If we also assume that *f* is continuous, this implies that \(f(x)=ax\) for all \(x \in {\mathbb R}\). For example, \(f(x)=2x\) and \(f(x)=x/2\) are linear functions.

Similarly, a function \(f:{\mathbb R}^2 \rightarrow {\mathbb R}^2\), transforming each pair (*x*, *y*) into another pair (*u*, *v*), is called linear if \(f(x_1+x_2, y_1+y_2)=f(x_1,y_1)+f(x_2,y_2)\). By an argument similar to the one above, one can prove that any linear continuous *f* has the form \(f(x, y)=(ax+by, cx+dy)\) for some real coefficients *a*, *b*, *c*, *d*. For example, \(f(x, y)=(x+y, -x-y)\) and \(f(x, y)=(x+y, x-y)\) are linear functions.

**Stretching and Contraction**

A linear function *f* is called a stretching if \(|f(x)|>|x|\) for all \(x\ne 0\), and a contraction if \(|f(x)|<|x|\) for all \(x\ne 0\). In the one-variable case, \(f(x)=ax\) is a stretching if \(|a|>1\) and a contraction if \(|a|<1\). For functions of two variables, the situation may be more involved. For example, the function \(f(x, y)=(x+y, -x-y)\) is, geometrically, a composition of a projection, clockwise rotation, and a homothetic transformation with coefficient 2, see Fig. 9.4a, and it can stretch some vectors and contract others. For example, it sends the vector (1, 1) to \((1+1,-1-1)=(2,-2)\). The length of (1, 1) is \(|(1,1)|=\sqrt{1^2+1^2}=\sqrt{2}\), while the length of *f*(1, 1) is \(|f(1,1)|=\sqrt{2^2+(-2)^2}=2\sqrt{2}\), hence the vector (1, 1) has been stretched by a factor of 2. On the other hand, \(f(1,-1)=(1-1, -1-(-1))=(0,0)\), that is, the non-zero vector \((1,-1)\) has been contracted to 0.

Let \(\sigma (f)\) denote the minimal possible ratio of \(\frac{|f(z)|}{|z|}\) over all \(z=(x, y)\) with \(|z|=\sqrt{x^2+y^2} \ne 0\).

**The Role of** \(\sigma (f)\) **in the Task of Inverting** *f*

Given a function *f* and the value *f*(*x*, *y*), can we uniquely determine *x* and *y*? For the function \(f(x, y)=(x+y, x-y)\), this is always possible. For example, if \(f(x, y)=(3,-1)\), then \(x+y=3\) and \(x-y=-1\), which can be easily solved to get \(x=1\), \(y=2\). In general, it is easy to prove that uniquely “restoring” (*x*, *y*) given *f*(*x*, *y*) is always possible if \(\sigma (f)>0\). However, for a function with \(\sigma (f)=0\) this procedure does not work. For example, for the function \(f(x, y)=(x+y, -x-y)\) assume that \(f(x, y)=(3,-1)\). Then \(x+y=3\) and \(-x-y=-1\). However, \(x+y=3\) implies \(-x-y=-3 \ne -1\), a contradiction. On the other hand, if \(f(x, y)=(1,-1)\), then \(x+y=1\) and \(-x-y=-1\) which is possible for many different pairs *x*, *y*, for example, \(x=0\) and \(y=1\) or \(x=2\) and \(y=-1\), etc.
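A minimal sketch of the two situations (the function names are ours, not from the text):

```python
def f_good(x, y):
    # sigma(f) = sqrt(2) > 0: inversion is always possible
    return (x + y, x - y)

def invert_good(u, v):
    # solve x + y = u, x - y = v
    return ((u + v) / 2, (u - v) / 2)

x, y = invert_good(3, -1)
assert (x, y) == (1.0, 2.0)        # the example from the text
assert f_good(x, y) == (3, -1)

def f_bad(x, y):
    # sigma(f) = 0: f is not invertible
    return (x + y, -x - y)

# every output of f_bad satisfies u + v = 0, so a value like (3, -1)
# is never attained, while (1, -1) is attained by many pairs (x, y)
u, v = f_bad(1.0, 2.0)
assert u + v == 0
assert f_bad(0.0, 1.0) == f_bad(2.0, -1.0) == (1.0, -1.0)
```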

For this reason, functions *f* with \(\sigma (f)=0\) are “unpleasant” in applications. Also, if \(\sigma (f)>0\) but is very small, then the “restoring” procedure above is possible, but may be difficult to compute. Hence, the ideal situation is when we can prove that \(\sigma (f)\) is not 0, and moreover is “reasonably far away” from 0.

**Estimating the Proportion of “Bad” Functions**

How many “good” and “bad” functions are there? For simplicity, assume that the coefficients *a*, *b*, *c*, *d* in the formula \(f(x, y)=(ax+by, cx+dy)\) can be either \(+1\) or \(-1\). Because there are two options for each coefficient, there are \(2^4=16\) such functions in total. An easy (but boring) computation shows that exactly 8 of them (including \(f(x, y)=(x+y, -x-y)\) discussed above) have \(\sigma (f)=0\), and another 8 (including \(f(x, y)=(x+y, x-y)\)) have \(\sigma (f)=\sqrt{2}\). In other words, if we select such a function *f* at random, we get \(\sigma (f)=0\) with probability \(\frac{8}{16}=0.5\), and \(\sigma (f)=\sqrt{2}\) with probability \(\frac{8}{16}=0.5\) as well.
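The “easy (but boring) computation” can be delegated to a few lines of code: \(\sigma (f)\) is the smallest singular value of the coefficient matrix, which for a \(2 \times 2\) matrix can be computed in closed form:

```python
import math
from itertools import product

def sigma(a, b, c, d):
    # smallest singular value of [[a, b], [c, d]]: the square root of the
    # smallest eigenvalue of A^T A = [[a^2+c^2, ab+cd], [ab+cd, b^2+d^2]]
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    lam_min = (p + r - math.sqrt((p - r) ** 2 + 4 * q * q)) / 2
    return math.sqrt(max(lam_min, 0.0))

values = [sigma(*m) for m in product([1, -1], repeat=4)]
singular  = sum(1 for s in values if abs(s) < 1e-12)
stretched = sum(1 for s in values if abs(s - math.sqrt(2)) < 1e-12)
print(singular, stretched)  # 8 8
```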

More generally, one can consider functions \(A_n\) transforming *n*-tuples \((x_1, x_2, \dots , x_n)\) into \((y_1, y_2, \dots , y_n)\). If such an \(A_n\) is linear and continuous, then \(y_i=a_{i1}x_1+a_{i2}x_2+\dots +a_{in}x_n\), \(i=1,\dots ,n\), for some real coefficients \(a_{ij}\). These coefficients can be written down in a table with \(a_{ij}\) in the *i*-th row and *j*-th column, and then \(A_n\) is called an \(n \times n\) *matrix*. Once again, assume that each \(a_{ij}\) is either \(+1\) or \(-1\). Because there are \(n^2\) coefficients, we have \(2^{n^2}\) such functions/matrices \(A_n\) in total. A famous result of Kahn et al. [213] states that the proportion of those matrices having \(\sigma (A_n)=0\) is at most \(0.999^n\). This result is not useful for small *n* (for \(n=2\), it gives the estimate \(0.999^2 \approx 0.998\) while we know that the true proportion is 0.5), but, for large *n*, the expression \(0.999^n\) decreases rapidly. For example, while the bound \(0.999^{1{,}000} \approx 0.37\) is still not very useful, the bound \(0.999^{10{,}000} \approx 0.00005\) for \(n=10{,}000\) is already good, and the bound \(0.999^{100{,}000} \approx 3.5 \cdot 10^{-44}\) for \(n=100{,}000\) is much better than needed for any practical purposes. In fact, in later work [377] the bound \(0.999^n\) has been improved to approximately \(0.5^n\), which already gives an excellent estimate \(0.5^{30} \approx 9.3 \cdot 10^{-10}\) for \(n=30\).
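The numerical estimates quoted above are easy to reproduce:

```python
import math

# the 0.999^n bound of Kahn et al. for several n (text's rounded values)
for n, approx in [(1_000, 0.37), (10_000, 5e-5), (100_000, 3.5e-44)]:
    bound = 0.999 ** n
    assert abs(bound - approx) / approx < 0.35, (n, bound)

# the later ~0.5^n bound is already excellent at n = 30
assert 0.5 ** 30 < 1e-9
print(f"0.999^100000 = {0.999 ** 100_000:.2e}, 0.5^30 = {0.5 ** 30:.2e}")
```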

**How Often Is** \(\sigma (f)\) **Small?**

However, as mentioned above, a function/matrix \(A_n\) is practically unpleasant for the inversion procedure even if \(\sigma (A_n)\) is positive but small. For concreteness, let us agree that by “small” we mean smaller than \(\frac{1}{n^B}=n^{-B}\) for some constant \(B>0\). So, the problem is to find a good upper bound for the proportion of matrices \(A_n\) with \(\sigma (A_n) < n^{-B}\), or, equivalently, for the probability \(P(\sigma (A_n) < n^{-B})\) that a randomly selected \(A_n\) has small \(\sigma (A_n)\).

A theorem of Rudelson [329], discussed in Sect. 8.11, implies that \(P(\sigma (A_n) < C \varepsilon n^{-3/2}) \le \varepsilon \) holds for sufficiently large *n* and for any \(\varepsilon >c/\sqrt{n}\), where *C* and *c* are some constants. This is significant progress, but the condition \(\varepsilon >c/\sqrt{n}\) is restrictive in some important applications. Even for a large *n* like \(n=10{,}000\), \(1/\sqrt{n}\) is 0.01, hence (assuming for simplicity that \(c=1\)) the above theorem works only for \(\varepsilon >0.01\), and provides at best a \(99\%\) guarantee that \(\sigma (A_n)\) is not small.

This was the state of the art before the following theorem was proved by Terence Tao and Van Vu [372].

### Theorem 9.4

For any positive constant *A*, there is a positive constant *B* such that, for any sufficiently large *n*, \(P\left( \sigma (A_n) < n^{-B}\right) \le n^{-A}\).

The importance of Theorem 9.4 is that it works for any \(A>0\). For example, selecting \(A=10\) we get an estimate \(n^{-10}\) for the proportion of “unpleasant” matrices \(A_n\). For \(n=10{,}000\), this gives a chance of just \(10{,}000^{-10}=10^{-40}\) for \(\sigma (A_n)\) to be “small”, which is a much better probability guarantee than in Rudelson’s result.

### Reference

T. Tao and V. Vu, Inverse Littlewood-Offord theorems and the condition number of random discrete matrices, *Annals of Mathematics* **169**-2, (2009), 595–632.

## 9.5 Characterizing the Legendre Transform of Convex Analysis

**Convex Sets and Functions**

A region *S* in the plane is called *convex* if it contains the straight line segment *AB* whenever points *A* and *B* belong to *S*, see Fig. 9.5a. For example, any disk or the area bounded by a triangle is a convex region. On the other hand, if *ABC* is a triangle, and *X* is any point strictly inside it, then the area bounded by the quadrilateral *ABXC* is non-convex, because it contains points *B* and *C* but not the line segment *BC*, see Fig. 9.5b.

Another example of a convex region is the set of all points (*x*, *y*) in the coordinate plane such that \(y \ge x^2\). In general, for any function \(f:{\mathbb R}\rightarrow {\mathbb R}\), the set of all points (*x*, *y*) such that \(y \ge f(x)\) is called the *epigraph* of *f*, and a function *f* is called *convex* if its epigraph is a convex set. Equivalently, a function *f* is convex if

\(f(\lambda x + (1-\lambda )y) \le \lambda f(x) + (1-\lambda )f(y), \quad \forall x, y \in {\mathbb R}, \; \lambda \in [0,1]. \qquad (9.9)\)

For readers familiar with the concept of ‘derivative’ there is a much simpler proof that \(f(x)=x^2\) is a convex function. The derivative of *f* is \(f'(x)=2x\), and the second derivative is \(f''(x)=2\). There is a theorem that if \(f''(x)\) exists and is positive for all \(x \in {\mathbb R}\), then *f* is a convex function.

**Convex Sets as Intersections of Half-Planes**

Another example of a convex region is the set of all points (*x*, *y*) in the coordinate plane such that \(y \ge |x|\), where \(|\cdot |\) denotes the absolute value. This is the epigraph of the convex function \(f(x)=|x|\). This function is not differentiable at 0, but its convexity easily follows directly from (9.9).

The inequality \(y\ge |x|\) can be equivalently written as “\(y\ge x\) and \(y \ge -x\)”. Geometrically, the set of points (*x*, *y*) satisfying an inequality of the form \(y \ge ax+b\) for some constants *a*, *b* is a half-plane. Hence, the epigraph of the function \(f(x)=|x|\) is the intersection of the half-planes \(y\ge x\) and \(y \ge -x\), see Fig. 9.5c. Representing a set as an intersection of half-planes is extremely convenient in optimization, where linear inequalities are the easiest to deal with.

Can the epigraph \(S=\{(x,y)\,|\, y\ge x^2\}\) of the function \(f(x)=x^2\), see Fig. 9.5d, be written as an intersection of half-planes? This looks unlikely, because the intersection of any finite number of half-planes has a piecewise linear boundary, while in our case the boundary is smooth. However, what if we allow an *infinite* number of half-planes? A half-plane \(H(a,b)=\{(x,y)\,|\, y\ge ax+b\}\) contains *S* if \(x^2\ge ax+b\) for all *x*. For example, the half-plane \(H(1,0)=\{(x,y)\,|\, y\ge x\}\) does not contain *S* because the inequality \(f(x)=x^2 \ge x\) does not hold for, say, \(x=0.5\). On the other hand, \(H(1,-1/4)=\{(x,y)\,|\, y\ge x-1/4\}\) contains *S*, because the inequality \(x^2\ge x-1/4\) is equivalent to \((x-1/2)^2 \ge 0\) and is valid for all *x*. In fact, *H*(1, *b*) contains *S* if and only if \(b \le -1/4\). More generally, *H*(*a*, *b*) contains *S* if and only if \(b \le -a^2/4\). Indeed, if \(b \le -a^2/4\), or \(0 \le -b-a^2/4\), then the inequality \(x^2\ge ax+b\) can be written as \(x^2-ax+a^2/4-a^2/4-b \ge 0\), or \((x-a/2)^2+(-b-a^2/4) \ge 0\), which clearly holds for all *x*. On the other hand, if \(b > -a^2/4\), the inequality \(x^2\ge ax+b\) fails for \(x=-a/2\). In summary, the set \(S=\{(x,y)\,|\, y\ge x^2\}\) is the intersection of the half-planes \(H(a,-a^2/4)\), \(a \in {\mathbb R}\), and the inequality \(y \ge x^2\) is equivalent to an infinite number of linear (in *x*) inequalities \(y \ge ax+(-a^2/4)\), \(a \in {\mathbb R}\).
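This description of *S* by infinitely many linear inequalities is easy to check numerically: the pointwise maximum of the lines \(y=ax-a^2/4\) over a fine grid of slopes *a* should reproduce \(f(x)=x^2\). A small sketch, assuming NumPy:

```python
import numpy as np

# Lines y = a*x - a**2/4 for a grid of slopes a; their upper envelope
# (pointwise maximum over a) should reproduce f(x) = x**2.
xs = np.linspace(-3.0, 3.0, 121)
slopes = np.linspace(-8.0, 8.0, 3201)   # covers the maximizer a = 2x
lines = np.outer(slopes, xs) - (slopes ** 2 / 4)[:, None]
envelope = lines.max(axis=0)
print(np.max(np.abs(envelope - xs ** 2)))   # tiny discretization error
```

The remaining discrepancy comes only from the finite grid of slopes.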

**The Legendre Transform and Its Properties**

The same reasoning applies to any convex function. The half-plane \(H(a,b)\) contains the epigraph of a function *f*(*x*) if and only if \(f(x)\ge ax+b\) for all *x*, or, equivalently, if and only if \(b \le -\phi (a)\), where the function \(\phi :{\mathbb R}\rightarrow {\mathbb R}\) is defined by \(\phi (a)=\sup \limits _{x \in {\mathbb R}}(ax-f(x))\). Hence, if *f* is a convex function, its epigraph *S* is the intersection of half-planes \(H(a,-\phi (a))\), \(a \in {\mathbb R}\), and the non-linear inequality \(y \ge f(x)\) is equivalent to an infinite number of linear (in *x*) inequalities \(y \ge ax - \phi (a)\), \(a \in {\mathbb R}\).

The function \(\phi \) defined in this way is called the *Legendre transform* of the function *f*, and we write \(\phi =Lf\). Unfortunately, the Legendre transform is not always well-defined as a finite-valued function. For example, if \(f(x)=x\), and \(a=3\), then the expression \(ax-f(x)=3x-x=2x\) can be arbitrarily large for large *x*. In this case, we put \(\phi (3)=+\infty \). In general, we may allow our functions *f* and \(\phi \) to take infinite values, and then the Legendre transform is always well-defined. Moreover, the Legendre transform *Lf* of any convex function *f* is always a convex function itself. In addition, the Legendre transform has some other useful properties, for example

- (P1) \(LLf=f\) for any convex function *f*;
- (P2) \(f \le g\) implies \(Lf \ge Lg\).

In (P1), by *LLf* we mean “the Legendre transform of the Legendre transform of *f*”, while in (P2) by \(f \le g\) we mean \(f(x) \le g(x)\) for all *x*. For example, we have proved above that the Legendre transform of the function \(f(x)=x^2\) is the function \(\phi (a)=a^2/4\). By absolutely the same argument, we can prove that the Legendre transform of \(f(x)=Cx^2\) is \(\phi (a)=a^2/4C\) for any constant *C*. With \(C=1/4\), this implies that the Legendre transform of \(f(x)=x^2/4\) is \(\phi (a)=a^2\), in agreement with (P1). With \(C=2\), this implies that the Legendre transform of \(g(x)=2x^2\) is \(\psi (a)=a^2/8\). Note that we have \(x^2 \le 2x^2\) for all *x*, but \(a^2/4 \ge a^2/8\) for all *a*, in agreement with (P2).
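Both properties are easy to verify numerically by computing \(\phi (a)=\max _x (ax-f(x))\) over a grid (a sketch assuming NumPy; the grids and the helper name `legendre` are our choices):

```python
import numpy as np

xs = np.linspace(-20.0, 20.0, 4001)

def legendre(f_vals, a):
    # Numerical Legendre transform: phi(a) = max over the grid of a*x - f(x).
    return np.max(a * xs - f_vals)

a_grid = np.linspace(-4.0, 4.0, 81)
phi = np.array([legendre(xs ** 2, a) for a in a_grid])      # L of x^2
psi = np.array([legendre(2 * xs ** 2, a) for a in a_grid])  # L of 2x^2
print(np.max(np.abs(phi - a_grid ** 2 / 4)))   # phi is close to a^2/4
print(bool(np.all(phi >= psi)))                # x^2 <= 2x^2, so Lf >= Lg (P2)
```

The first printed number is a small discretization error, confirming \(L(x^2)=a^2/4\); the second line confirms the order reversal (P2) on the grid.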

**The Legendre Transform of Multivariate Convex Functions**

The definition of convexity extends to functions of *several* variables: for example, \(f(x, y)=x^2+y^2\), and, more generally, \(f(x_1,x_2,\dots , x_n)=x_1^2+x_2^2+\dots +x_n^2\), are convex functions. A function \(f:{\mathbb R}^n\rightarrow {\mathbb R}\) is convex if and only if its epigraph \(S=\{(x_1,\dots , x_n, y)\,|\, y\ge f(x_1,x_2,\dots , x_n)\}\) is a convex subset of \({\mathbb R}^{n+1}\). The inequality \(y\ge f(x_1,x_2,\dots , x_n)\) can again be represented as an infinite number of linear inequalities \(y \ge \langle a, x \rangle - \phi (a)\), \(a=(a_1, \dots , a_n) \in {\mathbb R}^n\), where \(\langle a, x \rangle \) is the *inner product* defined as \(\langle a, x \rangle = a_1x_1+a_2x_2+\dots +a_nx_n\), and the function \(\phi :{\mathbb R}^n\rightarrow {\mathbb R}\) is defined by \(\phi (a)=\sup \limits _{x \in {\mathbb R}^n}(\langle a, x \rangle -f(x))\). This function \(\phi \) is again called the *Legendre transform* of *f*. This trick is extremely useful in convex optimization.

**The Characterization of the Legendre Transform**

The theorem below, proved in [20], shows that the Legendre transform is, up to linear terms, *the only* transformation which has the useful properties (P1) and (P2). To formulate it, we need a few more definitions. A set \(S \subset {\mathbb R}^n\) is called *closed* if \(x_n \in S, \, \forall n\) and \(\lim \limits _{n \rightarrow \infty } x_n = x\) implies that \(x \in S\). For example, [0, 1] is a closed set, while (0, 1] is not, because it contains the sequence \(x_n=1/n, \, n=1,2,\dots \), but not its limit point 0. A function \(f:{\mathbb R}^n\rightarrow {\mathbb R}\cup \{\pm \infty \}\) is called *lower-semi-continuous* if its epigraph is a closed set in \({\mathbb R}^{n+1}\). Let \({\mathscr {C}}({\mathbb R}^n)\) be the set of all lower-semi-continuous convex functions \(f:{\mathbb R}^n\rightarrow {\mathbb R}\cup \{\pm \infty \}\). A transformation \(B:{\mathbb R}^n \rightarrow {\mathbb R}^n\), sending \((x_1, x_2, \dots , x_n)\) to \((y_1, y_2, \dots , y_n)\), is called *linear* if \( y_j = b_{1j}x_1 + b_{2j}x_2 + \dots + b_{nj}x_n\), \(j=1,2,\dots , n, \) for some real coefficients \(b_{ij}, \, i=1,2,\dots , n, \, j=1,2,\dots , n\), *symmetric* if \(b_{ij}=b_{ji}, \, \forall i, j\), and *invertible* if \(B(x)\ne 0\) whenever \(x \ne 0\).

### Theorem 9.5

Suppose that a transformation \(T:{\mathscr {C}}({\mathbb R}^n)\rightarrow {\mathscr {C}}({\mathbb R}^n)\) satisfies properties (P1) and (P2), that is, \(TTf=f\) for all \(f \in {\mathscr {C}}({\mathbb R}^n)\), and \(f \le g\) implies \(Tf \ge Tg\). Then *T* is essentially the Legendre transform *L*. Namely, there exists a constant \(C_0 \in {\mathbb R}\), a vector \(v_0\in {\mathbb R}^n\), and an invertible symmetric linear transformation *B* such that

\((Tf)(x) = (Lf)(Bx+v_0) + \langle x, v_0 \rangle + C_0, \quad \forall f \in {\mathscr {C}}({\mathbb R}^n), \; x \in {\mathbb R}^n.\)

### Reference

S. Artstein-Avidan and V. Milman, The concept of duality in convex analysis, and the characterization of the Legendre transform, *Annals of Mathematics* **169**-2, (2009), 661–674.

## 9.6 The Solution of the Ten Martini Problem

**Operators and Operations Between Them**

Familiar functions like \(f(x)=x\), \(f(x)=2x\), or \(f(x)=x^2\), map real numbers to real numbers. In geometry, we study motions of the plane, like rotations or reflections, which can be viewed as functions which map points of the plane to other points. If each point is given by two real coordinates, such functions map pairs of real numbers into pairs. For example, the function \(f(x, y)=(-x,-y)\) represents reflection with respect to the point (0, 0).

Here, we consider functions that map infinite sequences of numbers to infinite sequences. Such functions are called *operators*. Every operator *T* takes an infinite sequence \(x=(x_1, x_2, x_3, \dots )\) as an input, and transforms it into another infinite sequence \(y=(y_1, y_2, y_3, \dots )\), which we will also denote by *T*(*x*). The simplest operator, usually denoted by *I*, is the identity operator, which sends every sequence to itself, that is, \(I(x)=x\) for all sequences *x*. A little less trivial is the multiplication by constant operator, which just multiplies every term of the sequence by the same constant \(\lambda \), that is, transforms every infinite sequence \(x=(x_1, x_2, x_3, \dots )\) into the sequence \(\lambda x = (\lambda x_1, \lambda x_2, \lambda x_3, \dots )\). Another example is the shift operator, which we denote by *S*, which transforms every sequence \(x=(x_1, x_2, x_3, \dots )\) into the sequence \(S(x)=(0, x_1, x_2, x_3, \dots )\).

Operators can be added together and multiplied by constants. The product of any operator *T* and constant \(\lambda \in {\mathbb R}\) is an operator, denoted \(\lambda T\), which maps every sequence *x* into the sequence \(\lambda T(x)\). For example, \(\lambda I\) is just the “multiplication by \(\lambda \)” operator, while \(\lambda S\) is the operator transforming every sequence \((x_1, x_2, x_3, \dots )\) into the sequence \((0, \lambda x_1, \lambda x_2, \lambda x_3, \dots )\). The sum of two operators \(T_1\) and \(T_2\), denoted \(T_1+T_2\), is the operator which maps every sequence *x* to the sequence \(T_1(x)+T_2(x)\), where the sequences are added element-wise. The difference \(T_1-T_2\) is just \(T_1+(-1)\cdot T_2\). For example, the operator \(\lambda I - S\) maps every sequence \((x_1, x_2, x_3, \dots )\) to the sequence \((\lambda x_1, \lambda x_2 - x_1, \lambda x_3 - x_2, \dots )\).
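These operations are easy to mimic on truncated sequences, which is a convenient way to sanity-check the formulas (a sketch: a finite prefix stands in for an infinite sequence, and the helper names are ours):

```python
# Operators on truncated sequences (the tail of the sequence is ignored).
def shift(x):                 # S: (x1, x2, ...) -> (0, x1, x2, ...)
    return [0.0] + x[:-1]     # drop the last term to keep the length fixed

def scale(lam, x):            # the "multiplication by lam" operator, lam*I
    return [lam * t for t in x]

def subtract(x, y):           # element-wise difference of two sequences
    return [a - b for a, b in zip(x, y)]

x = [1.0, 2.0, 3.0, 4.0]
print(subtract(scale(5.0, x), shift(x)))  # -> [5.0, 9.0, 13.0, 17.0]
```

The output matches the formula \((\lambda x_1, \lambda x_2 - x_1, \lambda x_3 - x_2, \dots )\) with \(\lambda =5\).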

**The Norm of an Infinite Sequence**

For any vector in the plane with coordinates (*x*, *y*) its length is \(\sqrt{x^2+y^2}\); for a vector (*x*, *y*, *z*) in three-dimensional space the length is \(\sqrt{x^2+y^2+z^2}\). Can we define the “length” of an infinite sequence in a similar way? For some sequences, like \((1,2,3,\dots )\), this seems to be difficult, but for others, like \((1, 1/2, 1/4, 1/8, \dots )\), a similar formula works. If we square all the “coordinates” and add them together, we get the expression \(1^2+(1/2)^2+(1/4)^2+(1/8)^2+\dots \), which is the same as \(1+1/4+(1/4)^2+(1/4)^3+\dots \). In general, for any *q*, the sum \(1+q+q^2+\dots +q^n\) is equal to \(\frac{1}{1-q}-\frac{q^n}{1-q}\). If \(q\in (0,1)\), the term \(\frac{q^n}{1-q}\) becomes smaller and smaller, hence the whole sum becomes closer and closer to \(\frac{1}{1-q}\). In this case, we say that the infinite sum \(1+q+q^2+\dots \) *converges* to \(\frac{1}{1-q}\) and write \(1+q+q^2+\dots = \frac{1}{1-q}\). In our case, \(q=1/4\), and \(1+1/4+(1/4)^2+(1/4)^3+\dots = \frac{1}{1-1/4}=\frac{4}{3}\), hence the infinite sequence \((1, 1/2, 1/4, 1/8, \dots )\) has finite “length” \(\sqrt{\frac{4}{3}}\). In general, the set of all sequences \(x=(x_1, x_2, x_3, \dots )\) with finite sum \(x_1^2+x_2^2+x_3^2+\dots \) is denoted \(l^2\), and the square root of this sum is denoted \(\Vert x\Vert \) and is called the *norm* of *x*. For example, the sequence \((1, 1/2, 1/4, 1/8, \dots )\) belongs to the set \(l^2\), while the sequence \((1,2,3,\dots )\) does not.
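The geometric series computation above is easy to confirm numerically (a minimal sketch using only the standard library):

```python
import math

# The sequence (1, 1/2, 1/4, 1/8, ...): squaring the terms gives the
# geometric series 1 + 1/4 + (1/4)**2 + ..., which converges to 4/3.
terms = [(1 / 2) ** k for k in range(60)]
norm_sq = sum(t * t for t in terms)
print(norm_sq)              # very close to 4/3
print(math.sqrt(norm_sq))   # the norm, close to sqrt(4/3)
```

Sixty terms already agree with the limit \(4/3\) to machine precision.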

**Bounded Linear Operators on** \(l^2\)

All the operators *T* considered above have the property that if the input *x* belongs to \(l^2\), then so does the output *T*(*x*). For example, if the sum \(x_1^2+x_2^2+x_3^2+\dots \) converges to some finite number *A*, then the sum \((\lambda x_1)^2+(\lambda x_2)^2+(\lambda x_3)^2+\dots \) converges to a finite number \(\lambda ^2 A\), hence \(x \in l^2\) implies that \(\lambda x \in l^2\). Also, if \(x_1^2+x_2^2+x_3^2+\dots \) converges to *A*, then \(0^2+x_1^2+x_2^2+x_3^2+\dots \) converges to *A* as well. In other words, \(x \in l^2\) implies that \(S(x) \in l^2\). From now on, we consider only operators *T* such that \(T(x) \in l^2\) whenever \(x \in l^2\).

An operator *T* is called *linear* if \(T(x+y)=T(x)+T(y)\) for all sequences \(x, y\in l^2\). It is easy to check that operators *I*, \(\lambda I\), and *S* are linear. A linear operator is called *bounded* if there is a constant \(M>0\) such that \(\Vert T(x)\Vert \le M \Vert x\Vert \) for all sequences \(x \in l^2\). For example, operators \(\lambda I\) and *S* satisfy this property with \(M=|\lambda |\) and \(M=1\), respectively.

**The Spectrum of a Bounded Operator**

An operator *T* is called *invertible* if for any sequence \(y \in l^2\) there exists a unique sequence \(x \in l^2\) such that \(T(x)=y\). For example, the operator *I* is trivially invertible, with \(x=y\). More generally, the operator \(\lambda I\) is invertible for every \(\lambda \ne 0\), with \(x=(1/\lambda )y\). On the other hand, the operator *S* is not invertible, because for any sequence \(y=(y_1, y_2, y_3, \dots )\) with \(y_1 \ne 0\) there is no *x* with \(S(x)=y\).

What about the operator \(\lambda I - S\), mapping \((x_1, x_2, x_3, \dots )\) to \((\lambda x_1, \lambda x_2 - x_1, \lambda x_3 - x_2, \dots )\)? For \(\lambda = 0\), it reduces to \(-S\) and is not invertible. For \(\lambda \ne 0\) and any \(y=(y_1, y_2, y_3, \dots )\), the equation \(T(x)=y\) implies that \(\lambda x_1 = y_1\), \(\lambda x_2 - x_1 = y_2\), \(\lambda x_3 - x_2 = y_3\), and so on. From the first equation, \(x_1 = y_1 / \lambda \); from the second one, \(x_2=(y_2+x_1)/\lambda = y_2/\lambda + y_1/\lambda ^2\); from the third one, \(x_3=y_3/\lambda + y_2/\lambda ^2 + y_1/\lambda ^3\), and so on. Continuing in this way, we can restore the candidate solution \(x=(x_1, x_2, x_3, \dots )\) uniquely. If \(|\lambda |>1\), one can check that this *x* always belongs to \(l^2\), hence \(\lambda I - S\) is invertible. If \(0<|\lambda |\le 1\), however, the candidate *x* may fail to belong to \(l^2\): for example, for \(\lambda =1/2\) and \(y=(1,0,0,\dots )\) we get \(x=(2,4,8,\dots )\), so \(\lambda I - S\) is not invertible in this case.

In general, the set of all real numbers \(\lambda \) such that the operator \(\lambda I - T\) is *not* invertible is called the *spectrum* of a bounded operator *T*. For example, the argument above shows that the spectrum of the shift operator *S* is the interval \([-1,1]\). It is also easy to see that the spectrum of *I* is the single number \(\lambda = 1\). In general, however, the spectrum can have a much more complicated structure, and the study of the spectral properties of linear operators is an important area of mathematical research.
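The forward-substitution argument translates directly into code (a sketch on truncated sequences; the helper names are ours):

```python
def apply_op(lam, x):
    # (lam*I - S): (x1, x2, ...) -> (lam*x1, lam*x2 - x1, lam*x3 - x2, ...)
    return [lam * x[0]] + [lam * x[k] - x[k - 1] for k in range(1, len(x))]

def invert(lam, y):
    # Solve (lam*I - S)(x) = y by forward substitution:
    # x1 = y1/lam, and x_k = (y_k + x_{k-1})/lam for k >= 2.
    x = [y[0] / lam]
    for k in range(1, len(y)):
        x.append((y[k] + x[k - 1]) / lam)
    return x

y = [1.0, 0.0, 2.0, -1.0]
print(apply_op(2.0, invert(2.0, y)))  # -> [1.0, 0.0, 2.0, -1.0]
```

For \(\lambda =1/2\) and \(y=(1,0,0,\dots )\), the same recursion produces the exponentially growing sequence \((2,4,8,\dots )\), which is why the formal inverse can leave \(l^2\) when \(|\lambda |\) is small.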

**The Almost Mathieu Operator and Its Spectrum**

In this section, we discuss the *almost Mathieu operator*, which depends on three real parameters \(\lambda \ne 0\), \(\alpha \), and \(\theta \), and transforms every sequence \(x=(x_1, x_2, x_3, \dots )\) into the sequence \(y=(y_1, y_2, y_3, \dots )\) according to the formula

\(y_n = x_{n+1} + x_{n-1} + 2\lambda \cos (2\pi (\theta + n\alpha ))x_n, \quad n=1,2,3,\dots ,\)

with the convention \(x_0=0\).

The study of the spectrum of this operator, motivated by physical applications, has kept mathematicians busy for several decades. Numerical experiments show that it has a complicated structure. For example, if one fixes \(\lambda \) and \(\theta \), and depicts the spectrum of the almost Mathieu operator for various \(\alpha \), one usually gets a fractal-like picture as in Fig. 9.6.

If \(\alpha \) is a rational number (that is, \(\alpha =p/q\) for some integers *p*, *q*), then the spectrum is the union of *q* intervals. For irrational \(\alpha \), it was conjectured that the spectrum is a so-called *Cantor set*. The best-known example of a Cantor set is obtained as follows: start with the interval [0, 1], remove the middle third (1/3, 2/3), then remove the middle thirds (1/9, 2/9) and (7/9, 8/9) of the remaining intervals [0, 1/3] and [2/3, 1], then the middle thirds of the remaining four intervals, and so on, up to infinity. The set *C* of all points which survive is a Cantor set, see Sect. 1.4 for a more detailed discussion. Of course, we have some flexibility in this construction, e.g. we can remove some other fixed proportion of every interval at every step, or even different proportions at every step, etc. However, all the sets constructed in this way are *homeomorphic*, that is, for every pair of them, there is a continuous invertible function which maps one set onto the other. In general, a Cantor set is any set homeomorphic to the one described above.
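The middle-thirds construction can be sketched in a few lines (using exact fractions to avoid rounding; after *k* steps there are \(2^k\) intervals of total length \((2/3)^k\)):

```python
from fractions import Fraction

def remove_middle_thirds(intervals):
    # One construction step: replace each interval [a, b] by its outer thirds.
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out += [(a, a + third), (b - third, b)]
    return out

intervals = [(Fraction(0), Fraction(1))]
for _ in range(3):
    intervals = remove_middle_thirds(intervals)
print(len(intervals))                    # -> 8 intervals after 3 steps
print(sum(b - a for a, b in intervals))  # -> 8/27, i.e. (2/3)**3
```

The surviving total length \((2/3)^k\) tends to 0, so the Cantor set itself has zero length, yet it is uncountable.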

The conjecture that, for every irrational \(\alpha \), the spectrum of the almost Mathieu operator is a Cantor set, was proposed by Azbel [26] in 1964. In 1981, Mark Kac offered ten martinis for anyone who could prove or disprove it, and since then the problem has been known as “the Ten Martini Problem”. Despite many partial results for some special irrational \(\alpha \), the general case was open until 2009, when the final positive resolution by Artur Avila and Svetlana Jitomirskaya appeared [24].

### Theorem 9.6

The spectrum of the almost Mathieu operator is a Cantor set for all irrational \(\alpha \), all \(\theta \), and all \(\lambda \ne 0\).

### Reference

A. Avila and S. Jitomirskaya, The Ten Martini Problem, *Annals of Mathematics* **170**-1, (2009), 303–342.

## 9.7 A Linear Time Algorithm for Edge-Deletion Problems

**The Party Organization Problem**

In Sect. 5.11 we discussed the “party organization problem”: if some of your guests do not like each other, what is the minimal number of tables you need to be able to guarantee that no pair of enemies share a table? The correct mathematical language in which to study this problem is graph theory: we can represent the guests as points in the plane (vertices), and join any two vertices by a line (edge) if and only if the corresponding guests are enemies. We can also represent the tables as colours, and ask what is the minimal number of colours we need to colour the vertices of the graph in such a way that no two vertices connected by an edge have the same colour. A set of vertices, some of which are connected by edges, is called a *graph*, and the minimal number of colours in the problem described above is called the *chromatic number* of the graph. A graph with chromatic number at most *k* is called *k*-colourable.

In most restaurants, however, we have no control over the number of available tables. The restaurant may inform you that they have *k* big tables, where *k* is a small fixed number such as \(k=2\), and, in this case, it may be impossible to seat *every* pair of enemies at different tables. For example, if you have \(k=2\) tables and \(n=4\) guests Anna, Bob, Claire, and David, such that Anna and Bob dislike everyone else, including each other (but Claire and David are friends), you can start by putting enemies Anna and Bob at different tables, but then you need to put Claire either near Anna or near Bob, creating an unhappy pair of enemies at the same table. Moreover, you then need to put David either near Anna or near Bob as well, creating another unhappy pair.

While a perfect solution in this case is impossible, you can still do better than the way described above. Namely, you can put Anna and Bob at table 1, and Claire and David at table 2. In this case you still have a pair of enemies (Anna and Bob) sitting at the same table, but at least you have only *one* such pair, not *two*!

**Edge Removal and Graph Colouring**

So, the problem is to seat *n* guests at *k* tables such that the number of pairs of enemies at the same table is as small as possible. In the language of graph theory, we have a graph *G* with *n* vertices, and the problem is to colour the vertices in *k* colours such that the number of edges which join vertices of the same colour is minimal. The same question can be formulated slightly differently: what is the minimal number of edges we should *remove* from *G* to make it *k*-colourable? In the example above, we had a graph with vertices *A*, *B*, *C*, and *D* (A—Anna, B—Bob, C—Claire, D—David) and edges *AB*, *AC*, *AD*, *BC*, *BD*, see Fig. 9.7a. This graph is not 2-colourable, but, after removing just the edge *AB*, it becomes 2-colourable, with vertices *A* and *B* coloured white, while *C* and *D* are coloured black. This colouring represents the way the guests should sit to create just one unhappy pair, Anna and Bob (the pair whose edge was removed). In general, if the graph is *k*-colourable after removing *m* edges, then this colouring represents a guest distribution with exactly *m* unhappy pairs.
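For a graph this small, the minimal number of deletions can be found by brute force. A sketch (the helper names are ours) for the Anna–Bob–Claire–David graph:

```python
from itertools import combinations, product

vertices = "ABCD"
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D")]

def k_colourable(edge_list, k):
    # Try every assignment of k colours to the four vertices.
    return any(all(c[u] != c[v] for u, v in edge_list)
               for c in (dict(zip(vertices, cs))
                         for cs in product(range(k), repeat=len(vertices))))

def min_deletions(edge_list, k):
    # Smallest m such that deleting some m edges makes the graph k-colourable.
    for m in range(len(edge_list) + 1):
        for removed in combinations(edge_list, m):
            kept = [e for e in edge_list if e not in removed]
            if k_colourable(kept, k):
                return m

print(min_deletions(edges, 2))  # -> 1 (removing AB suffices)
```

This exhaustive search is exactly the naïve approach discussed below: it is fine for 4 vertices and 5 edges, but hopeless for large graphs.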

**Removing Edges to “Kill” Triangles or Squares**

Similar problems in the form “What is the minimal number of edges we should remove from a graph to make it (something)?” arise in many subareas of graph theory and its applications. For example, a “triangle” is a triple of vertices *A*, *B*, *C*, such that all of them are connected by edges (that is, the graph contains edges *AB*, *BC*, and *CA*). A graph *G* is called *triangle-free* if it contains no triangles. For any graph *G* with triangles, we may ask what is the minimal number of edges we should remove from *G* to make it triangle-free. If a graph *G* contains *m* triangles, then removing *m* edges (one in each triangle) would surely work, but sometimes we can do better. For example, the Anna-Bob-Claire-David graph in Fig. 9.7b contains *two* triangles, *ABC* and *ABD*, but removing a single edge *AB* destroys them both, and makes it triangle-free after just one edge deletion.

In a similar way we may ask how many edges we should remove to make the graph *G* *square-free*, that is, containing no four vertices *A*, *B*, *C*, *D* such that *AB*, *BC*, *CD* and *DA* are edges in *G*. More generally, we may aim to avoid any fixed configuration like this.

**The General Edge Removal Problem for Monotone Properties**

In general, a property *P* of a graph is called *monotone* if it is preserved after removal of vertices and edges of *G*. For example, if a graph *G* is *k*-colourable, then, after removing any vertex or any edge from it, it obviously remains *k*-colourable, because the same colouring works. Similarly, if *G* is triangle-free, then after removing any vertex or edge from *G* it obviously remains triangle-free. Hence, the properties of being *k*-colourable or being triangle-free are examples of monotone properties. The same is true for the property of being square-free, and for many other graph properties of theoretical and practical interest.

- (*) Given a monotone property *P* and an arbitrary graph *G*, what is the minimal number of edge deletions needed to turn *G* into a graph satisfying *P*?

Problem (*) is very difficult to solve exactly. A naïve approach would be to just try all possible edge deletions, but this works only for small graphs. In a graph with *m* edges, there are *m* ways to delete the first edge, \(m-1\) ways to delete the second one, and so on, so there are not much less than \(m^k\) ways to delete *k* edges. For a large graph with \(m=1000\) edges, and \(k=10\), \(m^k=10^{30}\). Even for a supercomputer performing \(10^{16}\) operations per second, it would take more than three million years to perform \(10^{30}\) operations. Moreover, for some graph properties *P* it is not easy even to check if the initial graph *G* satisfies *P*. For example, this is the case if *P* is the *k*-colourability property for \(k \ge 3\), because there are a huge number of possible colourings, and it could take ages to check if there is one that works.

**An Approximate Solution**

Because problem (*) is difficult to solve exactly, an important question is whether it is possible to efficiently find at least an *approximate* solution to it. This is what the following theorem of Noga Alon, Asaf Shapira, and Benny Sudakov [11] is about.

### Theorem 9.7

For any fixed \(\varepsilon > 0\) and any monotone property *P*, there is a constant *C* (depending on \(\varepsilon \) and *P*) and an algorithm which, given a graph *G* with *n* vertices and *m* edges, finds an approximate solution to (*) to within an additive error \(\varepsilon n^2\) after performing at most \(C(n+m)\) operations.

In other words, if the exact optimal answer to (*) is *k*(*P*, *G*), the algorithm will return an answer \(k'(P, G)\) such that \(|k'(P,G) - k(P, G)| \le \varepsilon n^2\). In many cases, this is a reasonable approximation. Indeed, the number of pairs of vertices of *G* is a bit less than \(n^2/2\) (the exact formula is \(n(n-1)/2\)). If about half of all pairs are connected by edges, then there are about \(m \approx n^2/4\) edges. If \(\varepsilon =0.001\) or so, then the error \(\varepsilon n^2\) is much less than the total number of edges. For example, if \(n=1000\) and \(m=n^2/4=250{,}000\), then the solution to (*) may be anything between 0 and \(250{,}000\), while the algorithm outputs the number \(k'\), and guarantees that the answer is between \(k'-1000\) and \(k'+1000\).

Can we develop an algorithm with an even better approximation guarantee, e.g. with additive error proportional to \(n^{1.99}\) instead of \(n^2\), or at least to \(n^{2-\delta }\) for some \(\delta >0\)? The authors prove that this is possible if there is a 2-colourable graph that does *not* satisfy P. For example, this is the case for the property of being square-free, because the square itself (a graph with four vertices *A*, *B*, *C*, *D* such that *AB*, *BC*, *CD* and *DA* are edges) is clearly not square-free but is 2-colourable (to see this, colour *A* and *C* green and *B* and *D* blue).

On the other hand, if *P* is a property such that all 2-colourable graphs satisfy P (this is the case if *P* is the property of being *k*-colourable for \(k\ge 2\), or triangle-free), then the authors provide very strong evidence that, for any \(\delta >0\), no efficient algorithm with approximation guarantee \(n^{2-\delta }\) exists.

### Reference

N. Alon, A. Shapira, and B. Sudakov, Additive approximation for edge-deletion problems, *Annals of Mathematics* **170**-1, (2009), 371–411.

## 9.8 A Characterization of Stability-Preserving Linear Operators

**Polynomials and Their Roots**

A *polynomial* is a function of the form \(P(x)=a_nx^n+a_{n-1}x^{n-1}+\dots +a_1x+a_0\), where \(a_0, a_1, \dots , a_n\) are real numbers and \(a_n \ne 0\); in this case, we say that the polynomial *P*(*x*) has degree *n*. For example, polynomials of degree 0 are just constant functions \(P(x)=a_0\), \(a_0\ne 0\), polynomials of degree 1 are linear functions \(P(x)=a_1 x + a_0\), \(a_1 \ne 0\), polynomials of degree 2 are quadratic functions, \(P(x)=a_2 x^2 + a_1 x + a_0\), \(a_2 \ne 0\), and so on.

A (real) root of a polynomial *P*(*x*) is a real solution to the equation \(P(x)=0\). If the polynomial *P*(*x*) can be written as \(P(x)=(x-a)Q(x)\), where *Q*(*x*) is another real polynomial, then \(P(a)=(a-a)Q(a)=0\), hence *a* is a root of *P*(*x*). If, moreover, *P*(*x*) can be written as \(P(x)=(x-a)^kQ(x)\), then we say that *a* is a root of *P*(*x*) *of multiplicity* *k*, and then the convention is that the root *a* should be counted *k* times while counting the roots of *P*(*x*). For example, we say that the polynomial \(P(x)=(x-1)(x-3)^2\) has *three* roots: 1, 3, and 3 again.

**Stable Polynomials**

Polynomials of degree 0, 1, and 2 have at most 0, 1, and 2 real roots, respectively. This is not a coincidence. There is an (easy) mathematical theorem stating that *any* polynomial of degree *n* has at most *n* real roots. Polynomials of degree *n* which have *n* real roots (that is, the maximal possible number of roots), are called *stable*, or *hyperbolic*. By convention, we consider the special polynomial \(P(x)=0\) to be stable as well. For example, the polynomial \(P(x)=2x-3\) is stable, because its degree is \(n=1\), and it has one root \(x=3/2\). The polynomial \(P(x)=x^2-3x+2\) has degree \(n=2\) and two roots \(x=1\) and \(x=2\), hence it is stable. The polynomial \(P(x)=x^2-2x+1\) has degree \(n=2\) and root \(x=1\) of multiplicity 2, which is counted as two roots, hence it is stable as well. However, the polynomial \(P(x)=x^2+1\) has degree \(n=2\) but no real roots at all, hence it is *not* stable.
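Stability can be checked numerically by computing all complex roots and testing whether their imaginary parts vanish (a sketch assuming NumPy; the tolerance is a pragmatic choice, since repeated roots are computed only approximately):

```python
import numpy as np

def is_stable(coeffs, tol=1e-6):
    # coeffs lists the coefficients from the highest degree down,
    # e.g. [1, -3, 2] stands for x**2 - 3*x + 2.
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots.imag) < tol))

print(is_stable([1, -3, 2]))  # roots 1 and 2            -> True
print(is_stable([1, -2, 1]))  # double root 1            -> True
print(is_stable([1, 0, 1]))   # x**2 + 1, no real roots  -> False
```

The three examples reproduce the classifications worked out in the text above.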

**Stability Under Differentiation**

The *derivative* of a polynomial *P*(*x*) is a polynomial, denoted \(P'(x)\), which can be characterized using the following rules:

- (i) \((P+Q)'(x)=P'(x)+Q'(x)\) for all polynomials *P*, *Q*,
- (ii) \((aP)'(x) = a P'(x)\) for every polynomial *P* and constant \(a \in {\mathbb R}\),
- (iii) \((x^k)'=kx^{k-1}\) for all \(k \ge 0\).

For example, let us calculate the derivative of the polynomial \(P(x)=x^2-3x+2\). Rules (i) and (ii) imply that \(P'(x)=(x^2-3x+2)'=(x^2)'+(-3x)'+(2)'=(x^2)'-3(x)'+2(1)'\). By (iii), \((x^2)'=2x\), \((x)'=1\), and \((1)'=(x^0)'=0\), hence \(P'(x)=2x-3\cdot 1+2\cdot 0=2x-3\). One easy but useful theorem is that the derivative of any stable polynomial is stable. In other words, we say that *stability is preserved under differentiation*. For example, \(P(x)=x^2-3x+2\) is a stable polynomial (degree 2, and 2 roots), and its derivative \(P'(x)=2x-3\) is stable as well (degree 1, and 1 root).
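The differentiation rules (i)-(iii) give a one-line implementation for polynomials stored as coefficient lists (highest degree first; a minimal sketch):

```python
def derivative(coeffs):
    # By rules (i)-(iii): the derivative of sum a_k * x**k is
    # sum k * a_k * x**(k - 1).  Coefficients: highest degree first.
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

print(derivative([1, -3, 2]))  # (x**2 - 3x + 2)' = 2x - 3  -> [2, -3]
print(derivative([2, -3]))     # (2x - 3)' = 2              -> [2]
```

Consistently with the preservation of stability, the stable polynomial \(x^2-3x+2\) (roots 1 and 2) is mapped to the stable polynomial \(2x-3\) (root 3/2).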

**Stability After Multiplication**

Any individual polynomial, say \(P(x)=x^2-3x+2\), is a function transforming real numbers into real numbers, e.g. the number \(x=4\) is transformed into \(P(4)=4^2-3\cdot 4 +2 = 6\). In contrast, differentiation is an example of an “operation” transforming polynomials into polynomials, e.g. \(x^2-3x+2\) is transformed into \(2x-3\). Another example of such an “operation” is multiplication by any fixed polynomial *Q*(*x*), say, \(Q(x)=x-5\). In this case, the polynomial \(P(x)=x^2-3x+2\) is transformed into \((x^2-3x+2)(x-5)=x^3-8x^2+17x-10\). The roots of the “transformed” polynomial are the same as the roots of the original one plus the roots of *Q*. In particular, this implies that \(Q(x)\cdot P(x)\) is a stable polynomial whenever *P*(*x*) and *Q*(*x*) are stable. In other words, *stability is preserved after multiplication by a stable polynomial.*

**Linear Operators Transforming Polynomials**

A *linear operator* is any “operation” *T* transforming polynomials into polynomials which satisfies properties (i) and (ii) above, that is, (i) \(T(P+Q)=T(P)+T(Q)\) for all polynomials *P*, *Q*, and (ii) \(T(aP)=aT(P)\) for every polynomial *P* and constant \(a \in {\mathbb R}\). This implies that \(T(x^2-3x+2)=T(x^2)-3T(x)+2T(1)\), and, more generally, that \(T(a_nx^n+\dots +a_1x+a_0)=a_nT(x^n)+\dots +a_1T(x)+a_0T(1)\). Hence, to define *T*, it suffices to define \(T(x^k)\) for all \(k \ge 0\). For example, if \(T(x^k)=kx^{k-1}\), \(k\ge 0\), then *T* is differentiation, while \(T(x^k)=x^k\cdot Q(x)\), \(k\ge 0\), implies that *T* is just multiplication by a fixed polynomial *Q*(*x*). In general, however, *T* can be arbitrarily complicated: for example, it may be that \(T(x^k)=x^3-4x\) for even *k* while \(T(x^k)=x^2-1\) for odd *k*. In this case, \(T(x^2-3x+2)=(x^3-4x)-3(x^2-1)+2(x^3-4x)=3x^3-3x^2-12x+3\), and, in general, for any polynomial *P*, *T*(*P*) has the form \(a(x^3-4x)+b(x^2-1)\) for some constants *a*, *b*. It is not difficult to verify that the polynomial \(a(x^3-4x)+b(x^2-1)\) is stable for all *a*, *b*, hence, in this case, *T*(*P*) is stable for all *P*.

**Which Linear Operators Preserve Stability?**

One of the long-standing fundamental problems in the theory of stable polynomials was to “characterize” *all* linear operators *T* which preserve stability, that is, such that *T*(*P*) is always a stable polynomial whenever *P* is stable. Here, by “characterization” we mean simple-to-check necessary and sufficient conditions. In 1914, such conditions were derived by Pólya and Schur [308] for operators of the form \(T(x^k)=\lambda _k x^k\), \(k\ge 0\), where \(\lambda _0, \lambda _1, \lambda _2, \dots \) is a given sequence of numbers. Since then, there have been many similar results covering very special transformations *T*, but almost no progress for general *T*, until the question was fully resolved [67] in 2009!

To state the result, we need one more definition. Stable polynomials *P*(*x*) and *Q*(*x*) are called *interlacing* if either \(\alpha _1 \le \beta _1 \le \alpha _2 \le \beta _2 \le \dots \) or \(\beta _1 \le \alpha _1 \le \beta _2 \le \alpha _2 \le \dots \), where \(\alpha _1 \le \alpha _2 \le \dots \le \alpha _n\) and \(\beta _1 \le \beta _2 \le \dots \le \beta _m\) are the roots of *P*(*x*) and *Q*(*x*), respectively. Note that this condition can be satisfied only if *n* and *m* differ by at most one. For example, the polynomials \(P(x)=x^3-4x\) and \(Q(x)=x^2-1\) are interlacing, because their roots are \(-2,0,2\) and \(-1,1\), respectively. In contrast, the polynomials \(x^2-4\) and \(x^2-1\) are not interlacing, see Fig. 9.8. Stable interlacing polynomials *R* and *Q* have the property that \(aR(x)+bQ(x)\) is stable for all *a*, *b*. In particular, if *T* is a linear operator such that *T*(*P*) has the form \(aR(x)+bQ(x)\) for all *P*, then *T*(*P*) is stable for all *P*.
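The interlacing condition is straightforward to test mechanically. Here is an illustrative Python sketch (the helper names are ours) checking the two examples above from their root lists:

```python
# Our own illustration: test whether two sorted lists of real roots
# interlace, i.e. can be merged alternately into a sorted sequence.

def interlacing(alphas, betas):
    if abs(len(alphas) - len(betas)) > 1:
        return False
    def weaves(a, b):
        # merge a and b alternately, starting with a, and check sortedness
        merged = []
        for x, y in zip(a, b):
            merged += [x, y]
        merged += a[len(b):] + b[len(a):]
        return all(merged[i] <= merged[i + 1] for i in range(len(merged) - 1))
    return weaves(sorted(alphas), sorted(betas)) or weaves(sorted(betas), sorted(alphas))

print(interlacing([-2, 0, 2], [-1, 1]))  # True:  roots of x^3-4x and x^2-1
print(interlacing([-2, 2], [-1, 1]))     # False: roots of x^2-4 and x^2-1
```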

A polynomial *P*(*x*, *y*) in two variables *x*,*y* is the sum of any finite number of terms of the form \(ax^ky^ m\), where \(a \in {\mathbb R}\) and *k*, *m* are non-negative integers. *P*(*x*, *y*) is called *stable* if \(Q(t)=P(a+bt, c+dt)\) is a stable polynomial in one variable *t* for any real *a*, *b*, *c*, *d* such that \(b>0\) and \(d>0\). For example, \(P(x, y)=x+y\) is stable because in this case \(Q(t)=(a+c)+(b+d)t\) has degree 1 and one root \(t=-(a+c)/(b+d)\).

For every linear operator *T*, let \(S_T\) be the linear operator transforming polynomials in two variables into other polynomials in two variables according to the rule \(S_T(x^ky^l) = T(x^k)y^l\). For example, if *T* is differentiation, then \(S_T\) is known as the partial derivative with respect to *x*, and is calculated by the rule \(S_T(x^ky^l) = k x^{k-1}y^l\), for example, \(S_T(x^3y+x^2y^2)=3x^2y+2xy^2\).
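As an illustration, here is a minimal Python sketch of \(S_T\) in the special case where *T* is differentiation (the dictionary representation of a two-variable polynomial, mapping exponent pairs to coefficients, is our own choice):

```python
# Sketch of the rule S_T(x^k y^l) = T(x^k) y^l in the special case where T
# is differentiation, so S_T is the partial derivative with respect to x.
# A two-variable polynomial is stored as {(k, l): coefficient}.

def S_T(poly):
    out = {}
    for (k, l), a in poly.items():
        if k > 0:  # d/dx of x^k y^l is k x^(k-1) y^l; terms with k = 0 vanish
            out[(k - 1, l)] = out.get((k - 1, l), 0) + a * k
    return out

# S_T(x^3 y + x^2 y^2) = 3 x^2 y + 2 x y^2
print(S_T({(3, 1): 1, (2, 2): 1}))  # {(2, 1): 3, (1, 2): 2}
```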

### Theorem 9.8

A linear operator *T*, transforming polynomials into polynomials, preserves stability if and only if

- (a)
\(T(x^k)=a_k P(x)+ b_k Q(x)\), where \(a_k, b_k\), \(k=0,1,2,\dots \) are real numbers, and *P*(*x*) and *Q*(*x*) are some fixed (independent of *k*) stable interlacing polynomials; or

- (b)
\(S_T[(x+y)^k]\) is a stable polynomial (in 2 variables) for all \(k=0,1,2,\dots \); or

- (c)
\(S_T[(x-y)^k]\) is a stable polynomial (in 2 variables) for all \(k=0,1,2,\dots \).

### Reference

J. Borcea and P. Brändén, Pólya–Schur master theorems for circular domains and their boundaries, *Annals of Mathematics* **170**-1, (2009), 465–492.

## 9.9 On the Gaps Between Primes

**The Average Distance Between Consecutive Primes**

Primes are natural numbers with exactly two divisors, like \(2,3,5,7,11,13,17,19,23,\dots \). Because every even number *n* greater than 2 has at least three divisors (1, 2, and *n*), 2 is the only even prime. This implies that 2, 3 is the only pair of primes which are consecutive integers.

The pairs \(p=3,q=5\), or \(p=5,q=7\), or \(p=11,q=13\), and so on, are examples of pairs of primes *p* and *q* such that \(q-p=2\). The famous twin primes conjecture states that there are infinitely many such pairs, and it is one of the oldest unsolved problems in mathematics. A “naïve” reason why this conjecture may be hard to prove is that, if we study the sequence of primes further and further, the *average* distance between consecutive primes becomes larger and larger. The famous prime number theorem states that, for any large *N*, there are about \(\frac{N}{\ln N}\) primes less than *N*. Hence, the average distance between consecutive primes is approximately \(\ln N\). For \(N=10^{100}\), this implies that the average distance between 100-digit primes is about 230. Of course, this does not mean that this distance is exactly 230 in all cases: for some pairs of consecutive primes it is larger, while for some pairs it is smaller.
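These estimates are easy to check empirically on a small scale. The following Python sketch (an illustration, not a proof) compares the average gap between consecutive primes below \(10^6\) with \(\ln 10^6 \approx 13.8\):

```python
import math

# Empirical illustration: the average gap between consecutive primes
# below N is comparable to ln N.

def primes_below(n):
    sieve = bytearray([1]) * n          # sieve of Eratosthenes
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(sieve[p * p::p]))
    return [i for i in range(n) if sieve[i]]

N = 10**6
ps = primes_below(N)
avg_gap = (ps[-1] - ps[0]) / (len(ps) - 1)
print(round(avg_gap, 2), round(math.log(N), 2))  # about 12.74 and 13.82
```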

**Pairs of Primes at Distance Much Lower Than the Average**

Let \(p_n\) denote the *n*-th prime, so that \(p_1=2\), \(p_2=3\), \(p_3=5\), \(p_4=7\), and so on. The prime number theorem states that \(p_{n+1}-p_n\) is, on average, about \(\ln p_n\). The twin primes conjecture states that \(p_{n+1}-p_n = 2\) for infinitely many values of *n*. To make progress towards it, can we at least prove that \(p_{n+1}-p_n\) is less than average infinitely often? That is, given some \(\varepsilon \in (0,1)\), can we prove that

- (*)
\(p_{n+1}-p_n \le \varepsilon \ln p_n\) for infinitely many values of *n*?

In 1926, Hardy and Littlewood proved (*) for \(\varepsilon = \frac{2}{3}\), assuming an unproven conjecture called the Generalized Riemann Hypothesis. Unconditionally, Erdős [140] proved in 1940 that (*) holds for *some* \(\varepsilon \in (0,1)\), but he did not provide an explicit value. In 1954, Ricci [320] proved (*) for \(\varepsilon = \frac{15}{16}\), and then there was a long chain of improvements, with the best result before 2009 being a 1988 theorem of Maier, [260] proving that (*) holds for \(\varepsilon \approx 0.2484\).

In 2009, Goldston, Pintz, and Yildirim [165] proved the following theorem.

### Theorem 9.9

For every \(\varepsilon > 0\), there exist infinitely many values of *n* such that \(p_{n+1}-p_n \le \varepsilon \ln p_n\).

Theorem 9.9 states that (*) holds for *any* \(\varepsilon >0\), no matter how small. In the authors’ words, “there exist consecutive primes which are closer than any arbitrarily small multiple of the average spacing”.

While Theorem 9.9 is huge progress compared to the previous results, it is still far from confirming the twin primes conjecture.

**Primes in Arithmetic Progressions: Dirichlet’s Theorem**

An *arithmetic progression* with first term *a* and difference *q* is a sequence of the form

\(a, \; a+q, \; a+2q, \; a+3q, \; \dots \qquad (9.10)\)

If both *a* and *q* are divisible by some \(r>1\), then all terms in (9.10) are divisible by *r*, hence (9.10) contains no primes at all, or possibly one prime which is equal to *r*. If there is no such *r*, the numbers *a* and *q* are called *relatively prime*. For example, 4 and 6 are not relatively prime, because they are both divisible by \(r=2\), while \(a=3\) and \(q=4\) are relatively prime. Dirichlet’s famous theorem states that the arithmetic progression (9.10) contains infinitely many primes whenever *a* and *q* are relatively prime. For example, with \(a=3\) and \(q=4\), this implies that the sequence \(S_1=3,7,11,15,19,23,27,31,35,\dots \) contains infinitely many primes, while with \(a=1\) and \(q=4\), we conclude that the sequence \(S_2=1,5,9,13,17,21,25,29,33,\dots \) contains infinitely many primes as well.
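Readers can observe Dirichlet's theorem experimentally. This Python sketch (our own illustration, using naive trial division) counts the primes below 10,000 in \(S_1\) and \(S_2\):

```python
# Small-scale illustration of Dirichlet's theorem for q = 4: apart from 2,
# every prime lies in S1 = 3, 7, 11, ... or S2 = 1, 5, 9, ..., and the two
# progressions receive roughly equal shares of the primes below 10000.

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

odd_primes = [p for p in range(3, 10000) if is_prime(p)]
s1 = sum(1 for p in odd_primes if p % 4 == 3)
s2 = sum(1 for p in odd_primes if p % 4 == 1)
print(s1, s2)  # two counts of similar size, summing to 1228
```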

**Primes in Arithmetic Progressions: The Elliott–Halberstam Conjecture**

By the prime number theorem, for any large *N*, there are about \(\frac{N}{\ln N}\) primes less than *N*, and we expect that about \(\frac{N}{2\ln N}\) of them belong to \(S_1\), and another \(\frac{N}{2\ln N}\) of them to \(S_2\). Also, by the prime number theorem, the *product* \(\Pi (N)\) of all primes less than *N* (for example, \(\Pi (12)=2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 = 2310\)) is approximately equal to \(e^N\), where \(e\approx 2.71828...\) is the base of the natural logarithm, and we expect that primes from \(S_1\) and \(S_2\) contribute approximately equally to this product, that is, \(p_1p_2 \dots p_k \approx p'_1p'_2 \dots p'_m \approx \sqrt{e^N}\), where \(p_1,p_2, \dots , p_k\) and \(p'_1,p'_2, \dots , p'_m\) are the primes less than *N* from \(S_1\) and \(S_2\), respectively. Equivalently, \(\ln (p_1p_2 \dots p_k) \approx \ln (p'_1p'_2 \dots p'_m) \approx \ln (\sqrt{e^N}) = N/2\). In other words, \(g(N, 4,3) \approx g(N, 4,1) \approx N/2\), where *g*(*N*, *q*, *a*) is the logarithm of the product of all primes less than *N* in the arithmetic progression (9.10). Similarly, for \(q=12\), we expect that primes are approximately uniformly distributed across the four arithmetic progressions (9.10) with \(a=1,5,7,11\) (these are all values of *a* less than 12 which are relatively prime with 12), and \(g(N, 12,1) \approx g(N, 12,5) \approx g(N, 12,7) \approx g(N, 12,11) \approx N/4\). For general *q*, we expect that

\(g(N, q, a_k) \approx \frac{N}{\phi (q)}, \quad k=1,2,\dots ,\phi (q), \qquad (9.11)\)

where \(a_1, a_2, \dots , a_{\phi (q)}\) are all values of *a* less than *q* which are relatively prime with *q*, and \(\phi (q)\) denotes the number of such *a*, for example, \(\phi (12)=4\). Equation (9.11) is equivalent to saying that \(|g(N,q, a_k)-\frac{N}{\phi (q)}|\) is “small” for \(k=1,2,\dots , \phi (q)\), or, equivalently, that the function \(h(N, q):=\max \limits _{1\le k \le \phi (q)}|g(N,q, a_k)-\frac{N}{\phi (q)}|\) is small. To guarantee that this happens for *all* *q* not exceeding some value *Q*, we need to have a good upper bound for the function \(H(N,Q):=\sum _{q \le Q} h(N, q)\).
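The heuristic \(g(N,4,3)\approx g(N,4,1)\approx N/2\) can be tested numerically. The following Python sketch (an illustration only; the naive primality test is ours) computes both quantities for \(N=100{,}000\):

```python
import math

# Numeric illustration (not a proof) of the heuristic above: g(N, q, a),
# the logarithm of the product of the primes below N lying in the
# progression a, a+q, a+2q, ..., should be roughly N / phi(q).

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def g(N, q, a):
    return sum(math.log(p) for p in range(2, N) if p % q == a and is_prime(p))

N = 100000
g3, g1 = g(N, 4, 3), g(N, 4, 1)
print(round(g3), round(g1), N // 2)  # both values are roughly 50000
```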

The Elliott–Halberstam conjecture predicts that, for any fixed \(v \in (0,1)\) and any constant \(A>0\), there exists a constant *C* such that

\(H(N, N^v) \le C\frac{N}{(\ln N)^{A}} \quad \text {for all } N \ge 2. \qquad (9.12)\)

Because *H*(*N*, *Q*) is an increasing function in *Q*, (9.12) becomes harder to prove as *v* increases. The best theorem in this direction is the famous Bombieri–Vinogradov theorem proving (9.12) for \(v \le 0.5\).

**Bounded Gaps Between Primes**

Goldston, Pintz, and Yildirim also proved that if (9.12) holds for *any* \(v>0.5\), even for \(v=0.50000001\), then there exist infinitely many values of *n* such that

\(p_{n+1}-p_n \le B, \qquad (9.13)\)

where *B* is a constant depending only on *v*. In particular, if one can prove (9.12) for \(v=0.971\), then one can choose \(B=16\). This result already looks close to the twin primes conjecture, which states that the same statement holds with \(B=2\). However, (9.12) is currently known to hold only for \(v \le 0.5\), which is just a little bit less than needed!

In a later work, Zhang [408] observed that in fact an even weaker version of (9.12) implies (9.13), and was able to prove this weaker version, establishing (9.13) with \(B=70{,}000{,}000\). This was later improved by Maynard and others, and (9.13) is now known to hold with \(B=246\), see [309].

### Reference

D. Goldston, J. Pintz, and C. Yildirim, Primes in tuples I, *Annals of Mathematics* **170**-2, (2009), 819–862.

## 9.10 A Proof of the B. and M. Shapiro Conjecture in Real Algebraic Geometry

**Bases and Linear Independence in** \({\mathbb R}^2\)

Any point *A* in the coordinate plane \({\mathbb R}^2\) can be described by two coordinates, \(x_A\) and \(y_A\). Any two points *B* and *A* define a *vector* \(\mathbf {BA}\), which is, geometrically, just an arrow connecting *B* with *A*. Algebraically, we say that the vector \(\mathbf {BA}\) has coordinates \((x_A-x_B, y_A-y_B)\), where \((x_B, y_B)\) and \((x_A, y_A)\) are coordinates of *B* and *A*, respectively. In particular, if \(O=(0,0)\) is the center of the coordinate plane, then the vector \(\mathbf {OA}\) has the same coordinates as *A*.

Vectors may be multiplied by constants using the rule \(\alpha (x, y)=(\alpha x, \alpha y)\). If \(A \ne O\), the set of all points *M* such that \(\mathbf {OM} = \alpha \mathbf {OA},\, \alpha \in {\mathbb R}\), is just a line passing through points *O* and *A*. If *B* is any point not on this line, then *any* vector \(\mathbf {OM}\) in the plane can be uniquely represented as a linear combination \(\alpha \mathbf {OA} + \beta \mathbf {OB}\) of \(\mathbf {OA}\) and \(\mathbf {OB}\), where addition is coordinate-wise. In this case, we say that the vectors \(\mathbf {OA}\) and \(\mathbf {OB}\) form a *basis* of the coordinate plane. For example, if *A* and *B* have coordinates (2, 0) and (1, 2), respectively, then any vector \(\mathbf {OM}\) with coordinates (*x*, *y*) can be uniquely represented as \((x, y)=\alpha (2,0)+\beta (1,2)=(2\alpha +\beta , 2\beta )\), see Fig. 9.10a, and the coefficients \(\alpha \) and \(\beta \) in this representation are given by \(\alpha =x/2-y/4\) and \(\beta =y/2\).

The condition “*B* is not on the line *OA*” is equivalent to “\(\mathbf {OB} \ne \alpha \mathbf {OA}\) for any \(\alpha \in {\mathbb R}\)”, or, equivalently, to \(x_Ay_B - x_By_A \ne 0\). For example, for \((x_A, y_A)=(2,0)\) and \((x_B, y_B)=(1,2)\) this reduces to \(2\cdot 2 - 1\cdot 0 \ne 0\). In this case, vectors with coordinates \((x_A, y_A)\) and \((x_B, y_B)\) are called *linearly independent*. In fact, two vectors in the plane form a basis if and only if they are linearly independent.
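The formulas for \(\alpha \) and \(\beta \) can be verified mechanically. Here is a small Python sketch (our own illustration) reconstructing several vectors from their coefficients in this basis:

```python
# Our own illustration: decompose (x, y) in the basis OA = (2, 0),
# OB = (1, 2) via the formulas alpha = x/2 - y/4, beta = y/2 derived above.

def decompose(x, y):
    alpha = x / 2 - y / 4
    beta = y / 2
    return alpha, beta

for (x, y) in [(4, 2), (-1, 3), (7, 5)]:
    a, b = decompose(x, y)
    # alpha*(2,0) + beta*(1,2) must reconstruct (x, y)
    assert (2 * a + b, 2 * b) == (x, y)
print("all decompositions verified")
```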

**Bases and Linear Independence in** \({\mathbb R}^3\)

Points in three-dimensional space \({\mathbb R}^3\) are described by three coordinates (*x*, *y*, *z*). More generally, a (real) *n*-dimensional vector is just a set of *n* real coordinates \((x_1,x_2,\dots , x_n)\). Any *k* such vectors \(\mathbf {a_1}, \dots , \mathbf {a_k}\) are called *linearly independent* if there are no real numbers \(\lambda _1, \dots , \lambda _k\), not all 0, such that \(\lambda _1 \mathbf {a_1} + \dots + \lambda _k \mathbf {a_k}=0\). For example, the vectors \(\mathbf {a_1}=(2,1,0)\), \(\mathbf {a_2}=(-1,2,0)\) and \(\mathbf {a_3}=(-1,-1,2)\) are linearly independent, and form a basis of \({\mathbb R}^3\), see Fig. 9.10b, while the vectors \(\mathbf {a_1}=(0,4,2)\), \(\mathbf {a_2}=(-2,1,2)\) and \(\mathbf {a_3}=(-2,3,3)\) are *not* linearly independent, because \(0.5\mathbf {a_1}+\mathbf {a_2}-\mathbf {a_3}=0\). In fact, all linear combinations of these vectors form a plane, see Fig. 9.10c, and they do *not* form a basis of \({\mathbb R}^3\).
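A standard way to test linear independence of three vectors in \({\mathbb R}^3\) is via the \(3\times 3\) determinant, which is nonzero exactly when the vectors form a basis. The following Python sketch (our own illustration) checks both examples above:

```python
# Our own illustration: three vectors in R^3 form a basis iff the 3x3
# determinant of their coordinates is nonzero.

def det3(u, v, w):
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

print(det3((2, 1, 0), (-1, 2, 0), (-1, -1, 2)))  # 10: independent, a basis
print(det3((0, 4, 2), (-2, 1, 2), (-2, 3, 3)))   # 0: dependent, not a basis
```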

**Polynomials in Real and Complex Variables**

The notion of linear independence can be studied not only for vectors, but for any mathematical “objects” which can be added and multiplied by constants, for example, polynomials. A real polynomial is any function of the form \( P(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_2 x^2 + a_1 x + a_0, \) where \(a_0, a_1, \dots , a_n\) are some real coefficients. A *root* of a polynomial is a solution to the equation \(P(x)=0\). For example, if \(n=2\), \(a_2 \ne 0\), the equation \(P(x)=0\) is a quadratic equation \(a_2 x^2 + a_1 x + a_0 = 0\), whose solutions are given by the formula \( x_{1,2} = \frac{-a_1 \pm \sqrt{a_1^2-4a_0a_2}}{2a_2}. \) In particular, real solution(s) exist if and only if \(a_1^2-4a_0a_2 \ge 0\). If \(a_1^2-4a_0a_2 = - D\) for some \(D>0\), then \( x_{1,2} = \frac{-a_1 \pm \sqrt{D}\sqrt{-1}}{2a_2} = \frac{-a_1 \pm \sqrt{D}i}{2a_2}, \) where *i* is just a notation for the square root of \(-1\) (which is not a real number). Numbers of the form \(z=a+bi\) for some real *a*, *b* are called *complex* numbers, see e.g. Sect. 1.7 for details. The set of all complex numbers is usually denoted by \({\mathbb C}\).

In particular, any quadratic equation *always* has complex roots. The fundamental theorem of algebra states that this remains correct for any equation of the form \(P(z)=0\), where *P*(*z*) is a complex polynomial, that is, an expression of the form \( P(z) = a_n z^n + a_{n-1} z^{n-1} + \dots + a_1 z + a_0, \) where *z* is a complex variable and \(a_0, a_1, \dots , a_n\) are complex coefficients.

**Linear Independence and Bases for Polynomials**

Two polynomials *P*(*z*) and *Q*(*z*) are called *linearly independent* if \(P(z) \ne \alpha Q(z)\) and \(Q(z) \ne \alpha P(z)\) for any complex number \(\alpha \). For example, \(P(z)=iz^2+(1-i)\) and \(Q(z)=-z^2+(1+i)\) are *not* linearly independent because \(Q(z)=iP(z)\). In contrast, \(P(z)=iz^2+z\) and \(Q(z)=z^2+iz\) are linearly independent, because \(\alpha (iz^2+z)=z^2+iz\), or \((\alpha i - 1)z^2 + (\alpha - i)z=0\) implies that \(\alpha i - 1=0\) and \(\alpha - i=0\), hence \(\alpha =-i\) and \(\alpha =i\), a contradiction.

Any *k* polynomials \(P_1(z), P_2(z), \dots , P_k(z)\) are called *linearly independent* if there are no complex numbers \(\lambda _1, \dots , \lambda _k\), not all 0, such that \(\lambda _1 P_1(z) + \dots + \lambda _k P_k(z)=0\). Let *S* be the set of polynomials which can be written as a linear combination of \(P_1(z), P_2(z), \dots , P_k(z)\), that is,

\( S = \{\lambda _1 P_1(z) + \dots + \lambda _k P_k(z) \, : \, \lambda _1, \dots , \lambda _k \in {\mathbb C}\}. \qquad (9.14)\)

If \(Q_1(z), Q_2(z), \dots , Q_k(z)\) are any *k* linearly independent polynomials belonging to *S*, then the set *S* can be equivalently written as \( S = \{\lambda _1 Q_1(z) + \dots + \lambda _k Q_k(z) \, : \, \lambda _1, \dots , \lambda _k \in {\mathbb C}\}, \) and we say that the polynomials \(Q_1(z), \dots , Q_k(z)\) form a *basis* for *S*.

**Looking for a Simpler Basis**

For example, if \(k=2\), \(P_1(z)=(1+i)z^2+(1-i)z+2\), and \(P_2(z)=(1-i)z^2+(1+i)z+2\), then the set *S* in (9.14) consists of polynomials of the form \(\lambda _1 P_1(z) + \lambda _2 P_2(z)\), \(\lambda _1, \lambda _2 \in {\mathbb C}\). Despite the complex coefficients of \(P_1\) and \(P_2\), this *S* has a simple basis: \(Q_1(z)=z^2+1\) and \(Q_2(z)=z+1\).

In general, it is convenient to represent *S* in (9.14) using as simple a basis as possible. In particular, what are sufficient conditions which guarantee the existence of a basis \(Q_1(z), Q_2(z), \dots , Q_k(z)\) such that all polynomials \(Q_i(z), \, i=1,\dots , k\) have only real coefficients?

**Sufficient Conditions for the Existence of a Basis with Real Coefficients**

To formulate the answer to this question, established in [285], we need more definitions. For any polynomial *P*(*z*), its *derivative* is a polynomial \(P'(z)\), uniquely determined by the rules (a) \((P+Q)'(z)=P'(z)+Q'(z)\) for all polynomials *P*, *Q*, (b) \((aP)'(z) = a P'(z)\) for every constant \(a \in {\mathbb C}\), and (c) \((z^k)'=kz^{k-1}\) for \(k=0,1,2,\dots \). The second derivative of *P*, denoted \(P^{(2)}(z)\), is the derivative of \(P'(z)\), and so on. For example, for \(P(z)=z^3+iz^2+2z-i\), \(P'(z)=3z^2+2iz+2\), \(P^{(2)}(z)=6z+2i\), \(P^{(3)}(z)=6\), and \(P^{(i)}(z)=0\) for all \(i \ge 4\).

For an arbitrary set of *k* polynomials \(P_1(z), P_2(z), \dots , P_k(z)\), a complex number \(z^*\) is called a root of its *Wronskian* if the vectors \(\mathbf {a_1}=(P_1(z^*), P_2(z^*), \dots , P_k(z^*))\), \(\mathbf {a_2}=(P'_1(z^*), P'_2(z^*), \dots , P'_k(z^*))\), \(\dots \), \(\mathbf {a_k}=(P^{(k-1)}_1(z^*), P^{(k-1)}_2(z^*), \dots , P^{(k-1)}_k(z^*))\) are linearly dependent, that is, \(\lambda _1 \mathbf {a_1} + \dots + \lambda _k \mathbf {a_k}=0\) for some complex numbers \(\lambda _1, \dots , \lambda _k\), not all 0.

### Theorem 9.10

If all roots of the Wronskian of a set of polynomials \(P_1(z), P_2(z), \dots , P_k(z)\) are real, then the set *S* defined in (9.14) has a basis consisting of polynomials with real coefficients.

In the example above with \(k=2\), \(P_1(z)=(1+i)z^2+(1-i)z+2\), \(P_2(z)= (1-i)z^2+(1+i)z+2\), \(z^* \in {\mathbb C}\) is a root of the Wronskian if the vectors \(\mathbf {a_1}=(P_1(z^*), P_2(z^*))\) and \(\mathbf {a_2}=(P'_1(z^*), P'_2(z^*))\) are linearly dependent, which is the case if \(P_1(z^*)P'_2(z^*) - P_2(z^*)P'_1(z^*) = 0\), where \(P'_1(z^*)=2(1+i)z^*+(1-i)\) and \(P'_2(z^*)=2(1-i)z^*+ (1+i)\). This simplifies to \(-4i(z^*)^2-8iz^*+4i=0\), or \((z^*)^2+2z^*-1=0\). Because this equation has only real roots, Theorem 9.10 guarantees that *S* in (9.14) has a basis consisting of polynomials with real coefficients. As we have seen above, this is indeed the case, and the basis is \(Q_1(z)=z^2+1\) and \(Q_2(z)=z+1\).
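The Wronskian computation can be double-checked with exact complex arithmetic. In this Python sketch (our own illustration, with polynomials stored as coefficient lists from degree 0 upwards), \(W=P_1P_2'-P_2P_1'\) comes out as \(-4i z^2-8iz+4i\), as claimed:

```python
# Our own verification of the Wronskian of P1(z) = (1+i)z^2 + (1-i)z + 2
# and P2(z) = (1-i)z^2 + (1+i)z + 2, using exact complex arithmetic.

def mul(p, q):  # polynomial product, coefficients listed from degree 0 up
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def diff(p):    # derivative: (a0 + a1 z + a2 z^2)' = a1 + 2 a2 z
    return [k * a for k, a in enumerate(p)][1:]

def sub(p, q):
    return [a - b for a, b in zip(p, q)]

P1 = [2, 1 - 1j, 1 + 1j]
P2 = [2, 1 + 1j, 1 - 1j]
W = sub(mul(P1, diff(P2)), mul(P2, diff(P1)))
print(W)  # [4j, -8j, -4j, 0]: i.e. 4i - 8iz - 4iz^2, so z^2 + 2z - 1 = 0
```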

In fact, the \(k=2\) case of Theorem 9.10 was resolved in 2002, see Sect. 2.1, but the general case remained open until 2009. In its general form, Theorem 9.10 confirms a conjecture known as the “B. and M. Shapiro conjecture”, which has a number of equivalent formulations, and many important consequences, especially in the field of mathematics called “real algebraic geometry”.

### Reference

E. Mukhin, V. Tarasov, and A. Varchenko, The B. and M. Shapiro conjecture in real algebraic geometry and the Bethe ansatz, *Annals of Mathematics* **170**-2, (2009), 863–881.

## 9.11 Bounding Diagonal Ramsey Numbers

**Looking for a Monochromatic Triangle**

Suppose that there are six people in a room. Can we always find either three people who all know each other or three people who all do not know each other? To analyse questions like this, it is convenient to represent people as points in the plane, and then connect the points by a blue line for any pair of people who know each other, and by a red line for any pair who do not know each other. Then we have 6 points, each pair connected by either a red or a blue line, and the question is: can we always find either a red or a blue triangle?

The answer is “Yes”. Indeed, from any point *A* we draw 5 lines, hence at least 3 of them should have the same colour, say, blue. Let *A* be connected by blue lines to points *B*, *C*, and *D*. If any of the lines *BC*, *CD*, or *DB* is blue, then we have a blue triangle (for example, if *BC* is blue, then the blue triangle is *ABC*, and so on). Otherwise all the lines *BC*, *CD*, and *DB* are red, hence we have a red triangle *BCD*. In Fig. 9.11a you can see that you will get a monochromatic triangle after any colouring of *BD*.

What if we have just 5 people instead of 6? Then the answer to the same question is “No”. Let us label the people (and the corresponding points) *A*, *B*, *C*, *D*, and *E*, and let the lines *AB*, *BC*, *CD*, *DE*, and *EA* be blue, and the lines *AC*, *CE*, *EB*, *BD* and *DA* red, see Fig. 9.11b. It is easy to check that, in this case, neither a red nor a blue triangle exists. In fact, this colouring is “unique up to relabelling”, that is, in any set of 5 points connected by red or blue lines without red and blue triangles, we can always give the points names *A*, *B*, *C*, *D*, and *E* in such a way that the colouring becomes exactly as described above.
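Both claims, that 6 points always contain a monochromatic triangle while the 5-point colouring above avoids one, are small enough to verify by exhaustive computer search. Here is an illustrative Python sketch (the encoding of colourings is ours):

```python
from itertools import combinations, product

# Exhaustive verification (brute force, small cases only) that R(3,3) = 6:
# every red/blue colouring of the 15 lines between 6 points contains a
# monochromatic triangle, while the pentagon colouring of 5 points avoids one.

def has_mono_triangle(n, colour):
    return any(colour[(a, b)] == colour[(a, c)] == colour[(b, c)]
               for a, b, c in combinations(range(n), 3))

edges6 = list(combinations(range(6), 2))
six_ok = all(has_mono_triangle(6, dict(zip(edges6, bits)))
             for bits in product("rb", repeat=15))

# 5 points: blue pentagon sides, red diagonals -- no monochromatic triangle
pent = {(a, b): ("b" if min(b - a, 5 - (b - a)) == 1 else "r")
        for a, b in combinations(range(5), 2)}
print(six_ok, has_mono_triangle(5, pent))  # True False
```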

**Looking for a Blue Triangle or a Red Quadruple**

A slightly more difficult problem is to prove that, in a group of 9 people, we can always find either three who all know each other, or *four* who all do not know each other. In other words, if 9 points are connected by red or blue lines, then either there exists a blue triangle, or there are 4 points all connected by red lines, which we will call a *red quadruple*. Indeed, if every point is adjacent to exactly 3 blue lines, then the total number of blue lines is \(9\cdot 3/2\), which is not an integer, a contradiction. Hence, there is a point *A* adjacent either to at least 4 blue lines or to at most 2. In the first case, let *A* be connected by blue lines to points *B*, *C*, *D* and *E*, see Fig. 9.11c. If any pair of them (say, *B* and *C*) is connected by a blue line as well, then we have a blue triangle (in this case, *ABC*). Otherwise points *B*, *C*, *D* and *E* form a red quadruple. In the second case, *A* is connected by blue lines to at most 2 points, hence there are at least 6 points to which it is connected by red lines, see Fig. 9.11d. As we have proved above, out of these 6 points we can always select a triangle, call it *BCD*, which is either red or blue. If it is blue, we have found a blue triangle. If it is red, then *ABCD* is a red quadruple.

**Looking for a Monochromatic Quadruple**

Similarly, we can prove that out of 18 points, connected by red or blue lines, we can always find either a red quadruple or a blue quadruple. Indeed, any point *A* is connected to 17 others, hence it is connected to at least 9 of them by lines of the same colour, say, blue. But we have just proved that in any set of 9 points we can always find either a blue triangle *BCD* (in which case *ABCD* is a blue quadruple), or a red quadruple.

What if we have just 17 points, can we always find either a red or a blue quadruple? It turns out, we cannot. Let us label the points \(A_1,A_2,\dots , A_{17}\), and position them in this order as the vertices of a regular 17-gon with unit side length. For any two points *A*, *B*, let *d*(*A*, *B*) be the distance of the “shortest path” between them while travelling along the 17-gon: for example, \(d(A_1,A_5)=4\) with shortest path \(A_1 \rightarrow A_2 \rightarrow A_3 \rightarrow A_4 \rightarrow A_5\), while \(d(A_2,A_{16})=3\) with shortest path \(A_2 \rightarrow A_1 \rightarrow A_{17} \rightarrow A_{16}\). Let us colour the line *AB* blue if *d*(*A*, *B*) is either 1, or 2, or 4, or 8, and red otherwise. In Fig. 9.11e only blue lines are depicted. Let us prove that there are no blue quadruples. Imagine we have one, with vertices (counter-clockwise) being *A*, *B*, *C*, *D*, and with \(d(A, B)=a\), \(d(B, C)=b\), \(d(C, D)=c\), and \(d(D, A)=d\). We can assume that \(\max \{a,b,c,d\}=d\). Then either \(a+b+c=d\) or \(a+b+c+d=17\). Because this is a blue quadruple, each of *a*, *b*, *c*, *d* is either 1, 2, 4, or 8, hence \(a+b+c+d=17\) is possible only if \(d=8\) and *a*, *b*, *c* are (in some order) 1, 4, 4. But then either \(d(A, C)=a+b=5\) or \(d(B, D)=b+c=5\), contradicting the fact that *AC* and *BD* are blue. Similarly, \(a+b+c=d\) is possible if (i) \(d=4\) and *a*, *b*, *c* are (in some order) 1, 1, 2 or (ii) \(d=8\) and *a*, *b*, *c* are (in some order) 2, 2, 4. In case (i), either \(d(A, C)=3\) or \(d(B, D)=3\), while in (ii), either \(d(A, C)=6\) or \(d(B, D)=6\), each case leading to a contradiction. The proof that there are no red quadruples is similar.
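The absence of monochromatic quadruples in this 17-point colouring can also be confirmed by brute force over all \(\binom{17}{4}=2380\) quadruples. Here is an illustrative Python sketch (our own encoding of the colouring):

```python
from itertools import combinations

# Brute-force check of the 17-point example: colour the pair {i, j} blue
# when the circular distance is 1, 2, 4 or 8, and verify that no 4 points
# are joined entirely in one colour.

def dist(i, j):
    d = abs(i - j) % 17
    return min(d, 17 - d)

def blue(i, j):
    return dist(i, j) in (1, 2, 4, 8)

mono = [q for q in combinations(range(17), 4)
        if len({blue(a, b) for a, b in combinations(q, 2)}) == 1]
print(len(mono))  # 0: neither a red nor a blue quadruple exists
```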

In fact, Evans, Pulham and Sheehan [143] proved in 1981 that the colouring described above (which is called *the Paley graph of order 17*) is “unique up to relabelling”. In any other red-blue colouring of lines between 17 points (and there are about \(2.46 \times 10^{26}\) such colourings up to relabelling) we can always find either a red or a blue quadruple.

**Diagonal Ramsey Numbers and Alien Invasions**

In general, Ramsey [314] proved in 1930 that, for every *n*, there exists an *N* such that, if *N* points are connected by red or blue lines, then there exist either *n* points all connected by red lines, or *n* points all connected by blue lines. The minimal number *N* with this property is called *the diagonal Ramsey number* *R*(*n*, *n*). It is trivial that \(R(2,2)=2\), and we have just proved that \(R(3,3)=6\) and \(R(4,4)=18\). One might guess that we can find *R*(5, 5) by a similar not-so-complicated argument, but in fact determining *R*(5, 5) remains an open problem despite all efforts, including an extensive computer search. The famous mathematician Paul Erdős said that, if an alien force, much more powerful than human civilization, contacted us and said that they would destroy the planet unless we tell them *R*(5, 5), then we could unite all mathematicians and all computer power in the world to solve the problem. However, if they asked us to determine *R*(6, 6), we would have a better chance to destroy the aliens...

**A Superpolynomial Improvement**

Since computing *R*(*n*, *n*) is so difficult, can we at least have some estimates? Erdős and Szekeres [137] proved in 1935 that

\(R(n,n) \le \binom{2n-2}{n-1} = \frac{(2n-2)!}{(n-1)! \, (n-1)!}.\)

For *R*(3, 3), this bound gives \(R(3,3)\le \frac{4!}{2! \cdot 2!}=\frac{1\cdot 2\cdot 3\cdot 4}{1\cdot 2\cdot 1\cdot 2}=6\), which is the exact value, while for *R*(4, 4), it gives \(R(4,4)\le \frac{6!}{3! \cdot 3!}=\frac{1\cdot 2\cdot 3\cdot 4\cdot 5\cdot 6}{1\cdot 2\cdot 3\cdot 1\cdot 2\cdot 3}=20\), which is close to the correct value \(R(4,4)=18\). However, as *n* grows, the gap between the bound and the exact value seems to grow, hence a better bound is desirable.

In 1987, Thomason proved the bound \(R(n,n) \le n^{-1/2+A/\sqrt{\ln n}} \binom{2n-2}{n-1}\), where *A* is an absolute constant. For large *n*, this bound is better than Erdős’ one by a factor of about \(\sqrt{n}\). After 1987, there were no further improvements for more than 20 years, until the following theorem [100] was proved in 2009.

### Theorem 9.11

There exists a constant *C* such that \(R(n,n) \le n^{-C \ln n/\ln \ln n} \binom{2n-2}{n-1}\) for all sufficiently large *n*.

No matter what the value of the constant *C* is, we can find *n* large enough so that \(C \ln n/\ln \ln n\) is larger than, say, 100.5, and, for such values of *n*, the bound in Theorem 9.11 is better than Thomason’s one by a factor about \(n^{100}\), and the same is true if 100 is replaced by any other constant. As mathematicians say, the bound in Theorem 9.11 gives a *superpolynomial improvement* compared to the previous ones.

Up to a constant factor *C*, Erdős’ estimate can be rewritten as \(R(n,n) \le C \frac{4^n}{\sqrt{n}}\), because \(\binom{2n-2}{n-1}\) grows like \(\frac{4^n}{\sqrt{n}}\) times a constant. In contrast, Theorem 9.11 implies the stronger bound \(R(n,n) \le C_k \frac{4^n}{n^k}\) for every fixed *k*, where \(C_k\) is a constant depending on *k*.

### Reference

D. Conlon, A new upper bound for diagonal Ramsey numbers, *Annals of Mathematics* **170**-2, (2009), 941–960.

## 9.12 An Almost Optimal Upper Bound for Moments of the Riemann Zeta Function

**A Short Paper of Riemann**

In mathematics, seemingly unrelated areas can sometimes become interconnected in an unexpected way. This happened, for example, with number theory and the theory of functions of a complex variable.

Let \(\pi (n)\) denote the number of primes not exceeding *n*. Based on tables of primes computed for large *n*, there was a conjecture that \(\pi (n) \approx \frac{n}{\ln n}\), or, more formally, that

\(\lim \limits _{n\rightarrow \infty } \frac{\pi (n)}{n/\ln n} = 1. \qquad (9.16)\)

In 1859, Riemann published a short paper related to this conjecture.

**Functions of a Complex Variable**

In this paper, Riemann suggested attacking conjecture (9.16) using methods from a completely different field, the theory of functions of a complex variable. Complex numbers are those of the form \(z=a+ib\), with *a* and *b* real, where *i* is an (imaginary) number such that \(i^2=-1\), see e.g. Sect. 1.7 for more details. These numbers were initially invented to solve equations like \(x^2+1=0\), which have no real solutions, but quickly found many other applications. Geometrically, a complex number \(z=a+ib\) can be represented as a point (*a*, *b*) in the coordinate plane. The distance \(\sqrt{a^2+b^2}\) from this point to the coordinate center is called the *absolute value* of *z* and denoted by |*z*|.

A function *f* of a complex variable assigns to each complex number *z* in some set \(D_f\) a complex number \(f(z)\); the set \(D_f\) on which *f* is defined is called the *domain* of *f*. For any \(z_0 \in D_f\), the *derivative* of *f* at \(z_0\), denoted by \(f'(z_0)\), is defined as \(f'(z_0)=\lim \limits _{z \rightarrow z_0} \frac{f(z)-f(z_0)}{z-z_0}\), provided that the limit exists. If *f* has a derivative at every point of its domain, it is called *holomorphic* on \(D_f\). For example, \(f(z)=z^2\) and \(f(z)=1/z\) are holomorphic functions, while, e.g. the function \(f(z)=|z|\) is not, because it has no derivative at \(z=0\).

**The Riemann Zeta Function**

Riemann noticed that there exists a *unique* function \(\zeta \) of a *complex* variable which (i) is defined for *all* complex numbers *z* except \(z=1\), (ii) is holomorphic, and (iii) satisfies \(\zeta (z)=\sum _{n=1}^\infty \frac{1}{n^z}\) for all *z* for which the infinite sum is well-defined. This function is called the *Riemann zeta function*. In particular, \(\zeta (s)\) is well-defined for real \(s<1\), for example, \(\zeta (0)=-\frac{1}{2}\), \(\zeta (-1)=-\frac{1}{12}\), which allows mathematicians to write various funny formulas like \(1+1+1+\dots =-\frac{1}{2}\), or \(1+2+3+4+\dots = -\frac{1}{12}\). Also, \(\zeta (-2)=\zeta (-4)=\zeta (-6)=\dots = 0\). Numbers of the form \(-2k\), \(k=1,2,3,\dots \), are called *trivial zeros* of \(\zeta \). All other complex numbers *z* such that \(\zeta (z)=0\) are called *non-trivial zeros*.

**The Riemann Hypothesis**

- (*)
If \(z=a+ib\) is a non-trivial zero of \(\zeta \), then \(a=1/2\).

The set of complex numbers \(z=a+ib\) with \(a=1/2\) forms a vertical line in the coordinate plane, called the *critical line*. Hence, (*) can be reformulated as the conjecture that all non-trivial zeros of \(\zeta \) lie on the critical line. Figure 9.12a depicts the first few zeros of \(\zeta \), and the critical line is drawn as a dotted line. Figure 9.12b depicts the real and imaginary parts of \(\zeta \) on the critical line, while Fig. 9.12c depicts its absolute value.

At first, (*) looked like a not-very-difficult-to-prove lemma, but Riemann was not able to find a rigorous justification. In 1896, Jacques Hadamard [182] proved a weaker statement “if \(z=a+ib\) is a non-trivial zero of \(\zeta \), then \(a \in [0,1]\)”, and was able to deduce (9.16) from it. However, it was clear that even better estimates for the distribution of primes would follow from (*). Statement (*) received the name *Riemann hypothesis* and gained the status of an important open problem. In 1900, Hilbert included it in his list [201] of 23 problems for 20th century mathematics. In 2000, the Clay Mathematics Institute included it in its list of 7 problems, offering a million-dollar prize for its solution. However, the problem is still open, and there is no sign that it will be solved in the near future.

**How Large is the Riemann Zeta Function on the Critical Line?**

For \(T>0\) and real \(k>0\), the quantity \(M_k(T) := \int _0^T |\zeta (1/2+it)|^{2k} \, dt\) measures how large \(|\zeta |\) is, on average, on the segment of the critical line with imaginary part between 0 and *T*; it is called the *k*-th moment of \(\zeta \).

**Lower and Upper Bounds for the ** *k* **-th Moment**

Assuming the Riemann hypothesis, Ramachandra proved the lower bound

\(M_k(T) \ge c_k T (\ln T)^{k^2} \qquad (9.17)\)

for all sufficiently large *T*, where \(c_k\) is a positive constant depending on *k*, and the best known upper bound for general *k* was much weaker, of the form \(M_k(T) \le T \exp \big (C k \ln T/\ln \ln T\big )\), where *C* is an absolute constant.

The following theorem [354], also assuming the Riemann hypothesis, provides a much better upper bound for all values of *k*. In fact, the established bound is “\(\varepsilon \)-close” to the lower bound of Ramachandra.

### Theorem 9.12

Assume the Riemann hypothesis, and let *k* be a positive real number. Then, for every \(\varepsilon >0\), there exists a constant \(C_{k,\varepsilon }\) such that \(M_k(T) \le C_{k,\varepsilon } \, T (\ln T)^{k^2+\varepsilon }\) for all sufficiently large *T*.
In a later work, Adam Harper [189] improved the bound in Theorem 9.12 and proved that \(M_k(T) \le D_k T(\ln T)^{k^2}\) for some constant \(D_k\). Together with (9.17), this resolves the question of how large \(|\zeta |\) is on the critical line, up to a constant factor. The “only” problem is that its resolution is, like hundreds of other important theorems in the field, subject to the correctness of the Riemann hypothesis. If it turns out to be false, the conclusions of all such theorems could be false as well.

### Reference

K. Soundararajan, Moments of the Riemann zeta function, *Annals of Mathematics* **170**-2, (2009), 981–993.

## 9.13 Optimal Lattice Sphere Packing in Dimension 24

**Vectors and Lattices**

Vectors in the plane are, geometrically, directed line segments, connecting an initial point *A* with a terminal point *B*, and usually denoted \(\mathbf {AB}\). In the coordinate plane, we say that \(\mathbf {AB}\) has coordinates \((x_B-x_A, y_B-y_A)\), where \((x_A, y_A)\) and \((x_B, y_B)\) are coordinates of *A* and *B*, respectively. In particular, if \(O=(0,0)\) is the coordinate center, then \(\mathbf {OA}\) has the same coordinates as *A*. Vectors can be added and multiplied by constants using rules \((x_1,y_1)+(x_2,y_2)=(x_1+x_2, y_1+y_2)\) and \(\lambda (x, y)=(\lambda x, \lambda y)\).

A *lattice* \({\mathscr {L}}={\mathscr {L}}_{A, B}\) in the plane is the set of points *X* such that \(\mathbf {OX} = k \mathbf {OA} + m \mathbf {OB}\), where *A* and *B* are some fixed points such that *O*, *A* and *B* are not on the same line, and *k*, *m* are integers. For example, if *A* and *B* have coordinates (1, 0) and (0, 1), respectively, then \(k (1,0) + m (0,1) = (k, m)\), hence the lattice \({\mathscr {L}}_{A, B}\) consists of all points with integer coefficients.

**Counting Lattice Points per Unit Area**

If we draw a circle of radius *R*, how many points of \({\mathscr {L}}_{A, B}\) does it contain? To estimate this number, which we denote by \(N({\mathscr {L}}_{A,B}, R)\), let us associate to every lattice point \(X=(k, m)\) the unit square \(U_X\) for which *X* is the left bottom vertex, that is, the square with vertex coordinates \((k, m), (k+1,m), (k+1,m+1), (k, m+1)\), see Fig. 9.13a. In most cases, a point *X* is inside the circle if and only if \(U_X\) is inside it as well. This is not true if *X* is near the boundary of the circle, but, for large *R*, the number of lattice points near the boundary is much less than \(N({\mathscr {L}}_{A,B}, R)\), and this boundary effect can be ignored. Now, the number of unit squares inside the circle is, again up to boundary effects, equal to the ratio \(S(\text {Circle})/S(U_X)\), where \( S(\text {Circle})=\pi R^2\) and \(S(U_X)\) are the areas bounded by the circle and \(U_X\), respectively. Hence, the circle contains roughly \(\pi R^2/S(U_X)\) lattice points, or, in other words, about \(1/S(U_X)\) lattice points on average per unit area. The quantity \(1/S(U_X)\) is called the *density* of the lattice \({\mathscr {L}}\) in the plane. In our case, \(U_X\) is a unit square, \(S(U_X)=1\), hence its density \(1/S(U_X)\) is also equal to 1: our lattice contains on average 1 point per unit area.
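This averaging argument is easy to check numerically. The following sketch (the function name is our own, not from the text) counts the points of the integer lattice inside a circle of radius *R* and compares the count with \(\pi R^2\); the discrepancy comes only from squares straddling the boundary, and grows much slower than the count itself.

```python
import math

def lattice_points_in_circle(R):
    """Count integer points (k, m) with k^2 + m^2 <= R^2."""
    return sum(1
               for k in range(-R, R + 1)
               for m in range(-R, R + 1)
               if k * k + m * m <= R * R)

# The density of the integer lattice is 1, so the count should be close to
# pi * R^2, with an error of order R coming from the boundary squares.
R = 100
print(lattice_points_in_circle(R))  # within O(R) of pi * R^2 ~ 31415.9
print(math.pi * R * R)
```

Running this for a few radii shows the relative error shrinking like \(1/R\), exactly the "boundary effect" dismissed above.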

**The Fundamental Parallelogram, and the Density of a Lattice**

If \(A=(1/2,0)\), \(B=(0,1/2)\), the lattice \({\mathscr {L}}_{A, B}\) consists of the points whose coordinates are either integers or half integers. In this case, every point \(X\in {\mathscr {L}}_{A, B}\) is a left bottom vertex of a square \(U_X\) with side length 0.5. Hence, \(S(U_X)=0.25\), and the density \(1/S(U_X)\) is equal to 4, that is, there are four points of the lattice per unit area. A slightly more complicated example is \(A=(3,0)\), \(B=(-2,1)\). Then \({\mathscr {L}}_{A, B}\) consists of points with coordinates \(k(3,0)+m(-2,1)=(3k-2m, m)=(3[k-m]+m, m)\), that is, of all points *X* with integer coordinates (*x*, *y*), such that \(x-y\) is a multiple of 3. In this case, *X* is a left bottom vertex of the parallelogram \(U_X\) with vertex coordinates \((x, y), (x+3,y), (x+1,y+1), (x-2,y+1)\), see Fig. 9.13b. The area \(S(U_X)\) and density \(1/S(U_X)\) are then equal to 3 and 1/3, respectively.

In general, the *fundamental parallelogram* *U* of a lattice \({\mathscr {L}}_{A, B}\) in the plane is the set of points *X* such that \(\mathbf {OX} = \alpha \mathbf {OA} + \beta \mathbf {OB}\), where \(\alpha \in [0,1]\) and \(\beta \in [0,1]\). In other words, *U* is the parallelogram with vertices *O*, *A*, *C*, *B*, where *C* has coordinates \((x_A+x_B, y_A+y_B)\). The whole lattice can be considered as the vertices of a tiling of the plane by copies of this parallelogram. The real number \(\rho ({\mathscr {L}}_{A, B}):=1/S(U)\) is called the *density* of \({\mathscr {L}}_{A, B}\).

**The Length of the Shortest Vector and Circle Packing**

Another important characteristic of a lattice is the length \(h({\mathscr {L}}_{A,B})\) of its shortest non-zero vector, that is, the minimum of \(|k \mathbf {OA} + m \mathbf {OB}|\) over all integers *k*, *m* except for \(k=m=0\). For example, for the lattice \({\mathscr {L}}_{A, B}\) defined by \(A=(1,0)\) and \(B=(0,1)\), we have \(h({\mathscr {L}}_{A,B})=\min \limits _{k, m}\sqrt{k^2+m^2}=1\). Similarly, if \(A=(1/2,0)\), \(B=(0,1/2)\), then \(h({\mathscr {L}}_{A, B})=1/2\), while in the lattice with \(A=(3,0)\), \(B=(-2,1)\), \(h({\mathscr {L}}_{A,B})=\min \limits _{k, m}\sqrt{(3k-2m)^2+m^2}=\sqrt{2}\), with the minimum achieved, for example, for \(k=m=1\).

In fact, \(h({\mathscr {L}}_{A,B})\) is also the minimal distance between *any* two points of the lattice, see Fig. 9.13b. Indeed, if \(\mathbf {OX} = k_1 \mathbf {OA} + m_1 \mathbf {OB}\) and \(\mathbf {OY} = k_2 \mathbf {OA} + m_2 \mathbf {OB}\), then \(\mathbf {XY} = (k_2-k_1) \mathbf {OA} + (m_2-m_1) \mathbf {OB}\) is itself a non-zero vector of the lattice, hence \(|\mathbf {XY}| \ge h({\mathscr {L}}_{A,B})\). This implies that the circles of radius \(h({\mathscr {L}})/2\) centered at the lattice points have non-intersecting interiors, and therefore form a *circle packing* of the plane. After rescaling so that each circle has radius 1, we get \(r({\mathscr {L}}):=(h({\mathscr {L}})/2)^2/S(U)\) circles per unit area. For example, let *O*, *A*, *B* form an equilateral triangle, that is, \(A=(1,0)\), \(B=(1/2,\sqrt{3}/2)\). Then \(S(U)=\sqrt{3}/2\), \(h({\mathscr {L}})=1\), and \(r({\mathscr {L}})=1^2/(4 \cdot \sqrt{3}/2) = 1/(2\sqrt{3}) \approx 0.29\).
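Both quantities are easy to compute by brute force. The sketch below (the function names and the search bound `K` are our own choices) recovers \(h({\mathscr {L}})\) by minimizing over a finite range of coefficients, and evaluates the planar ratio \((h({\mathscr {L}})/2)^2/S(U)\), the \(n=2\) case of the ratio \((h({\mathscr {L}})/2)^n/V(U_{\mathscr {L}})\) discussed below; here \(S(U)\) is the absolute value of the cross product of \(\mathbf {OA}\) and \(\mathbf {OB}\).

```python
import math

def shortest_vector(A, B, K=20):
    """Brute-force h(L_{A,B}): the minimum of |k*OA + m*OB| over
    non-zero integer pairs (k, m) with |k|, |m| <= K."""
    return min(math.hypot(k * A[0] + m * B[0], k * A[1] + m * B[1])
               for k in range(-K, K + 1)
               for m in range(-K, K + 1)
               if (k, m) != (0, 0))

def packing_ratio(A, B):
    """r(L) = (h/2)^2 / S(U), with S(U) = |cross product of OA and OB|."""
    area = abs(A[0] * B[1] - A[1] * B[0])
    return (shortest_vector(A, B) / 2) ** 2 / area

print(shortest_vector((3, 0), (-2, 1)))                # 1.414... = sqrt(2)
print(packing_ratio((1, 0), (0, 1)))                   # 0.25 (square lattice)
print(packing_ratio((1, 0), (0.5, math.sqrt(3) / 2)))  # 0.288... = 1/(2*sqrt(3))
```

The hexagonal lattice beating the square lattice here is no accident: it is the optimal lattice packing in the plane.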

**Lattice-Based Sphere Packings in Higher Dimensions**

The same question can be asked in any dimension. In dimension *n*, points and vectors are described by *n* coordinates, \((x_1, x_2, \dots , x_n)\), e.g. the coordinate center *O* is \((0,0,\dots , 0)\). The length \(|\mathbf {a}|\) of a vector \(\mathbf {a}=(x_1, x_2, \dots , x_n)\) is \(|\mathbf {a}|=\sqrt{x_1^2+x_2^2+\dots +x_n^2}\). Any *n* vectors \(\mathbf {a}_1, \dots , \mathbf {a}_n\) define a lattice \({\mathscr {L}}={\mathscr {L}}(\mathbf {a}_1, \dots , \mathbf {a}_n)\), which is the set of all points *X* such that \(\mathbf {OX} = k_1\mathbf {a}_1 + k_2\mathbf {a}_2 + \dots + k_n\mathbf {a}_n\) for some integers \(k_1, k_2, \dots , k_n\). The *fundamental parallelepiped* \(U_{\mathscr {L}}\) of a lattice \({\mathscr {L}}\) is the set of points *X* such that \(\mathbf {OX} = \beta _1\mathbf {a}_1 + \beta _2\mathbf {a}_2+ \dots +\beta _n\mathbf {a}_n\), where each \(\beta _i\) is a real number such that \(0 \le \beta _i \le 1\), \(i=1,2,\dots , n\). If the volume \(V(U_{\mathscr {L}})\) of \(U_{\mathscr {L}}\) in *n*-dimensional space is non-zero, \(1/V(U_{\mathscr {L}})\) has the meaning of the average number of points in \({\mathscr {L}}\) per unit volume.

Similarly, in dimension *n*, each lattice \({\mathscr {L}}\) gives us, on average, \(1/V(U_{\mathscr {L}})\) *n*-dimensional spheres per unit volume, each of radius \(h({\mathscr {L}})/2\), where \(h({\mathscr {L}})\) is again the length of the shortest non-zero vector of \({\mathscr {L}}\), such that the interiors of the spheres do not intersect. After scaling, this allows us to locate \((h({\mathscr {L}})/2)^n/V(U_{\mathscr {L}})\) non-intersecting spheres (per unit volume) of radius 1 each. This motivates the question of finding, for each dimension *n*, a lattice \({\mathscr {L}}\) with the ratio \(r({\mathscr {L}}):=(h({\mathscr {L}})/2)^n/V(U_{\mathscr {L}})\) as large as possible. The following theorem answers this question for \(n=24\).

### Theorem 9.13

In dimension \(n=24\), the maximal possible value of \(r({\mathscr {L}})\) is equal to 1.

It is interesting that, after dimensions \(1\le n \le 8\), the next resolved case is \(n=24\). The 24-dimensional lattice \({\mathscr {L}}\) with \(r({\mathscr {L}})=1\) was found by Leech in 1964, and is known as the Leech lattice. The contribution of Theorem 9.13 is the proof that no 24-dimensional lattice \({\mathscr {L}}\) has \(r({\mathscr {L}}) > 1\). Moreover, the authors proved that the Leech lattice is the only one with \(r({\mathscr {L}})=1\), up to scaling and isometries.

Theorem 9.13 implies that the sphere packing using the Leech lattice is the densest possible one among all lattice-based sphere packings in dimension 24. In a later work [96], Cohn, Kumar, Miller, Radchenko, and Viazovska proved that this sphere packing is in fact the densest possible one among all packings in dimension 24, not necessarily lattice-based ones.

### Reference

H. Cohn and A. Kumar, Optimality and uniqueness of the Leech lattice among lattices, *Annals of Mathematics* **170**-3, (2009), 1003–1050.

## 9.14 A Waring-Type Theorem for Large Finite Simple Groups

**Representing Integers as Sums of Perfect Powers**

One of the oldest classical topics in mathematics is representing integers as a sum of some “special” integers. For example, the Greek mathematician Diophantus, who lived in the 3rd century, was interested in representing integers as a sum of perfect squares, e.g. \(1=1^2\), \(2=1^2+1^2\), \(3=1^2+1^2+1^2\), \(4=2^2\), \(5=2^2+1^2\), \(6=2^2+1^2+1^2\), \(7=2^2+1^2+1^2+1^2\), and so on. You can see that we need at least 4 squares to represent 7. Diophantus was interested in the question of whether there exists a positive integer which requires at least 5 squares for such a representation, or if 4 squares always suffice. This question was answered by Lagrange in 1770. His celebrated four squares theorem states that four squares suffice: every positive integer *n* can be written as \(n=a^2+b^2+c^2+d^2\) for some integers *a*, *b*, *c*, *d*.
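Lagrange's theorem guarantees that an exhaustive search for such a representation always terminates. The following sketch (our own illustration; the function name is not from the text) finds one quadruple \(a \le b \le c \le d\) for any given *n*:

```python
import math

def four_squares(n):
    """Find a <= b <= c <= d with n = a^2 + b^2 + c^2 + d^2.
    By Lagrange's four squares theorem the search always succeeds."""
    r = math.isqrt(n)
    for a in range(r + 1):
        for b in range(a, r + 1):
            for c in range(b, r + 1):
                d2 = n - a * a - b * b - c * c
                if d2 >= c * c and math.isqrt(d2) ** 2 == d2:
                    return (a, b, c, math.isqrt(d2))

print(four_squares(7))  # (1, 1, 1, 2): seven really needs four squares
print(four_squares(2009))
```

Note that for \(n=7\) the search has to go all the way to four non-zero squares, matching the example in the text.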

In the same year that Lagrange proved his theorem, Edward Waring asked if similar results can be proved for cubes, fourth powers, and so on. That is, does there exist a positive integer \(N_3\) such that every positive integer *n* can be written as a sum of at most \(N_3\) cubes? More generally, for every *k*, does there exist an \(N_k\) such that every positive integer *n* can be written as a sum of at most \(N_k\) *k*-th powers? This question was answered positively by Hilbert [202] in 1909, and is known as the Hilbert–Waring theorem.

**What is the Minimal Number of *k*-th Powers We Will Need?**

It follows from Lagrange’s theorem that the statement “every positive integer *n* can be written as a sum of at most \(N_2\) squares” holds with \(N_2=4\), and the example of \(n=7\) shows that it does not hold for \(N_2=3\). In other words, 4 is the *minimal* number of squares sufficient to represent every integer. One may then ask for the *minimal* number of cubes, 4th powers, and so on. In general, let *g*(*k*) be the *minimal* number of *k*-th powers sufficient to represent every positive integer.

By 1912, Wieferich and Kempner [217, 403] had shown that every integer is the sum of at most 9 cubes. Because 23 cannot be represented as a sum of 8 cubes, this proves that \(g(3)=9\). Later, mathematicians proved that \(g(4)=19\), \(g(5)=37\), and so on. In fact, it is now known that \(g(k)=2^k+[(3/2)^k]-2\) for all values of *k*, except for possibly finitely many exceptions. Here, \([(3/2)^k]\) denotes the largest integer not exceeding \((3/2)^k\).
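A short dynamic program makes these values concrete: for each *n* it computes the minimal number of *k*-th powers needed, and the closed-form expression for *g*(*k*) can be checked directly (the function names and search bounds below are our own choices):

```python
def min_powers(n, k):
    """Minimal number of positive k-th powers summing to n (dynamic programming)."""
    best = [0] + [float("inf")] * n
    for i in range(1, n + 1):
        p = 1
        while p ** k <= i:
            best[i] = min(best[i], best[i - p ** k] + 1)
            p += 1
    return best[n]

def g_formula(k):
    """g(k) = 2^k + [(3/2)^k] - 2, valid with at most finitely many exceptions."""
    return 2 ** k + 3 ** k // 2 ** k - 2

print(min_powers(7, 2), g_formula(2))   # 4 4 -- seven needs four squares
print(min_powers(23, 3), g_formula(3))  # 9 9 -- twenty-three needs nine cubes
```

The same routine also illustrates the "sufficiently large *n*" phenomenon: the worst cases 7 and 23 occur early, and larger *n* tend to need fewer powers.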

While the representation of 23 requires 9 cubes, Linnik [246] proved in 1943 that all \(n>454\) can be represented as a sum of at most 7 cubes. Also, while \(g(4)=19\), it is known that all \(n>13792\) can be written as sums of at most 16 fourth powers. The question “for given *k*, what is the minimal number of *k*-th powers required to represent any *sufficiently large* *n*” remains an active area of research today.

**Representing Rotations as a Composition of Some “Special” Rotations**

Similar questions of the form “Can we represent an object using some “special” objects?” can be asked not only about integers, but in many areas of mathematics. In geometry, one may study rotations of the plane around some fixed center *O* by some arbitrary angle \(\alpha \). If we perform any such rotation \(R'\), and then another rotation \(R''\), the result is again a rotation, which we denote by \(R'' \circ R'\), and call the *composition* of \(R'\) and \(R''\). Let *S* be the set of rotations whose angle \(\alpha \) is a whole number *n* of degrees. Let us call a rotation from *S* a “perfect square” if it can be written as \(R \circ R\) for some \(R \in S\). For example, if \(R_1\) is a rotation clockwise by angle \(1^{\circ }\), then \(R_2 = R_1 \circ R_1\) is a rotation clockwise by angle \(2^{\circ }\), and, by definition, \(R_2\) is a perfect square. Now, by analogy with Lagrange’s theorem, one may ask if any rotation \(R \in S\) can be represented as a composition of such “perfect squares”. In fact, the answer is “no”. One can easily check that all “perfect squares” rotate the plane by an *even* number of degrees, and so do their compositions, hence any rotation by an odd number of degrees, such as \(R_1\), cannot be represented as such a composition.
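In code, a whole-degree rotation is just its angle modulo 360, and the parity obstruction is visible immediately (a small sketch of our own):

```python
def compose(a, b):
    """Composing rotations by a and b degrees gives rotation by (a + b) mod 360."""
    return (a + b) % 360

# The "perfect squares" are the rotations of the form R o R:
squares = {compose(a, a) for a in range(360)}
print(squares == set(range(0, 360, 2)))  # True: exactly the even-degree rotations

# Compositions of perfect squares stay even, so the 1-degree rotation is unreachable:
print(all(compose(x, y) in squares for x in squares for y in squares))  # True
print(1 in squares)  # False
```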

**Representing Permutations as a Composition of Some “Special” Permutations**

As another example, let us consider functions from some *finite* set *S* to itself. If *S* has *n* elements, we can enumerate them, and write *S* as \(\{1,2,\dots , n\}\). Then any function \(f:S\rightarrow S\) can be described by listing its values: \(f=(f(1), f(2), \dots , f(n))\). For example, if \(n=3\), \(S=\{1,2,3\}\), then the function \(f(x)=1\) (constant function) is written as (1, 1, 1), while the function \(f(x)=x\) is written as (1, 2, 3). If all *f*(*i*), \(i=1,2,\dots , n\), are different, then *f* is called a *permutation*. For example, (1, 1, 1) is not a permutation, while (1, 2, 3) is. In general, let \(S_n\) be the set of all permutations \(g:\{1,2,\dots , n\} \rightarrow \{1,2,\dots , n\}\).

The *composition* \(g \circ f\) of two functions *f* and *g* is the function *h* such that \(h(x)=g(f(x))\) for all *x*, and one can easily prove that the composition of any two permutations is again a permutation. Let us call a function \(g \in S_n\) a “perfect square” if \(g = f \circ f\) for some \(f \in S_n\). Can any \(h \in S_n\) be written as a composition of perfect squares? It turns out that it cannot. For \(n=3\), there are exactly 6 permutations: \(a=(2,1,3), b=(1,3,2), c=(2,3,1), d=(3,1,2), e=(1,2,3)\), and \(f=(3,2,1)\). One can check that \(a \circ a = b \circ b = e \circ e = f \circ f = e\) while \(c \circ c = d\) and \(d \circ d = c\), see Fig. 9.14, hence the perfect squares are *e*, *c* and *d*. Next, \(e \circ c = c \circ e = c\), \(e \circ d = d \circ e = d\), and \(c \circ d = d \circ c = e\), hence the composition of any two perfect squares is again a perfect square, and any permutation outside the set \(\{e,c, d\}\) cannot be represented in this way.
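The whole \(S_3\) computation takes only a few lines to verify. Below, a permutation *f* is stored as the tuple \((f(1), \dots , f(n))\), as in the text, and `compose` (our own helper) implements \(h(x)=g(f(x))\):

```python
from itertools import permutations

def compose(g, f):
    """(g o f)(x) = g(f(x)); permutations stored as tuples (f(1), ..., f(n))."""
    return tuple(g[f[x] - 1] for x in range(len(f)))

S3 = list(permutations((1, 2, 3)))
squares = {compose(f, f) for f in S3}
print(sorted(squares))  # [(1, 2, 3), (2, 3, 1), (3, 1, 2)] -- the set {e, c, d}

# The perfect squares are closed under composition, so no permutation
# outside {e, c, d} is a composition of perfect squares:
print(all(compose(x, y) in squares for x in squares for y in squares))  # True
```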

**Even Permutations**

In general, a permutation \(f \in S_n\) is called *even* if the number of pairs (*i*, *j*) such that \(i<j\) but \(f(i)>f(j)\) is even. In other words, a permutation is even if it exchanges the order of an even number of pairs (*i*, *j*). For example, the permutation *d* in Fig. 9.14, sending (1, 2, 3) to (3, 1, 2), exchanges the order in the pair (2, 3) (2 was on the left of 3 before permutation, but on the right of 3 after permutation), and in the pair (1, 3), but does not change the order in the pair (1, 2) (1 was on the left of 2 before permutation, and stays on the left of 2 after permutation). Hence, the total number of pairs with exchanged order is 2, an even number, and this permutation is an even permutation.

The set of all even permutations is usually denoted by \(A_n\). One can prove that all perfect squares always belong to \(A_n\), and so do all their compositions. Hence, there is no hope of representing every permutation \(f \in S_n\) as a composition of perfect squares. However, one may ask if at least every even permutation \(f \in A_n\) is representable in this way, and if so, how many perfect squares we would need for such a representation. The same question can be asked for cubes, and, more generally, for *k*-th powers for arbitrary *k*.
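Both observations are easy to test for small *n*. The sketch below (our own; note that for self-containedness it repeats the `compose` helper) counts inversions to decide parity, and checks that for \(n=4\) every perfect square is even; in fact, for \(n=4\) the perfect squares happen to exhaust \(A_4\):

```python
from itertools import permutations

def compose(g, f):
    """(g o f)(x) = g(f(x)); permutations as tuples (f(1), ..., f(n))."""
    return tuple(g[f[x] - 1] for x in range(len(f)))

def is_even(f):
    """Even iff the number of pairs i < j with f(i) > f(j) is even."""
    n = len(f)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if f[i] > f[j])
    return inversions % 2 == 0

print(is_even((3, 1, 2)))  # True: the permutation d has exactly 2 inversions

S4 = list(permutations((1, 2, 3, 4)))
A4 = {f for f in S4 if is_even(f)}
squares = {compose(f, f) for f in S4}
print(squares <= A4)  # True: every perfect square is an even permutation
print(squares == A4)  # True: for n = 4 every even permutation is itself a square
```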

**Groups, Subgroups, and Simple Groups**

In the above examples, we considered integers, rotations of the plane, and permutations. All these are examples of *groups*. A group is an arbitrary set *G* together with an operation \(\cdot \) such that (i) \(a\cdot b \in G\) for all \(a, b \in G\); (ii) \((a\cdot b)\cdot c = a\cdot (b\cdot c)\) for all \(a,b, c \in G\); (iii) there exists an \(e \in G\) (called the identity element of *G*) such that \(a\cdot e = e\cdot a = a\) for all \(a\in G\); and (iv) for every \(a\in G\), there exists an element \(a^{-1}\in G\) (called the *inverse* of *a*), such that \(a\cdot a^{-1} = a^{-1}\cdot a = e\). The set of integers forms a group (usually denoted by \({\mathbb Z}\)) with the addition operation \(+\), while rotations and permutations form groups with the composition operation \(\circ \).

A subset *H* of a group *G* is called a *subgroup* of *G* if (i) \(a\cdot b \in H\) for all \(a, b \in H\); (ii) \(e \in H\), where *e* is the identity element of *G*; and (iii) \(a^{-1}\in H\) for every \(a\in H\). For example, the set of all even integers is a subgroup of \({\mathbb Z}\), while \(A_n\) is a subgroup of \(S_n\). A subgroup *H* of a group *G* is called *trivial* if either \(H=G\), or \(H=\{e\}\), and *non-trivial* otherwise. A subgroup *H* of a group *G* is called *normal* if \(g\cdot a\cdot g^{-1}\in H\) for any \(a \in H\) and \(g \in G\). One can check that \(A_n\) is a normal subgroup of \(S_n\).

A group *G* is called *simple* if it does not have any non-trivial normal subgroups. For example, the group \(S_n\) is not simple, because it has a non-trivial normal subgroup \(A_n\). However, it turns out that, for \(n \ge 5\), \(A_n\) has no non-trivial normal subgroups, hence it is a simple group.

**Representing Group Elements as a Composition of Some “Special” Elements**

A *square* in a group *G* is any element \(a \in G\) which can be written as \(a=b \cdot b\) for some \(b \in G\). Similarly, \(a \in G\) is called a *k-th power*, if \(a=b \cdot b \cdot \dots \cdot b\) (*k* times) for some \(b \in G\). One may ask if every element \(a \in G\) can be written as a composition of squares, or, more generally, *k*-th powers. In general, the answer is “no”, because all squares (or *k*-th powers) may lie in some non-trivial normal subgroup *H* of *G*, and then all their compositions lie in *H* as well. This motivates us to study the same question for the case when *G* is a *simple* group. It turns out that in this case the answer is “yes”, and one may then look for the *minimal* number of squares (or *k*-th powers) needed for such a representation.

In fact, squares and *k*-th powers are just special cases of the general notion of group words. A *word* *w* is any finite string of symbols, possibly with repetitions and with inverse symbols, like *aaa* or \(aabbbbc^{-1}a^{-1}a^{-1}c\). Let *w* be a word with *d* different symbols \(s_1, s_2, \dots , s_d\), let *G* be a group, and let \(g_1, g_2, \dots , g_d \in G\) be any *d* elements of *G*. Then we define \(w(g_1, g_2, \dots , g_d)\) to be the result of (i) substituting \(g_1, g_2, \dots , g_d\) for \(s_1, s_2, \dots , s_d\) in *w*, respectively, and (ii) performing the group operation. Let *w*(*G*) denote the set of all elements \(g \in G\) representable in the form \(g = w(g_1, g_2, \dots , g_d)\) for some \(g_1, g_2, \dots , g_d \in G\). For example, the set of all squares in *G* is just *w*(*G*) for \(w=aa\), while the set of all *k*-th powers is *w*(*G*) for \(w=aa\dots a\) (*k* times).

One can then ask if every element of a simple group *G* can be represented as a composition of elements from *w*(*G*), and if so, how many elements from *w*(*G*) we need for this. The following theorem, proved in [345], states that, for sufficiently large (but finite) *G*, every \(g \in G\) is in fact a composition of just *three* elements from *w*(*G*)!

### Theorem 9.14

Let *w* be any non-empty word. Then there exists a positive integer *N*, depending only on *w*, such that for every finite simple group *G* with \(|G| \ge N\), every element \(g \in G\) can be represented as \(g=g_1 \cdot g_2 \cdot g_3\), where \(g_i \in w(G)\), \(i=1,2,3\).
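Theorem 9.14 is asymptotic, but its flavor can be seen already in the smallest simple alternating group. For \(G=A_5\) and the word \(w=aa\), a direct computation (our own sketch, with the same tuple representation of permutations as above) shows that not every element lies in *w*(*G*), yet products of just *two* elements of *w*(*G*) already cover the whole group:

```python
from itertools import permutations

def compose(g, f):
    """(g o f)(x) = g(f(x)); permutations as tuples (f(1), ..., f(n))."""
    return tuple(g[f[x] - 1] for x in range(len(f)))

def is_even(f):
    n = len(f)
    return sum(1 for i in range(n) for j in range(i + 1, n) if f[i] > f[j]) % 2 == 0

A5 = [f for f in permutations((1, 2, 3, 4, 5)) if is_even(f)]  # simple for n >= 5
w_of_G = {compose(f, f) for f in A5}          # w(G) for the word w = aa
print(len(A5), len(w_of_G))                   # 60 45: not every element is a square
pairs = {compose(g, h) for g in w_of_G for h in w_of_G}
print(pairs == set(A5))                       # True: two factors suffice for A_5
```

The missing 15 elements of \(A_5\) are the double transpositions, whose only square roots in \(S_5\) are odd 4-cycles; the theorem guarantees that for all sufficiently large finite simple groups, three factors always suffice, whatever the word.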

### Reference

A. Shalev, Word maps, conjugacy classes, and a noncommutative Waring-type theorem, *Annals of Mathematics* **170**-3, (2009), 1383–1416.