11 Eigenvalues, I: bound states of atoms

This is the second part of a two part article. There is one picture in this part and four pictures in Part 1. The reader should be sure to read the notational warnings near the end of Sect. 1 in Part 1. Just before those warnings is a summary of the organization of the full paper which includes the following about Part 2:

Part 2 begins with two pioneering works on aspects of bound states: his result on non-existence of positive energy bound states in certain two body systems and his paper on the infinity of bound states for Helium, at least for infinite nuclear mass.

Next come four sections on scattering and spectral theory, which discuss the Kato–Birman theory (trace class scattering), Kato smoothness, Kato–Kuroda eigenfunction expansions and the Jensen–Kato paper on threshold behavior.

Last is a set of three miscellaneous gems: his work on the adiabatic theorem, on the Trotter product formula and his pioneering look at eigenfunction regularity.

In a short companion paper [315] to his famous 1951 paper [314], Kato proved that

Theorem 11.1

(Kato [315]) The non-relativistic Helium atom with infinite nuclear mass has infinitely many bound states. With the physical masses, it has at least 25,585 bound states.

The number 25,585 seems unusual but it is just \(\sum _{j=1}^{42} j^2\) corresponding to the number of bound states in the first 42 complete shells of a Hydrogenic atom.
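As a quick sanity check on this count, the following sketch sums the shell degeneracies \(j^2\) directly and compares the result with the closed form \(\tfrac{1}{6}k(k+1)(2k+1)\). The function name `states_in_first_shells` is ours, introduced only for illustration.

```python
def states_in_first_shells(k: int) -> int:
    """Number of states in the first k complete hydrogenic shells.

    Shell j contributes j^2 states (spin is ignored, as in the text).
    """
    return sum(j * j for j in range(1, k + 1))


k = 42
direct = states_in_first_shells(k)
closed_form = k * (k + 1) * (2 * k + 1) // 6  # standard formula for 1^2 + ... + k^2
print(direct, closed_form)  # 25585 25585
```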

An operator like the Helium atom Hamiltonian typically has an essential spectrum, \([\Sigma ,\infty )\). (For an arbitrary self-adjoint operator, A, we define \(\Sigma (A) = \inf \{\lambda \,|\, \lambda \in \sigma _{ess}(A)\}\) where, we recall, \(\sigma _{ess}(A) =\sigma (A){\setminus }\sigma _d(A)\) and \(\sigma _d(A)\), the discrete spectrum, is the set of isolated points of \(\sigma (A)\), the spectrum, for which the spectral projection is finite dimensional; see Sect. 2 in Part 1.)

There may be one or more eigenvalues of A below \(\Sigma \), i.e., counting multiplicity, \(\{E_k\}_{k=1}^N,\, N \in \{0,1,2,\ldots \}\cup \{\infty \}\) where \(E_{j-1} \le E_j <\Sigma \). If \(N=\infty \), then \(\lim _{k \rightarrow \infty } E_k = \Sigma \).

Most modern approaches to results like Theorem 11.1 rely on the min–max principle [616, Theorem 3.14.5] which says that if A is self-adjoint and bounded from below, and if one defines

$$\begin{aligned} \mu _n(A) = \sup _{\psi _1,\ldots ,\psi _{n-1}} \left( \inf _{\begin{array}{c} \varphi \in D(A),\, ||\varphi ||=1 \\ \varphi \perp \psi _1,\ldots ,\psi _{n-1} \end{array}} \langle \varphi ,A\varphi \rangle \right) \end{aligned}$$
(11.1)

then \(\mu _j(A) = E_j(A)\) for \(j \le N\) and if \(N < \infty \), then for \(j > N\), \(\mu _j(A) = \Sigma (A)\). Instead, Kato notes the following

Lemma 11.2

Let A be a self-adjoint operator which is bounded from below and \(W \subset D(A)\) a subspace of dimension k so that

$$\begin{aligned} \sup _{\varphi \in W,\,||\varphi ||=1} \langle \varphi ,A\varphi \rangle = J \end{aligned}$$
(11.2)

then

$$\begin{aligned} \dim {\text {ran}}\, P_{(-\infty ,J]}(A) \ge k \end{aligned}$$
(11.3)

Remarks

  1. 1.

    \(P_\Omega (A)\) are the spectral projections of A, see [616, Sect. 5.1].

  2. 2.

    While Kato uses this lemma instead of the min–max principle, it should be emphasized that this lemma can be used to prove that principle!

Proof

Suppose that \(\dim {\text {ran}}\, P_{(-\infty ,J]}(A) < k\). Then, since \(\dim W = k\), we can find a unit vector \(\varphi \in W\) so that \(\varphi \perp {\text {ran}}\, P_{(-\infty ,J]}(A)\). Thus, by the spectral theorem, \(\langle \varphi ,A\varphi \rangle > J\), contrary to (11.2). \(\square \)
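To see Lemma 11.2 at work in the simplest possible setting, here is a small finite-dimensional sketch. It takes A to be a diagonal matrix, W a two-dimensional subspace, computes J of (11.2) as the larger eigenvalue of the compression of A to W, and counts the eigenvalues of A in \((-\infty ,J]\). The matrix, the vectors and the helper name `rayleigh_sup_2d` are toy choices of ours, not taken from Kato's paper.

```python
import math


def rayleigh_sup_2d(diag, u, v):
    """sup of <phi, A phi> over unit phi in span{u, v}, where A = diag(diag)."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def apply(a):
        return [d * x for d, x in zip(diag, a)]

    # Gram-Schmidt: orthonormal basis (e1, e2) of W = span{u, v}
    norm_u = math.sqrt(dot(u, u))
    e1 = [x / norm_u for x in u]
    c = dot(e1, v)
    w = [y - c * x for x, y in zip(e1, v)]
    norm_w = math.sqrt(dot(w, w))
    e2 = [x / norm_w for x in w]

    # 2x2 compression of A to W; its larger eigenvalue is the sup in (11.2)
    a, b, d = dot(e1, apply(e1)), dot(e1, apply(e2)), dot(e2, apply(e2))
    return 0.5 * (a + d) + math.sqrt(0.25 * (a - d) ** 2 + b * b)


eigs = [-3.0, -2.0, -1.0, 0.5, 2.0]   # spectrum of the toy operator A
u = [1.0, 0.2, 0.0, 0.1, 0.0]         # trial vectors spanning W (k = 2)
v = [0.0, 1.0, 0.3, 0.0, 0.1]
J = rayleigh_sup_2d(eigs, u, v)
count = sum(1 for e in eigs if e <= J)  # dim ran P_{(-inf, J]}(A)
print(count >= 2)  # True, as (11.3) requires
```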

For Kato, \(\Sigma \) is defined not in terms of essential spectrum but by

$$\begin{aligned} \Sigma = \inf \{\lambda \,|\,\dim {\text {ran}}\, P_{(-\infty ,\lambda )}(A) = \infty \} \end{aligned}$$
(11.4)

although it is the same. His strategy is simple.

  1. (1)

    Get a lower bound, \(\Sigma _0\), on \(\Sigma \).

  2. (2)

    Find a k-dimensional subspace, W, and a J given by (11.2) which obeys \(J < \Sigma _0\). By (11.4), \(\dim {\text {ran}}\, P_{(-\infty ,J]}(A) < \infty \) and by the lemma, it is at least k, so there must be at least k discrete eigenvalues, counting multiplicity, in \((-\infty ,J]\).

Let’s discuss first the case where the nuclear mass is infinite. The Hamiltonian in suitable units is

$$\begin{aligned} H=-\Delta _1-\Delta _2 - \frac{2}{r_1}-\frac{2}{r_2}+\frac{1}{|\varvec{r_1}-\varvec{r_2}|} \end{aligned}$$
(11.5)

on \(L^2({\mathbb {R}}^6,d^6 x)\) where \(x=(\varvec{r_1},\varvec{r_2}),\,\varvec{r_j} \in {\mathbb {R}}^3\). Kato then considers

$$\begin{aligned} \tilde{H} = H - \frac{1}{|\varvec{r_1}-\varvec{r_2}|} = h\otimes {\varvec{1}}+{\varvec{1}}\otimes h \end{aligned}$$
(11.6)

where

$$\begin{aligned} h=-\Delta -\frac{2}{r} \end{aligned}$$
(11.7)

He talks about “two independent Hydrogen like atoms” rather than tensor products, but it is the same thing. Thus the spectrum of \(\tilde{H} \) is \(\{\lambda _1+\lambda _2\,|\, \lambda _1,\lambda _2 \in \sigma (h)\}\). Since \(\sigma (h) = \{-1/n^2\}_{n=1}^\infty \cup [0,\infty )\), we see that \(\Sigma (\tilde{H}) = -1\). Since \(H \ge \tilde{H}\), we conclude that

$$\begin{aligned} \Sigma (H) \ge -1 \equiv \Sigma _0 \end{aligned}$$
(11.8)

(we’ll eventually see that this is actually equality). This concludes Step 1 in this infinite nuclear mass case.

Kato next picked the subspace, W, of trial functions. Let \(\varphi _0\) be the ground state of h, i.e.

$$\begin{aligned} h\varphi _0 = - \varphi _0 \end{aligned}$$
(11.9)

Kato notes the explicit formula, \(\varphi _0(\varvec{x}) = \pi ^{-1/2}e^{-|x|}\), but other than the fact that it is spherically symmetric, the exact formula plays no role. He picks \(W=\{\varphi _0\otimes \eta \,|\, \eta \in W_1\}\) where \(W_1\) will be a suitable subspace of \(L^2({\mathbb {R}}^3)\), i.e. \(\varphi (\varvec{x_1} ,\varvec{x_2}) = \varphi _0(\varvec{x_1})\eta (\varvec{x_2})\).

One easily computes that

$$\begin{aligned} \langle \varphi ,H\varphi \rangle = -1+ \langle \eta ,(-\Delta +Q(x))\eta \rangle \end{aligned}$$
(11.10)

where

$$\begin{aligned} Q(x) = -\frac{2}{|x|}+\int |\varphi _0(y)|^2 \frac{1}{|x-y|}\,d^3y \end{aligned}$$
(11.11)

The second term in (11.11) is the gravitational potential of a spherically symmetric “mass distribution” \(|\varphi _0(y)|^2d^3y\), and this was computed by Newton, who showed that

$$\begin{aligned} \int _{S^2} \frac{d\omega }{|r\omega -\varvec{x}|} = \frac{1}{\max (|x|,r)} \end{aligned}$$
(11.12)

(where \(d\omega \) is normalized measure on the unit 2-sphere). Thus

$$\begin{aligned} Q(x)&= -\frac{2}{|x|} + \int |\varphi _0(y)|^2 \frac{1}{\max (|x|,|y|)}\,d^3y \nonumber \\&\le -\frac{1}{|x| } \end{aligned}$$
(11.13)

since \(\max (|x|,|y|) \ge |x|\). Thus

$$\begin{aligned} \langle \varphi ,H\varphi \rangle \le -1+\langle \eta ,(-\Delta -1/r)\eta \rangle \end{aligned}$$
(11.14)
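The bound (11.13) can be checked numerically. The sketch below evaluates the radial integral in (11.13) for \(\varphi _0 = \pi ^{-1/2}e^{-|y|}\) by a midpoint rule and compares it with the closed form \(Q(x) = -1/|x| - e^{-2|x|}(1+1/|x|)\), which follows from an elementary integration. The quadrature parameters (`steps`, `cutoff`) and the function names are our own choices, and the closed form is our computation rather than a formula quoted from Kato.

```python
import math


def hartree_term(r, steps=20000, cutoff=30.0):
    """Midpoint-rule value of the integral of |phi0(y)|^2 / max(r, |y|) over R^3."""
    h = cutoff / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        # radial density 4 s^2 e^{-2s} (so that it integrates to 1) times 1/max(r, s)
        total += 4.0 * s * s * math.exp(-2.0 * s) / max(r, s) * h
    return total


def Q(r):
    """Effective potential (11.13) at |x| = r, computed numerically."""
    return -2.0 / r + hartree_term(r)


def Q_closed(r):
    """Closed form of the same quantity (our derivation)."""
    return -1.0 / r - math.exp(-2.0 * r) * (1.0 + 1.0 / r)


for r in (0.3, 1.0, 2.5):
    # numeric Q, closed form, and the upper bound -1/r from (11.13)
    print(r, Q(r), Q_closed(r), -1.0 / r)
```

In each row the first two values agree to quadrature accuracy and both lie below the last one, which is the content of (11.13).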

Picking \(\eta \) in the space of dimension \(\tfrac{1}{6}k(k+1)(2k+1)\) of linear combinations of eigenfunctions of \(-\Delta -1/r\) of energies \(\{-\tfrac{1}{4j^2}\}_{j=1}^k\), we see that the J of (11.2) is \(-1-(1/4k^2) < \Sigma _0\), so there are infinitely many eigenvalues below \(\Sigma _0\) (which also shows that \(\Sigma =\Sigma _0\)).

If one now considers a nucleus of mass M and electrons of mass m, the Hamiltonian with the center of mass motion removed becomes [instead of (11.5)]

$$\begin{aligned} H=-\Delta _1-\Delta _2 - 2\alpha \varvec{\nabla _1}\cdot \varvec{\nabla _2}- \frac{2}{r_1}-\frac{2}{r_2}+\frac{1}{|\varvec{r_1}-\varvec{r_2}|} \end{aligned}$$
(11.15)

where

$$\begin{aligned} \alpha = \frac{m}{M+m} \end{aligned}$$
(11.16)

The extra \(2\alpha \varvec{\nabla _1}\cdot \varvec{\nabla _2}\) term, called the Hughes–Eckart term (after [261]), is present if one uses atomic coordinates, \(\mathbf {r}_j={\mathbf {x}}_j-{\mathbf {x}}_3;\,j=1,2\), where \({\mathbf {x}}_j\) is the coordinate of electron j and \({\mathbf {x}}_3\) is the nuclear position (we’ll say a lot about such N-body kinematics below).

The second step in the proof is unchanged. Since \(\langle \varphi _0,\varvec{\nabla }\varphi _0 \rangle =0\) (by either the reality of \(\varphi _0\) or its spherical symmetry), the Hughes–Eckart term contributes nothing to the calculation of \(\langle \varphi ,H\varphi \rangle \) and we get a subspace of trial functions of dimension \(\tfrac{1}{6}k(k+1)(2k+1)\) with \(J_k=-1-1/4k^2\).

Here is how Kato estimated \(\Sigma \) in this case. With \(p_j=-i\nabla _j\), one can write:

$$\begin{aligned} \varvec{p}_1^2+\varvec{p}_2^2+2\alpha \varvec{p}_1\cdot \varvec{p}_2=\alpha (\varvec{p}_1+\varvec{p}_2)^2+(1-\alpha ) (\varvec{p}_1^2+\varvec{p}_2^2) \end{aligned}$$
(11.17)

Since \(|\varvec{r}_1-\varvec{r}_2|^{-1} \ge 0\) and \(\alpha (\varvec{p}_1+\varvec{p}_2)^2 \ge 0\), we see that

$$\begin{aligned} H \ge (1-\alpha )(-\Delta _1-\Delta _2)-\frac{2}{r_1}-\frac{2}{r_2} \equiv H_{Kato} \end{aligned}$$
(11.18)
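The algebra behind (11.17) and the resulting lower bound in (11.18) can be spot-checked on the level of symbols, replacing the operators \(\varvec{p}_j\) by vectors in \({\mathbb {R}}^3\). The sketch below does this for random vectors and random \(\alpha \in [0,1]\); the sampling scheme and function names are ours.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def kinetic_symbol(p1, p2, alpha):
    """Left side of (11.17): p1^2 + p2^2 + 2 alpha p1.p2."""
    return dot(p1, p1) + dot(p2, p2) + 2 * alpha * dot(p1, p2)


def decomposed_symbol(p1, p2, alpha):
    """Right side of (11.17): alpha (p1+p2)^2 + (1-alpha)(p1^2+p2^2)."""
    total = [x + y for x, y in zip(p1, p2)]
    return alpha * dot(total, total) + (1 - alpha) * (dot(p1, p1) + dot(p2, p2))


rng = random.Random(0)  # fixed seed so the check is reproducible
max_gap = 0.0
for _ in range(1000):
    p1 = [rng.uniform(-1, 1) for _ in range(3)]
    p2 = [rng.uniform(-1, 1) for _ in range(3)]
    alpha = rng.uniform(0, 1)
    gap = abs(kinetic_symbol(p1, p2, alpha) - decomposed_symbol(p1, p2, alpha))
    max_gap = max(max_gap, gap)
print(max_gap < 1e-12)  # True
```

Dropping the non-negative term \(\alpha (\varvec{p}_1+\varvec{p}_2)^2\) from the right side is exactly what produces the kinetic part of \(H_{Kato}\).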

As in the infinite mass case, \(H_{Kato}\) is a sum of independent Hydrogen like atoms, so one finds that

$$\begin{aligned} \Sigma \ge \Sigma _0 = \Sigma (H_{Kato})=-\frac{1}{1-\alpha } \end{aligned}$$
(11.19)

Putting in the physical value of \(\alpha \) [i.e. (11.16) with \(M=\)Helium nuclear mass and \(m=\)electron mass], one finds that

$$\begin{aligned} \Sigma _0 \ge -1-1/4k^2 \text { if } k \le 42 \end{aligned}$$
(11.20)

so Kato concluded there were at least 42 shells and got the number 25,585 of Theorem 11.1.
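The arithmetic behind the number 42 can be reproduced as follows. The sketch finds the largest k with \(J_k = -1-1/4k^2 < \Sigma _0 = -1/(1-\alpha )\) and then counts the states in the first k shells. The nucleus-to-electron mass ratio used here, about 7294.3 for the Helium-4 nucleus, is an approximate physical value that we supply as an assumption; it does not appear in the text above.

```python
from fractions import Fraction

# Assumed value (not from the text): Helium-4 nucleus to electron mass ratio M/m.
MASS_RATIO = Fraction(72943, 10)

alpha = 1 / (MASS_RATIO + 1)   # (11.16): alpha = m / (M + m)
sigma0 = -1 / (1 - alpha)      # (11.19): lower bound on the bottom of the essential spectrum


def J(k: int) -> Fraction:
    """Trial-space energy bound J_k = -1 - 1/(4 k^2)."""
    return -1 - Fraction(1, 4 * k * k)


k = 1
while J(k + 1) < sigma0:       # keep adding shells while J_{k+1} stays below sigma0
    k += 1
shells = k
states = shells * (shells + 1) * (2 * shells + 1) // 6
print(shells, states)  # 42 25585
```

Exact rational arithmetic is used so that the comparison near the cutoff does not depend on floating-point rounding. With this mass ratio the condition reduces to \(4k^2 < M/m\), which holds for \(k=42\) and fails for \(k=43\).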

Remarks

  1. 1.

    As Kato emphasized, before his work, it wasn’t proven that the Helium Hamiltonian had any bound states!

  2. 2.

    Kato ignored both spin (the Hamiltonian is spin-independent but each electron has two spin states, so on \(L^2({\mathbb {R}}^{3N};{\mathbb {C}}^2\otimes {\mathbb {C}}^2,d^{3N}x)\) there are 4 times as many states) and statistics (the Pauli principle, which, as interpreted by Fermi and Dirac, says the total wave function is antisymmetric under interchange of a pair of particles in both spin and space). H is symmetric under interchange of the two electrons in space alone, so its eigenfunctions can be chosen to be either symmetric or antisymmetric under spatial interchange. Kato’s trial functions are neither but the lower bound, \(N_{Kato}\) that he gets provides a lower bound on \(N_S+N_A\), the sum of the spatially symmetric and spatially antisymmetric functions. To get a state totally antisymmetric under interchange of space and spin, each spatially symmetric wave function is multiplied by a spin 0 state (multiplicity 1) and each spatially antisymmetric state is multiplied by a spin 1 state (multiplicity 3). So taking into account both spin and statistics, the total number of states is \(N_S+3N_A\) so

    $$\begin{aligned} N_S+N_A \le N_S+3N_A \le 3(N_S+N_A) \end{aligned}$$
    (11.21)

    In particular, \(N_{Kato}\) is a lower bound on \(N_S+3N_A\), so Kato’s estimates are lower bounds even if one properly takes into account spin and statistics.

  3. 3.

    Even in the infinite mass case, Kato’s method doesn’t work for three electron atoms. The problem is with his estimate of \(\Sigma \). If one drops the repulsion of electron 3 from both 1 and 2, one gets an independent sum of an ion and a charge 3 Hydrogen like atom. The bottom of the essential spectrum of such a system is actually twice the ground state energy of the charge 3 Hydrogen like atom, which is below the energy of the ion, where one expects (and we actually know) the bottom of the essential spectrum really is.

This completes our description of Kato’s paper. To go beyond it, one realizes that the weak point of his analysis (as seen in Remark 3 above) is the lack of an efficient way of estimating the bottom of the continuous spectrum. As a preliminary to discussing this bottom, we pause to present some N-body kinematics, an issue that already entered when we discussed the Hughes–Eckart term above. We’ll be more expansive than absolutely necessary, in part, because we’ll need this when we briefly turn to N-body scattering in Sects. 13–15 and, in part, because the elegant formalism, which I learned from Sigalov–Sigal [566] (see also Hunziker–Sigal [264]), deserves to be better known.

Given N particles \((\varvec{r}_1,\ldots ,\varvec{r}_N)\) with masses \(m_1,\ldots ,m_N\), we consider the inner product

$$\begin{aligned} \langle r^{(1)},r^{(2)} \rangle = \sum _{j=1}^{N} m_j \varvec{r^{(1)}}_j \cdot \varvec{r^{(2)}}_j \end{aligned}$$
(11.22)

on x-space, \(X = {\mathbb {R}}^{\nu N}\). This is natural because the free Hamiltonian

$$\begin{aligned} H_0 = -\sum _{j=1}^{N} (2m_j)^{-1}\Delta _{\varvec{r}_j} \end{aligned}$$
(11.23)

is precisely one half the Laplace–Beltrami operator for the Riemann metric associated to (11.22).

We let \(X^*\) be the dual to X, which we think of as momentum space. If \(\varvec{p} \in X^*\) and \(\varvec{x} \in X\), they are paired as

$$\begin{aligned} \langle \varvec{p},\varvec{x} \rangle = \sum _{j=1}^{N} \varvec{p}_j \cdot \varvec{x}_j \end{aligned}$$
(11.24)

as occurs in the Fourier transform. This induces an inner product on \(X^*\)

$$\begin{aligned} \langle p^{(1)},p^{(2)} \rangle _{X^*} = \sum _{j=1}^{N} (m_j)^{-1} \varvec{p^{(1)}}_j \cdot \varvec{p^{(2)}}_j \end{aligned}$$
(11.25)

consistent with (11.23).

A coordinate change is associated to a linear basis, \(e_1,\ldots ,e_N\) of \({\mathbb {R}}^{N}\) via

$$\begin{aligned} \varvec{\rho }_j(\varvec{x}_1,\ldots ,\varvec{x}_N)= \sum _{r=1}^{N} e_{jr}\varvec{x}_r \end{aligned}$$
(11.26)

(the \(e_{jr} \in {\mathbb {R}}\) and \(\varvec{x}_r \in {\mathbb {R}}^\nu \).)

To be a trifle pedantic, we note that X and \(X^*\) depend on N and \(\nu \). We’ll use Y for the case \(\nu =1\) so that \(X = Y \otimes {\mathbb {R}}^\nu \) and the X inner product is the tensor product of the Y inner product and the Euclidean inner product on \({\mathbb {R}}^\nu \) which we denoted with \(\cdot \) in (11.22) and (11.24). Since the e’s act on Y, we think of them as lying in \(Y^*\) (acting isotropically on the \({\mathbb {R}}^\nu \) piece). The dual basis \(f_j\) is defined by

$$\begin{aligned} \langle f_j,e_\ell \rangle = \delta _{j\ell }{,} \quad \text { i.e. } \sum _{r=1}^{N} f_{jr} e_{\ell r} = \delta _{j \ell } \end{aligned}$$
(11.27)

If we think of E and F as the \(N\times N\) matrices with \(F_{jr}=(f_j)_r, \, E_{jr} = (e_j)_r\), then (11.27) says that \(FE^T={\varvec{1}}\). Since \({\varvec{1}}^T={\varvec{1}}\) and for finite matrices \(AB={\varvec{1}}\Rightarrow BA={\varvec{1}}\), we conclude that \(EF^T=E^TF=F^TE={\varvec{1}}\), i.e.

$$\begin{aligned} \sum _j f_{rj}e_{sj} = \sum _j f_{jr}e_{js} = \sum _j f_{sj}e_{rj} = \sum _j f_{js}e_{jr} = \delta _{rs} \end{aligned}$$
(11.28)

First this implies that if

$$\begin{aligned} \varvec{k}_j(\varvec{p}_1,\ldots ,\varvec{p}_N) = \sum _{q=1}^{N} f_{jq}\varvec{p}_q \end{aligned}$$
(11.29)

then by (11.28)

$$\begin{aligned} \sum _{j=1}^{N} \varvec{k}_j\cdot \varvec{\rho }_j&= \sum _{j=1}^{N}\sum _{q=1}^{N}\sum _{r=1}^{N} f_{jq}e_{jr}\varvec{p}_q\cdot \varvec{x}_r \nonumber \\&= \sum _{q=1}^{N} \varvec{p}_q\cdot \varvec{x}_q \end{aligned}$$
(11.30)

so the k’s are the Fourier duals to the \(\rho \)’s and (11.29) describes the transformation of momenta.

Moreover, we claim that \(\langle e_j,e_k \rangle _{Y^*}\) and \(\langle f_j,f_k \rangle _Y\) are inverse matrices to each other, i.e.

$$\begin{aligned} \langle e,e \rangle _{Y^*}\langle f,f \rangle _Y = {\varvec{1}}\end{aligned}$$
(11.31)

If \(e_j^{(0)} = \delta _j\), then \(f_j^{(0)} = \delta _j\) and \(\langle e_j^{(0)},e_k^{(0)} \rangle _{Y^*} = m_j^{-1} \delta _{jk}\) is indeed the inverse to \(\langle f_j^{(0)},f_k^{(0)} \rangle _Y = m_j\delta _{jk}\). Since \(e_r = \sum _{q=1}^{N}E_{rq}e_q^{(0)}\) and \(f_j = \sum _{k=1}^{N}F_{jk}f_k^{(0)}\), we see that \(\langle e,e \rangle _{Y^*}\langle f,f \rangle _Y = E\langle e^{(0)},e^{(0)} \rangle _{Y^*}E^TF\langle f^{(0)},f^{(0)} \rangle _Y F^T = {\varvec{1}}\) by (11.28) and (11.31) for the \(e^{(0)},f^{(0)}\) special case just proven.

Finally by (11.26) and (11.27), we see that

$$\begin{aligned} \sum _{j=1}^{N} m_j \varvec{r}_j^2&= \sum _{r,s=1}^{N} \langle f_r,f_s \rangle _Y \varvec{\rho }_r \cdot \varvec{\rho }_s \end{aligned}$$
(11.32)
$$\begin{aligned} \sum _{j=1}^{N} m_j^{-1} \varvec{p}_j^2&= \sum _{r,s=1}^{N} \langle e_r,e_s \rangle _{Y^*} \varvec{k}_r \cdot \varvec{k}_s \end{aligned}$$
(11.33)

Example 11.3

(Removing the center of mass) First consider \(N=2\). Since we have \(V(\varvec{r}_1-\varvec{r}_2)\), we want \(\varvec{r}_1-\varvec{r}_2\) to be one coordinate, i.e. \(e_1=(1,-1)\). The natural second coordinate should be orthogonal in \(Y^*\), i.e. \(\tfrac{1}{m_1}e_{21}-\tfrac{1}{m_2}e_{22} = 0\) so \((m_1,m_2)\) will work but it is more usual to take \(e_2 = \tfrac{1}{M}(m_1,m_2),\, M=m_1+m_2\) the total mass. That is, the second coordinate is \((m_1\varvec{r}_1+m_2\varvec{r}_2)/M\), the center of mass. One computes

$$\begin{aligned} \langle e_1,e_1 \rangle _{Y^*}= & {} \frac{1}{m_1}+\frac{1}{m_2}\equiv \frac{1}{\mu } \qquad \langle e_1,e_2 \rangle _{Y^*}= 0\nonumber \\ \langle e_2,e_2 \rangle _{Y^*}= & {} \frac{1}{M^2}\left( \frac{m_1^2}{m_1}+\frac{m_2^2}{m_2}\right) =\frac{1}{M} \end{aligned}$$
(11.34)

We compute

$$\begin{aligned} f_1=\left( \frac{m_2}{M},-\frac{m_1}{M}\right) , \qquad f_2=(1,1) \end{aligned}$$
(11.35)

By either direct calculation or (11.31)

$$\begin{aligned} \langle f_1,f_1 \rangle _{Y}= & {} \frac{m_1m_2^2+m_1^2m_2}{M^2} = \frac{m_1m_2}{M} = \mu \qquad \langle f_1,f_2 \rangle _{Y}= 0\nonumber \\ \langle f_2,f_2 \rangle _Y= & {} m_1+m_2=M \end{aligned}$$
(11.36)

Thus

$$\begin{aligned} \varvec{r}_{12} = \varvec{r}_1-\varvec{r}_2 \qquad&\varvec{R} = \frac{1}{M}(m_1\varvec{r}_1+m_2\varvec{r}_2) \end{aligned}$$
(11.37)
$$\begin{aligned} \varvec{k}_{12} = \frac{m_2\varvec{p}_1-m_1\varvec{p}_2}{M} \qquad&\varvec{K} = \varvec{p}_1+\varvec{p}_2 \end{aligned}$$
(11.38)

and we see that

$$\begin{aligned} m_1\varvec{r}_1^2+m_2\varvec{r}_2^2 = \mu \varvec{r}_{12}^2+M\varvec{R}^2; \qquad H_0 = -\frac{1}{2M}\Delta _{\varvec{R}}-\frac{1}{2\mu }\Delta _{\varvec{r}_{12}} \end{aligned}$$
(11.39)
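The two-body computations of this example can be reproduced in exact arithmetic. The sketch below builds the matrix E from \(e_1=(1,-1)\), \(e_2=M^{-1}(m_1,m_2)\), obtains the dual basis from \(FE^T={\varvec{1}}\), and forms the two Gram matrices. The masses \(m_1=1\), \(m_2=3\) are toy values of ours and the function name is hypothetical.

```python
from fractions import Fraction


def two_body_kinematics(m1, m2):
    """Return (F, gram_e, gram_f) for e_1 = (1, -1), e_2 = (m1, m2)/M."""
    masses = [Fraction(m1), Fraction(m2)]
    M = masses[0] + masses[1]
    E = [[Fraction(1), Fraction(-1)], [masses[0] / M, masses[1] / M]]
    # Dual basis: F E^T = 1, so F = (E^T)^{-1}; use the explicit 2x2 inverse.
    det = E[0][0] * E[1][1] - E[0][1] * E[1][0]
    F = [[E[1][1] / det, -E[1][0] / det],
         [-E[0][1] / det, E[0][0] / det]]
    # <e_r, e_s>_{Y*} weights by 1/m_q; <f_r, f_s>_Y weights by m_q.
    gram_e = [[sum(a * b / m for a, b, m in zip(E[r], E[s], masses)) for s in range(2)]
              for r in range(2)]
    gram_f = [[sum(a * b * m for a, b, m in zip(F[r], F[s], masses)) for s in range(2)]
              for r in range(2)]
    return F, gram_e, gram_f


F, gram_e, gram_f = two_body_kinematics(1, 3)   # toy masses m1 = 1, m2 = 3
product = [[sum(gram_e[r][t] * gram_f[t][s] for t in range(2)) for s in range(2)]
           for r in range(2)]
print(F[0][0], F[0][1])            # 3/4 -1/4, the coefficients of k_12 in (11.38)
print(gram_f[0][0], gram_f[1][1])  # 3/4 4, i.e. mu and M as in (11.36)
```

The matrix `product` is the identity, which is (11.31) in this special case.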

For N bodies, motivated by the above, we want to take \(f_N=(1,\ldots ,1)\) and \(f_1,\ldots ,f_{N-1}\) all orthogonal to it. Then \(\langle f,f \rangle _Y\) will be the direct sum of an \((N-1)\times (N-1)\) matrix and \(\langle f_N,f_N \rangle _Y=M\). Thus \(\langle e,e \rangle _{Y^*}\) will be the direct sum of an \((N-1)\times (N-1)\) matrix and \(\langle e_N,e_N \rangle _{Y^*}=1/M\). Moreover, we claim that

$$\begin{aligned} \langle e_N,f \rangle = \langle f_N,f \rangle /\langle f_N,f_N \rangle \end{aligned}$$
(11.40)

since this holds for each \(f_j\). Putting \(f=\delta _j\) in, we conclude that \(e_N=M^{-1}(m_1,\ldots ,m_N)\). We summarize in this Proposition.

Proposition 11.4

In any coordinate system, \(\varvec{\rho }_1,\ldots ,\varvec{\rho }_N\) where \(\varvec{\rho }_j,\,j=1,\ldots ,N-1\) is a linear combination of \(\varvec{r}_k-\varvec{r}_\ell \) and

$$\begin{aligned} \varvec{\rho }_N=\frac{1}{M}\sum _{j=1}^{N} m_j\varvec{r}_j \end{aligned}$$
(11.41)

we have that

$$\begin{aligned} H_0 = -\sum _{j=1}^{N} \frac{1}{2m_j}\Delta _{\varvec{r}_j} = h_0\otimes {\varvec{1}}+ {\varvec{1}}\otimes T_0 \end{aligned}$$
(11.42)

where \(h_0=-(2M)^{-1}\Delta _{\varvec{\rho }_N}\) and \(T_0\) is a quadratic form in \(-i\varvec{\nabla }_{\varvec{\rho }_j}, \, j=1,\ldots ,N-1\).

Example 11.5

(Atomic coordinates) This is named for the natural coordinates when there is a heavy nucleus, \(\varvec{r}_N\) and \(N-1\) electrons. We take (with \(m_j=m\) for \(j=1,\ldots ,N-1\))

$$\begin{aligned} \varvec{\rho }_j = \varvec{r}_j-\varvec{r}_N,\, j=1,\ldots ,N-1; \qquad \varvec{\rho }_N=\frac{1}{M}\sum _{j=1}^{N} m_j\varvec{r}_j \end{aligned}$$
(11.43)

Thus, by (11.26)

$$\begin{aligned} e_j=\delta _j-\delta _N,\, j=1,\ldots ,N-1; \qquad e_N=\frac{1}{M}(m_1,\ldots ,m_N) \end{aligned}$$
(11.44)

Since \(\langle a,a \rangle _{Y^*}=\sum _{j=1}^{N} m_j^{-1}a_j^2\), we see that

$$\begin{aligned} \langle e_N,e_j \rangle _{Y^*}= & {} M^{-1} \delta _{Nj} \end{aligned}$$
(11.45)
$$\begin{aligned} \langle e_j,e_j \rangle _{Y^*}= & {} \frac{1}{m}+\frac{1}{m_N} \equiv \frac{1}{\mu } \qquad j=1,\ldots ,N-1 \end{aligned}$$
(11.46)
$$\begin{aligned} \langle e_j,e_k \rangle _{Y^*}= & {} \frac{1}{m_N} \qquad 1\le j,k\le N-1;\, j\ne k \end{aligned}$$
(11.47)

Thus, by (11.33)

$$\begin{aligned} T_0&= -\sum _{j,k=1}^{N-1} \frac{1}{2} \langle e_j,e_k \rangle _{Y^*} \varvec{\nabla }_j\cdot \varvec{\nabla }_k \nonumber \\&= -\sum _{j=1}^{N-1} \frac{1}{2\mu }\Delta _j - \frac{1}{m_N} \sum _{j<k} \varvec{\nabla }_j\cdot \varvec{\nabla }_k \end{aligned}$$
(11.48)

(there is no 2 in front of \(m_N\) because we have changed from a sum over \(j \ne k\) to a sum over \(j < k\)). Multiplying \(T_0\) by \(2\mu \), the coefficient of \(-2\sum _{j<k}\varvec{\nabla }_j\cdot \varvec{\nabla }_k\) becomes

$$\begin{aligned} \frac{\mu }{m_N}=\frac{m\,m_N}{m+m_N}\frac{1}{m_N}=\frac{m}{m_N+m} \end{aligned}$$

which is the \(\alpha \) of (11.15)/(11.16) (taking into account a changed meaning for the symbol M there and here!).
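The Gram matrix entries (11.45)–(11.47) can be verified in exact arithmetic for a small example. The sketch below uses two electrons of mass 1 and a nucleus of mass 4; these are toy values of ours, chosen so that the fractions are easy to read, and `atomic_gram` is a hypothetical helper name.

```python
from fractions import Fraction


def atomic_gram(n_electrons, m, m_nuc):
    """Gram matrix <e_j, e_k>_{Y*} for atomic coordinates (11.43); nucleus listed last."""
    m, m_nuc = Fraction(m), Fraction(m_nuc)
    masses = [m] * n_electrons + [m_nuc]
    n = len(masses)
    total = sum(masses)
    rows = []
    for j in range(n_electrons):
        e = [Fraction(0)] * n
        e[j], e[-1] = Fraction(1), Fraction(-1)   # rho_j = r_j - r_N
        rows.append(e)
    rows.append([mq / total for mq in masses])     # rho_N = total center of mass
    # <a, b>_{Y*} = sum_q a_q b_q / m_q
    return [[sum(a * b / mq for a, b, mq in zip(ej, ek, masses)) for ek in rows]
            for ej in rows]


G = atomic_gram(2, 1, 4)        # toy masses: m = 1, m_N = 4
mu = 1 / G[0][0]                # reduced mass from (11.46)
alpha = mu * G[0][1]            # mu / m_N, using the off-diagonal entry (11.47)
print(G[0][1], G[2][2], alpha)  # 1/4 1/6 1/5
```

Here `alpha` equals \(m/(m_N+m) = 1/5\), the Hughes–Eckart coefficient, and the vanishing entries `G[2][0]`, `G[2][1]` express the decoupling of the center of mass in (11.45).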

Example 11.6

(Jacobi coordinates) These coordinate changes go back to classical mechanics. Jacobi noted that one could avoid cross terms in the kinetic energy by changing first from \(r_1\) and \(r_2\) to \(r_{12}\) and the center of mass, \(R_{12}\), of the first two particles. Then one goes from \(R_{12}\) and \(r_3\) to \(r_3-R_{12}\) and the center of mass of the first three particles. After \(N-1\) steps, one has R, the total center of mass as one of the coordinates, and \(N-1\) “internal” coordinates.
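The absence of cross terms is equivalent to the Gram matrix \(\langle e_j,e_k \rangle _{Y^*}\) being diagonal, and this is easy to confirm in exact arithmetic. In the sketch below the j-th Jacobi coordinate is taken to be the position of particle \(j+1\) minus the center of mass of particles \(1,\ldots ,j\) (the overall sign of each coordinate is immaterial); the masses are toy values and the helper names are ours.

```python
from fractions import Fraction


def jacobi_rows(masses):
    """Rows e_1, ..., e_N of the Jacobi coordinate change for the given masses."""
    masses = [Fraction(m) for m in masses]
    n = len(masses)
    rows = []
    for j in range(1, n):
        partial = sum(masses[:j])            # mass of the first j particles
        e = [Fraction(0)] * n
        for q in range(j):
            e[q] = -masses[q] / partial      # minus the center of mass of particles 1..j
        e[j] = Fraction(1)                   # plus the position of particle j+1
        rows.append(e)
    total = sum(masses)
    rows.append([m / total for m in masses])  # total center of mass
    return rows


def gram_dual(rows, masses):
    """Matrix of <e_r, e_s>_{Y*} = sum_q e_rq e_sq / m_q."""
    masses = [Fraction(m) for m in masses]
    return [[sum(a * b / m for a, b, m in zip(r, s, masses)) for s in rows] for r in rows]


masses = [1, 2, 3, 4]                          # toy masses
G = gram_dual(jacobi_rows(masses), masses)
off_diagonal = [G[r][s] for r in range(4) for s in range(4) if r != s]
print(all(x == 0 for x in off_diagonal))       # True
print([str(G[j][j]) for j in range(4)])        # ['3/2', '2/3', '5/12', '1/10']
```

Each diagonal entry is an inverse reduced mass: for example, 3/2 = 1/1 + 1/2 for the first pair, and 1/10 is the inverse total mass.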

Example 11.7

(Clustered Jacobi coordinates) Given \(\{1,\ldots ,N\}\), a cluster decomposition or clustering, \({\mathcal {C}}= \{C_\ell \}_{\ell =1}^k\), is a partition, i.e. a family of disjoint subsets whose union is \(\{1,\ldots ,N\}\). We set \(\#(C_\ell )\) to be the number of particles in \(C_\ell \). A coordinate, \(\varvec{\rho }\), is said to be internal to \(C_\ell \) if it is a function only of \(\{\varvec{r}_m\}_{m \in C_\ell }\) and is invariant under \(\varvec{r}_m \rightarrow \varvec{r}_m+\varvec{a}\), equivalently, it is a linear combination of \(\{\varvec{r}_m-\varvec{r}_q\}_{m,q \in C_\ell }\). A clustered Jacobi coordinate system is a set of \(\#(C_\ell )-1\) independent internal coordinates for each cluster together with the cluster centers of mass \(\varvec{R}_\ell = (\sum _{q \in C_\ell } m_q\varvec{r}_q)/(\sum _{q \in C_\ell } m_q)\). If we write \({\mathcal H}(C_\ell )\) for \(L^2\) of the internal coordinates of \(C_\ell \) and \({\mathcal H}^{({\mathcal {C}})}\) for \(L^2\) of the centers of mass \(\varvec{R}_1,\ldots ,\varvec{R}_k\), then

$$\begin{aligned} {\mathcal H}= & {} {\mathcal H}^{({\mathcal {C}})} \otimes \bigotimes _{\ell =1}^k {\mathcal H}(C_\ell ) \end{aligned}$$
(11.49)
$$\begin{aligned} H_0= & {} \widetilde{T}^{({\mathcal {C}})}\otimes {\varvec{1}}\cdots \otimes {\varvec{1}}+ \sum _{\ell =1}^{k} {\varvec{1}}\otimes \cdots \otimes T(C_\ell ) \otimes \cdots \otimes {\varvec{1}}\end{aligned}$$
(11.50)

where \(\widetilde{T}^{({\mathcal {C}})} = -\sum _{\ell =1}^{k} (2M(C_\ell ))^{-1}\Delta _{\varvec{R}_\ell }\) and \(T(C_\ell )\) is a quadratic form in the derivatives of the internal coordinates.

As noted, the big limitation in Kato’s work on Helium bound states concerns his estimate of \(\Sigma \), the bottom of the essential spectrum of H. We turn to understanding that. In the two body case, \(H=-\Delta +V\), one expects that \(\sigma _{ess}(H)=[0,\infty )\). This requires that V go to zero at spatial infinity in some sense. If one is looking at V’s for which \(D(H)=D(-\Delta )\), the natural condition is that \(V(-\Delta +1)^{-1}\) is a compact operator (see [616, Sect. 3.14]). To be explicit, we introduce \(L^p({\mathbb {R}}^\nu )+L^\infty ({\mathbb {R}}^\nu )_\epsilon \) to be the set of V so that for any \(\epsilon >0\), one can decompose \(V=V_{1,\epsilon }+V_{2,\epsilon }\) with \(V_{1,\epsilon } \in L^p({\mathbb {R}}^\nu )\) and \(||V_{2,\epsilon }||_\infty \le \epsilon \). If p is \(\nu \)-canonical, one can prove that if \(V \in L^p({\mathbb {R}}^\nu )+L^\infty ({\mathbb {R}}^\nu )_\epsilon \), then \(V(-\Delta +1)^{-1}\) is compact and \(\sigma _{ess}(H)=[0,\infty )\). If one wishes, there are Stummel-type conditions to replace this but we’ll make such \(L^p\) assumptions below for simplicity of exposition.

We also want to remove the total center of mass motion if all masses are finite. That is we let \(\varvec{R}=\left( \sum _{j=1}^{N} m_j\varvec{r}_j\right) /\left( \sum _{j=1}^{N}m_j\right) \) and pick some set of internal coordinates so that \({\mathcal H}^{full} = {\mathcal H}_{CM}\otimes {\mathcal H},\, {\mathcal H}^{full} = L^2({\mathbb {R}}^{\nu N}), {\mathcal H}_{CM}=\) functions of \(\varvec{R}\), \({\mathcal H}=\) functions of the internal coordinates. If \(H^{full}=H_0+\sum _{j<k} V_{jk}\), then under this tensor product decomposition

$$\begin{aligned} H^{full}=H_{0,CM}\otimes {\varvec{1}}+{\varvec{1}}\otimes H \end{aligned}$$
(11.51)

where \(H_{0,CM}=-(2\sum _{j=1}^{N}m_j)^{-1}\Delta _{\varvec{R}}\). We’ll consider H below.

In (11.50), the operator \(\widetilde{T}^{({\mathcal {C}})}\) has a decomposition like (11.51) where \({\mathcal H}\) is replaced by \({\mathcal H}^{({\mathcal {C}})}\), the functions of the differences of the centers of mass of the \(C_j\). We write

$$\begin{aligned} \widetilde{T}^{({\mathcal {C}})} = H_{0,CM}\otimes {\varvec{1}}+{\varvec{1}}\otimes T^{({\mathcal {C}})} \end{aligned}$$
(11.52)

Given a cluster decomposition, \({\mathcal {C}}=\{C_\ell \}_{\ell =1}^k\), we write \((jq) \subset {\mathcal {C}}\) if j and q are in the same cluster of \({\mathcal {C}}\) and \((jq) \not \subset {\mathcal {C}}\) if they are in different clusters. We define

$$\begin{aligned} V(C_\ell )&= \sum _{\begin{array}{c} j,q\in C_\ell \\ j<q \end{array}} V_{jq} \end{aligned}$$
(11.53)
$$\begin{aligned} V({\mathcal {C}})&= \sum _{\ell =1}^{k} V(C_\ell ) = \sum _{\begin{array}{c} (jq) \subset {\mathcal {C}}\\ j<q \end{array}} V_{jq} \end{aligned}$$
(11.54)
$$\begin{aligned} I({\mathcal {C}})&= \sum _{j<q} V_{jq} - V({\mathcal {C}}) = \sum _{\begin{array}{c} (jq) \not \subset {\mathcal {C}}\\ j<q \end{array}} V_{jq} \end{aligned}$$
(11.55)

\(V({\mathcal {C}})\) is the intracluster interaction and \(I({\mathcal {C}})\) the intercluster interaction. We define on \({\mathcal H}(C_\ell )\)

$$\begin{aligned} h(C_\ell )&= T(C_\ell ) + V(C_\ell ) \end{aligned}$$
(11.56)
$$\begin{aligned} H({\mathcal {C}})&= T^{({\mathcal {C}})}\otimes {\varvec{1}}\cdots \otimes {\varvec{1}}+\sum _{\ell =1}^{k} {\varvec{1}}\otimes \cdots \otimes h(C_\ell )\otimes \cdots \otimes {\varvec{1}} \nonumber \\&= H-I({\mathcal {C}}) \end{aligned}$$
(11.57)
$$\begin{aligned} \Sigma ({\mathcal {C}})&= \sum _{\ell =1}^{k} \inf \sigma (h(C_\ell )) \end{aligned}$$
(11.58)
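The bookkeeping in (11.54) and (11.55) amounts to sorting the pairs \(j<q\) according to whether both indices lie in the same cluster. A minimal sketch, with a function name of our own choosing:

```python
from itertools import combinations


def split_pairs(clusters):
    """Split all pairs j < q into intracluster and intercluster pairs.

    `clusters` is a list of disjoint sets of particle labels.
    """
    label = {p: idx for idx, c in enumerate(clusters) for p in c}
    particles = sorted(label)
    intra, inter = [], []
    for j, q in combinations(particles, 2):
        (intra if label[j] == label[q] else inter).append((j, q))
    return intra, inter


intra, inter = split_pairs([{1, 2}, {3, 4}])
print(intra)  # [(1, 2), (3, 4)]
print(inter)  # [(1, 3), (1, 4), (2, 3), (2, 4)]
```

The pairs in `intra` index the terms of \(V({\mathcal {C}})\) and those in `inter` index the terms of \(I({\mathcal {C}})\); together they exhaust all \(N(N-1)/2\) pairs.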

We let \({\mathcal {C}}_{min}\) be the one cluster decomposition of \(\{1,\ldots ,N\}\) so \(H({\mathcal {C}}_{min}) = H\). We note that

$$\begin{aligned} {\mathcal {C}}\ne {\mathcal {C}}_{min} \Rightarrow \sigma (T^{({\mathcal {C}})}) = [0,\infty ) \end{aligned}$$
(11.59)

By (11.57), we have that \(\sigma (H({\mathcal {C}}))=\sigma (T^{({\mathcal {C}})})+\sigma (h(C_1))+\cdots +\sigma (h(C_k))\). By (11.59)

$$\begin{aligned} {\mathcal {C}}\ne {\mathcal {C}}_{min} \Rightarrow \sigma (H({\mathcal {C}})) = [\Sigma ({\mathcal {C}}),\infty ) \end{aligned}$$
(11.60)

When we discuss N-body spectral and scattering theory briefly in Sects. 12–14, we’ll be interested in thresholds. A threshold, t, is a decomposition \({\mathcal {C}}=\{C_\ell \}_{\ell =1}^k \ne {\mathcal {C}}_{min}\) and an eigenvalue, \(E_\ell \) of \(h(C_\ell )\) for each \(\ell =1,\ldots ,k\). The threshold energy is \(E(t)=\sum _{\ell =1}^{k} E_\ell \). Of course, \(E(t) \ge \Sigma ({\mathcal {C}})\).

Fix \({\mathcal {C}}\ne {\mathcal {C}}_{min}\). Pick distinct vectors, \(X_1,\ldots ,X_k \in {\mathbb {R}}^\nu \). For \(\lambda \in {\mathbb {R}}\), let \(U(\lambda )\) be the unitary implementing \(x_j \mapsto x_j+\lambda X_q\) if \(j \in C_q\). It is easy to see that \(U(\lambda )H({\mathcal {C}})U(\lambda )^{-1} = H({\mathcal {C}})\) and if each \(V_{jq} \in L^p({\mathbb {R}}^\nu )+L^\infty ({\mathbb {R}}^\nu )_\epsilon \), then for all \(\varphi \in D(-\Delta )\) one has that

$$\begin{aligned} \lim _{\lambda \rightarrow \infty }[U(\lambda )H U(\lambda )^{-1} - H({\mathcal {C}})]\varphi =0 \end{aligned}$$
(11.61)

which implies [616, Problem 3.14.5] that \(\sigma (H({\mathcal {C}})) = {[}\Sigma ({\mathcal {C}}),\infty ) \subset \sigma (H)\). In particular, if

$$\begin{aligned} \Sigma = \inf _{{\mathcal {C}}\ne {\mathcal {C}}_{min}} \Sigma ({\mathcal {C}}) \end{aligned}$$
(11.62)

then

$$\begin{aligned} {[}\Sigma ,\infty ) \subset \sigma (H) \end{aligned}$$
(11.63)
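The infimum in (11.62) runs over all partitions of \(\{1,\ldots ,N\}\) into at least two clusters. The sketch below enumerates these partitions and minimizes a sum of cluster energies. The cluster energy function is a purely hypothetical toy, \(\inf \sigma (h(C)) = -(\#(C)-1)^2\), chosen only so that the output is easy to verify by hand; real cluster energies would have to come from an actual spectral computation.

```python
def set_partitions(items):
    """Yield every partition of the list `items` into nonempty clusters."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # put `first` into one of the existing clusters ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new cluster of its own
        yield [[first]] + partition


def bottom_threshold(items, cluster_energy):
    """inf over decompositions with at least two clusters of the summed cluster energies."""
    return min(sum(cluster_energy(c) for c in p)
               for p in set_partitions(items) if len(p) >= 2)


def toy_energy(cluster):
    """Hypothetical cluster ground-state energy, for illustration only."""
    return -(len(cluster) - 1) ** 2


print(bottom_threshold([1, 2, 3], toy_energy))  # -1
```

For three particles there are four admissible decompositions: three of the form pair plus single particle (toy energy -1) and the fully separated one (toy energy 0).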

The celebrated HVZ theorem says that

Theorem 11.8

(HVZ Theorem) For N-body Hamiltonians with \(V_{jq} \in L^p({\mathbb {R}}^\nu )+L^\infty ({\mathbb {R}}^\nu )_\epsilon \) (with p \(\nu \)-canonical) one has that

$$\begin{aligned} \sigma _{ess}(H) = [\Sigma ,\infty ) \end{aligned}$$
(11.64)

Remarks

  1. 1.

    There is a variant where there are infinite mass particles, i.e. some \(V_j\) terms, and the center of mass isn’t removed. Decompositions are now of \(\{0,1,\ldots ,N\}\). One says that \((j) \subset {\mathcal {C}}\) if 0 and j are in the same cluster.

  2. 2.

    The result is named after Hunziker [262], van Winter [662] and Zhislin [712].

  3. 3.

    There are essentially three generations of proofs of this theorem. The initial proofs of Hunziker and van Winter relied on integral equations (what are now called the Weinberg–van Winter equations). van Winter restricted her work to \(L^2({\mathbb {R}}^3)\) potentials since she only considered Hilbert–Schmidt operators while Hunziker’s independent work handled the general case above. This work was independent of the earlier work of Zhislin who only considered and proved results for atomic Hamiltonians. His methods were geometric.

  4. 4.

    The second wave concerns geometric proofs by Enss [138], Simon [583], Agmon [5], Gårding [182] and Sigal [554]. In one variant, the key is a geometric fact that there exists a partition of unity \(\{J_{\mathcal {C}}\}_{{\mathcal {C}}\ne {\mathcal {C}}_{min}}\) indexed by non-minimal partitions so that \(\sum _{{\mathcal {C}}} J_{\mathcal {C}}= {\varvec{1}}\) and so that on \({\text {supp}}J_{\mathcal {C}}\cap \{x\,|\, |x| > 1\}\), one has that, for some \(Q>0\), \(|x_j-x_k| \ge Q|x|\) if \((jk) \not \subset {\mathcal {C}}\). One proves that \([f(H)-f(H({\mathcal {C}}))]J_{\mathcal {C}}\) is a compact operator for continuous functions, f, of compact support. This, in turn, implies that when \({\text {supp}}f \subset (-\infty ,\Sigma )\), then f(H) is compact. For details, see [101, Sect. 3.3]. Agmon’s version [5] looks at limits as one translates in an arbitrary direction and is especially intuitive. In this regard, Agmon considered a class of potentials that generalize N-body systems. \(\{\pi _j\}\) is a family of non-trivial projections in \({\mathbb {R}}^{\nu N}\) and \(V=\sum V_j(\pi _j x)\) where \(V_j\) is a function on \({\mathbb {R}}^{\dim {\text {ran}}\pi _j}\). This setup has been used by many authors since.

  5. 5.

    The third generation works in cases where \(\sigma _{ess}(A)\) can have gaps. This approach appeared (more or less independently) in Chandler-Wilde–Lindner [86, 87], Georgescu–Iftimovici [186], Last–Simon [411, 412], Mǎntoiu [440] and Rabinovich [488]. Perhaps the cleanest result from [412] defines the notion of right limits and proves that \(\sigma _{ess}(H)\) is the union over all right limits of \(\sigma (H_r)\). See also [610, Sect. 7.2].

With the HVZ theorem in hand, one can easily carry Kato’s argument to its logical conclusion

Theorem 11.9

(Simon [569]) Let H be an N-body Hamiltonian with center of mass removed. Suppose that \(\Sigma \) is a two-body threshold, i.e. there is a cluster decomposition, \({\mathcal {C}}= \{C_1,C_2\}\) and vectors, \(\varphi _j \in {\mathcal H}(C_j),\, j=1,2\) so that \(h(C_j)\varphi _j = E_j\varphi _j\), \(||\varphi _j||=1\) and \(E_1+E_2 = \Sigma \). Define W on \({\mathbb {R}}^\nu \) as follows: \(y \in {\mathbb {R}}^\nu \) is the difference of the centers of mass of \(C_1\) and \(C_2\) and let \(x_k(y,\zeta _1,\zeta _2)\) be the position of particle k in terms of y and the internal coordinates \(\zeta _j\) of \(C_j\). Then

$$\begin{aligned} W(y) = \sum _{\begin{array}{c} q \in C_1 \\ k \in C_2 \end{array}} \int V_{qk}(x_q(y,\zeta _1,\zeta _2)-x_k(y,\zeta _1,\zeta _2)) |\varphi _1(\zeta _1)|^2 |\varphi _2(\zeta _2)|^2 d\zeta _1 d\zeta _2 \end{aligned}$$
(11.65)

Let \(\mu \) be the reduced mass of the two clusters and suppose that

$$\begin{aligned} -(2\mu )^{-1}\Delta _y + W(y) \end{aligned}$$
(11.66)

has an infinite number of eigenvalues below 0 as an operator on \(L^2({\mathbb {R}}^\nu )\). Then H has an infinite number of eigenvalues below \(\Sigma \).

Remarks

  1.

    Here, with \(M(C_j) = \sum _{k \in C_j} m_k\), we have that \(\mu ^{-1}=M(C_1)^{-1}+M(C_2)^{-1}\).

  2.

    One might think that if \(j \in C_1\), then \(x_j(y,\zeta _1,\zeta _2)\) is independent of \(\zeta _2\) but that’s wrong for the total center of mass, \(\varvec{R}\), enters in \(x_j\) and that causes a \(\zeta _2\) dependence.

  3.

    The proof is essentially unchanged from the ideas in Kato [315]. If \(\psi (y,\zeta _1,\zeta _2) = \varphi _1(\zeta _1) \varphi _2(\zeta _2)\eta (y)\), then \(\langle \psi ,H\psi \rangle = \Sigma +{\langle \eta ,(-(2\mu )^{-1}\Delta +W)\eta \rangle }\).

  4.

    This result is from Simon [569] who revisited Kato’s paper after the discovery of the HVZ theorem.
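The computation behind Remark 3 can be sketched as follows. This is a sketch under two assumptions that are standard but not spelled out above: that \(\eta \) is normalized, \(||\eta ||=1\), and that, with the center of mass removed, H splits into the two internal cluster Hamiltonians, the kinetic energy of relative motion and the intercluster interaction:

$$\begin{aligned} H&= H(C_1)+H(C_2)-(2\mu )^{-1}\Delta _y + \sum _{\begin{array}{c} q \in C_1 \\ k \in C_2 \end{array}} V_{qk}(x_q-x_k) \\ \langle \psi ,H\psi \rangle&= (E_1+E_2)||\eta ||^2 + \langle \eta ,-(2\mu )^{-1}\Delta _y\eta \rangle + \langle \eta ,W\eta \rangle = \Sigma + \langle \eta ,(-(2\mu )^{-1}\Delta _y+W)\eta \rangle \end{aligned}$$

The first term uses \(H(C_j)\varphi _j=E_j\varphi _j\) and \(||\varphi _j||=1\), and integrating the intercluster interaction against \(|\varphi _1|^2|\varphi _2|^2\) produces W as in (11.65). If (11.66) has infinitely many negative eigenvalues, then for each m there is an m-dimensional space of \(\eta \)’s on which the last form is negative, so by the min–max principle H has at least m eigenvalues below \(\Sigma \).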

Now fix \(Z,N > 0\). N is an integer but Z need not be. We define on \(L^2({\mathbb {R}}^{3N})\):

$$\begin{aligned} H(Z,N)= & {} \sum _{j=1}^{N} \left( -\Delta _j-\frac{Z}{|x_j|}\right) + \sum _{1 \le j<k \le N} \frac{1}{|x_j-x_k|} \end{aligned}$$
(11.67)
$$\begin{aligned} E(Z,N)= & {} \inf \sigma (H(Z,N)) \end{aligned}$$
(11.68)

One can accommodate Hughes–Eckart terms in much of the discussion but we won’t include them.

By the arguments before (11.60), \(\sigma (H(Z,N-1)) \subset \sigma (H(Z,N))\) so the HVZ theorem implies that

$$\begin{aligned} \Sigma (H(Z,N))=E(Z,N-1) \end{aligned}$$
(11.69)

so we are interested in

$$\begin{aligned} \delta (Z,N) = -E(Z,N)+E(Z,N-1) \end{aligned}$$
(11.70)

the ionization energy to remove electron N from a nucleus of charge Z. Put differently, \(\delta \ge 0\) and \(\delta > 0\) if and only if N electrons bind to a charge Z nucleus.

Corollary 11.10

(Zhislin [712]) If \(Z > N-1\), then \(H(Z,N)\) has infinitely many bound states below \(\Sigma \). In particular, \(\delta (Z,N) > 0\).

Remarks

  1.

    This is because, by induction, \(\Sigma \) is determined by a two cluster breakup into \(N-1\) particles (in the same cluster as 0) and one particle, so that \(W(y) = -[Z-(N-1)]|y|^{-1} + \text {o}(1/|y|)\), and such a potential has infinitely many bound states.

  2.

    This result was first proven by Zhislin using arguments somewhat more involved than Kato’s argument (and before Simon noted that Kato’s arguments work).
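To make the counting in Remark 1 concrete, here is a minimal Python sketch. It keeps only the leading Coulomb tail of W, i.e. it assumes the model operator \(-\Delta _y - q|y|^{-1}\) on \(L^2({\mathbb {R}}^3)\) with \(q=Z-(N-1)>0\) (infinite nuclear mass, in the units of (11.67)) and ignores the remainder that is o(1/|y|). For this model the eigenvalues are \(-q^2/4n^2\) with multiplicity \(n^2\). The function names are illustrative only.

```python
def coulomb_level(q, n):
    """n-th eigenvalue of the model operator -Laplacian - q/|y| on L^2(R^3).

    With kinetic term -Laplacian (no factor 1/2) the hydrogenic levels are
    E_n = -q^2 / (4 n^2), each with multiplicity n^2.
    """
    return -q * q / (4.0 * n * n)


def count_below(q, energy):
    """Number of eigenvalues, counting multiplicity, strictly below `energy` < 0."""
    if q <= 0:
        raise ValueError("the model needs an attractive tail, q = Z-(N-1) > 0")
    if energy >= 0:
        raise ValueError("energy must be negative (0 is the model's threshold)")
    total, n = 0, 1
    while coulomb_level(q, n) < energy:
        total += n * n  # the shell n contributes n^2 states
        n += 1
    return total


if __name__ == "__main__":
    q = 1.0  # e.g. Z = 2, N = 2
    for energy in (-1e-1, -1e-2, -1e-3, -1e-4):
        print(energy, count_below(q, energy))
```

The count grows without bound as the cutoff energy increases to 0, which is the statement that the model potential has infinitely many bound states accumulating at the threshold.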

This completes the summary of the direct extensions of Kato’s work. We will end this section with a brief discussion of results on bound states of \(H(Z,N)\) which are a direct descendant of Kato’s consideration. There is an enormous literature not only on this subject but also on bounds on the number of bound states when finite and on moments of the eigenvalues. We refer the reader to the forthcoming book of Frank et al. [160].

The other side of Corollary 11.10 is

Theorem 11.11

If \(Z \le N-1\), then \(H(Z,N)\) has only finitely many bound states.

Remarks

  1.

    This theorem is due to Zhislin [713]. There were earlier results of Uchiyama [659] (for \(N=2, Z < 1\)), and by Vugal’ter–Zhislin [667] and Yafaev [696, 697] (for \(Z=N-1\)).

  2.

    The intuition is that the left over Coulomb repulsion (if \(Z < N-1\)) or residual Coulomb attraction (if \(Z=N-1\)) is such that an effective \(-\Delta +W\) has only finitely many states. Of course, one needs techniques to conclude that when an effective two body problem has that property, the full N-body does—one of the most effective methods is due to Sigal [554]. I note in passing that there are three particle systems with short range interactions that surprisingly have an infinite number of bound states, \(\{E_j\}_{j=1}^\infty \) with asymptotic geometric sequence placement, i.e. \(E_{j+1}/E_j \rightarrow \alpha < 1\). At least two of the three two body clusters must have zero energy resonances (what this means is discussed in Sect. 16) so the bottom of the essential spectrum is 0. The discovery on a formal level is due to Efimov [135] after whom the effect is named. For mathematical proofs see Yafaev [695], Tamura [634, 635], Sobolev [621] and Ovchinnikov–Sigal [472]. Wang [671, 672] discussed this for N-body systems. For popular science treatments of experimental verification of the geometric progression (even for small j!) see Ouellette [471] and Wolchover [688].

  3.

    This theorem is stated for systems with no statistics. For \(Z < N-1\), the result extends without much trouble to Fermi statistics [713]. For \(Z=N-1\), one needs to assume that there is not an atomic ground state with a dipole moment (for there to be such a state, there would need to be a degeneracy of states with different parity) because \(-\Delta +\lambda \hat{e}\cdot \varvec{r}/(1+r)^3\) has an infinity of bound states when \(\lambda \) is large enough. In fact, in [583], it is claimed (quoting Lieb) that a molecule with two centers, \(Z_1=1/3, Z_2=2/3, N=2\) (so \(Z=N-1\)) and \(|\varvec{R_1} - \varvec{R_2}|\) large will have an infinity of bound states (although a proof has never been published to my knowledge). In any event, under an assumption about no atomic ground state with dipole moment, the theorem does extend to \(N=Z+1\) [667].

For most of the discussion below, we look at \(E(Z,N)\) with Fermi statistics. One might expect that for Z fixed, one has that \(\delta (Z,N) = 0\) for all sufficiently large N, i.e. there is an \(N_c(Z)\) so that \(\delta (Z,N)=0\) if \(N \ge N_c(Z)\) and so that \(\delta (Z,N_c(Z)-1) > 0\). Ruskai [531, 532] and Sigal [554, 556] proved that for every Z, there is such an \(N_c(Z)\) and Lieb [427] found a simple, elegant argument that \(N_c(Z) \le 2Z+1\) which, in particular, implies that \(H^{--}\) does not exist although \(H^{-}\) does.

In nature, there is no known example for \(\delta (Z,N) > 0\) if \(N \ge Z+2\), that is, there are once negatively charged ions in nature, but no twice negatively charged ions. So it might even be that \(N_c(Z)\) is always bounded by \(Z+1\). In any event, there is a conjecture [609] that \(N_c(Z) \le Z+k\) for some finite k. It is known (Lieb et al. [430]) that for fermion electrons one has that \(\lim _{Z \rightarrow \infty } N_c(Z)/Z =1\) but Benguria–Lieb [52] have proven that the \(\liminf \) is strictly bigger than 1 for bosonic electrons. There is considerable literature since these two basic papers, but since this is already removed from Kato’s work, we won’t try to summarize it.

2 Eigenvalues, II: lack of embedded eigenvalues

Consider on \({\mathbb {R}}^\nu \), the equation \((-\Delta +V)\varphi =\lambda \varphi \) with \(V(x) \rightarrow 0\) as \(|x| \rightarrow \infty \) and \(\lambda > 0\). Naively, one might expect that no solution, \(\varphi \), can be in \(L^2({\mathbb {R}}^\nu , d^\nu x)\). The intuition is clear: classically, if the particle is in the region \(\{x\,|\,|x| > R\}\) where R is picked so large that \(|x| > R \Rightarrow V(x) < \lambda /2\) and if the velocity is pointing outwards, the particle is not captured and so not bound. Due to tunnelling, in quantum theory, a particle will always reach this region so there shouldn’t be positive energy bound states. This intuition of no embedded eigenvalues is incomplete due to the fact that bumps can cause reflections even when the bumps are smaller than the energy, so an infinite number of small bumps which don’t decay too rapidly might be able to trap a particle. Indeed, in 1929, near the birth of modern quantum theory, von Neumann–Wigner [666] presented an example with an embedded eigenvalue of energy 1 (in fact they picked \(V(x) \rightarrow -1\) at infinity and \(\lambda =0\); we’ll shift energies by 1 and also pick their arbitrary constant A to be 1). They had the idea of guessing the wave function, \(\psi \), and setting \(V(x) = 1+\psi ^{-1}\Delta \psi (x)\). They picked \(\psi \) so that it had oscillations that cancelled the \(+1\) at infinity. Their choice as a function of \(r=|x|\) in three dimensions was

$$\begin{aligned} \psi (x) = \frac{\sin r}{r}[1+g(r)^2]^{-1}; \qquad g(r) = 2r-\sin (2r) \end{aligned}$$
(12.1)

and they claimed that (where \(\tilde{g}(r) = 2r + \sin (2r)\))

$$\begin{aligned} V(x) = -32 \cos ^4r\frac{1-3\tilde{g}(r)^2}{[1+\tilde{g}(r)^2]^2} \end{aligned}$$
(12.2)

With slow enough decay, one can have much more than a single embedded eigenvalue. It is known (see Simon [599] and Kotani–Ushiroya [387]) that if \(0<\beta <1/2\) and \(q_\omega (x)\) is a random potential in one dimension with uniformly spaced independent, identically distributed random bumps, then \(-\tfrac{d^2}{dx^2}+(1+x^2)^{-\beta /2} q_\omega (x)\) has only dense pure point spectrum, i.e. the essential spectrum is \([0,\infty )\) and there is a complete orthonormal set of \(L^2\) eigenvectors!

In 1959, Kato proved the first strong result on the non-existence of positive eigenvalues:

Theorem 12.1

(Kato [330], announced in [329]) Let V(x) be continuous on \({\mathbb {R}}^\nu \) and obey

$$\begin{aligned} \lim _{r \rightarrow \infty } r \sup _{|y| > r} |V(y)| = 0 \end{aligned}$$
(12.3)

Then \((-\Delta +V)\varphi =\lambda \varphi \) with \(\lambda > 0\) has no (non-zero) \(L^2\) solutions.

Remarks

  1.

    ODE techniques easily prove in one dimension and in arbitrary dimension if V is spherically symmetric, that there are no positive eigenvalues if \(\int _{1}^{\infty } |V(r)| \, dr < \infty \). This goes back at least to Weyl [680] who quotes results of Kneser [377]. In modern parlance, it follows from the existence of Jost solutions.

  2.

    Earlier, Brownell [74, Theorem 6.7] proved the absence of such eigenvalues under bounds of the form \(|V(x)| \le C_1 \exp (-C_2 |x|)\).

  3.

    There is both earlier and illuminating later work in the one dimensional (equivalently spherical symmetric) case. Let

    $$\begin{aligned} K \equiv \limsup _{|x| \rightarrow \infty } [|x||V(x)|] \end{aligned}$$
    (12.4)

    Kato proved in general dimension that there are no eigenvalues, E, with \(E \ge K^2\). The (corrected) Wigner–von Neumann example has \(K=8, E=1\) so one knows from that example that one can’t do better than \(K^2/64\) and it is easy to modify this example to show one can’t do better than \(K^2/4\). In 1948, Wallach [668] proved the bound \(E \le K^2\) in one dimension (extended by Borg [66] and Eastham [129]) and provided an example showing one couldn’t do better than \(K^2/4\). A breakthrough in this one dimensional case was made by Atkinson–Everitt [19] who proved there is no eigenvalue if \(E \ge 4K^2/\pi ^2\) and that there are examples with eigenvalues arbitrarily close to this bound. Note that \(4/\pi ^2 = .405\ldots \) lies in (1/4, 1). Their example is a relative of the Wigner–von Neumann example but uses \(\text {sgn}(\sin (r))\) in place of \(\sin (r)\). Their method using Prüfer transforms is very one dimensional. Eastham–Kalf [131] give a textbook presentation of this work and mention that Halvorsen (unpublished) also found the optimal \(4K^2/\pi ^2\). Remling [512] extended the Atkinson–Everitt result to prove no singular continuous spectrum in \([4K^2/\pi ^2,\infty )\).

  4.

    Kato proved results about more than \(L^2\) solutions. For example, he proved that if \(|V(x)| \le (1+|x|)^{-\alpha }\) near infinity with \(\alpha >1\), and if \((-\Delta +V)\varphi =\lambda \varphi \) with \(\lambda > 0\) and \(\varphi (x) \rightarrow 0\) as \(|x| \rightarrow \infty \), then \(\varphi \) vanishes near infinity (and depending on the structure of the singularities of V, one can often use unique continuation (see below) to conclude that \(\varphi \equiv 0\)). This will be useful in Sect. 15.

The observant reader may have noticed that since \(\tilde{g}(r)/r \rightarrow 2\) as \(r \rightarrow \infty \), the potential, V(x), given by (12.2) is \(\text {O}(r^{-2})\) so it seems to be a counterexample to Theorem 12.1! In fact, von Neumann–Wigner had a calculational error: in the middle they used \(\cos r/\sin r = \tan r\) (!) and this error produces a remarkable cancellation. Doing the calculation correctly yields

$$\begin{aligned} V(r) = -32 \sin r \frac{g(r)^3\cos r -3 g(r)^2 \sin ^3r +g(r) \cos r+\sin ^3 r}{[1+g(r)^2]^2} \end{aligned}$$
(12.5)

so that \(V(r) = -8 \sin (2r)/r + \text {O}(r^{-2})\) consistent with Kato’s theorem. I once pointed out this error to Wigner, who thought for a moment and then said to me: “Oh, Johnny did that calculation”.
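The corrected formula can be checked numerically. The following Python sketch assumes \(g(r)=2r-\sin (2r)\) (so that \(g'(r)=4\sin ^2 r\)) and uses the reduced radial function \(u(r)=r\psi (r)\), for which \((-\Delta +V)\psi =\psi \) becomes \(-u''+Vu=u\). It tests that (12.5) satisfies this equation up to finite difference error and that \(rV(r)+8\sin (2r)\) is \(\text {O}(1/r)\). The function names, step size and sample points are illustrative choices.

```python
import math


def g(r):
    # Assumed choice g(r) = 2r - sin(2r); then g'(r) = 4 sin^2(r).
    return 2.0 * r - math.sin(2.0 * r)


def u(r):
    # Reduced radial wave function u(r) = r * psi(r), with psi as in (12.1).
    return math.sin(r) / (1.0 + g(r) ** 2)


def V(r):
    # The corrected potential (12.5).
    s, c, G = math.sin(r), math.cos(r), g(r)
    numerator = G**3 * c - 3.0 * G**2 * s**3 + G * c + s**3
    return -32.0 * s * numerator / (1.0 + G**2) ** 2


def residual(r, h=1e-3):
    # -u'' + (V - 1) u, with u'' from a central difference; should be near 0.
    second_derivative = (u(r + h) - 2.0 * u(r) + u(r - h)) / h**2
    return -second_derivative + (V(r) - 1.0) * u(r)


def tail_defect(r):
    # r V(r) + 8 sin(2r); should decay like 1/r if V = -8 sin(2r)/r + O(r^-2).
    return r * V(r) + 8.0 * math.sin(2.0 * r)


if __name__ == "__main__":
    for r in (3.0, 5.5, 10.2, 20.7):
        print("residual", r, residual(r))
    for r in (50.0, 123.4, 400.0):
        print("tail defect", r, tail_defect(r))
```

The residual is limited only by the second order accuracy of the difference quotient, so shrinking h (within floating point limits) should shrink it further.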

Kato proved some differential inequalities on \(M(r) = r^{\nu -1}\int |\varphi (r\omega )|^2 d\omega \) (where \(d\omega \) is surface measure on the unit sphere) and used them to prove that if \(\int ^{\infty } M(r) dr < \infty \) [i.e. \(\varphi \in L^2({\mathbb {R}}^\nu )\)], then \(M(r) = 0\) for \(r > R_0\) for some \(R_0\). The final step in his proof needs a result that any solution of \((-\Delta +W)\varphi =0\) that vanishes on an open set is identically zero. This is called a unique continuation theorem (we note the analog fails for hyperbolic equations). Such theorems go back to Carleman [83] in 1939. He only treated \(\nu =2\) and required that \(V \in L^\infty \). The kind of estimates he used, now called Carleman estimates, have been a staple, not only of later work on unique continuation, but for many other topics in the theory of elliptic PDEs. Unique continuation when \(V \in L^\infty \) and \(\nu \ge 3\) was proven by Müller [449] in 1954 (see also Aronszajn [17]). So when Kato did his work, there was only unique continuation for bounded V’s. Thus, in the final step, one needs to know there is a compact set, S, of measure zero so that \({\mathbb {R}}^\nu {\setminus } S\) is connected and so that V is locally bounded on this connected set.

Starting in 1980, there were a number of unique continuation results with \(L^p_{loc}\) conditions on V culminating in the classic 1985 paper of Jerison–Kenig [291] who require (for \(\nu \ge 3\); for \(\nu =2\), the condition is more complicated) that \(V \in L^{\nu /2}_{loc}\) which is known to be optimal.

In fact, one only needs something weaker than unique continuation, namely that there are no eigenfunctions of compact support. We will discuss this shortly.

Ikebe–Uchiyama [269] extended Kato’s result to allow magnetic fields which are \(\text {o}(|x|^{-1})\) at infinity and Roze [530] allowed suitable non-constant coefficient second order elliptic terms.

In [178], Froese et al. proved a variant of Kato’s result. They first proved that if V is \(-\Delta \)-bounded and \((-\Delta +1)^{-1/2}(|x|V)(-\Delta +1)^{-1}\) is a compact operator, and if \((-\Delta +V)\varphi =\lambda \varphi ,\,\varphi \in D(H)\) and \(\lambda >0\), then \(e^{\alpha |x|} \varphi \in L^2\) for all \(\alpha > 0\). They then prove (and this also shows no compact support eigenfunctions) that if \(V(-\Delta +1)^{-3/4}\) is bounded, \(\lim _{\gamma \rightarrow \infty } ||V(-\Delta +\gamma )^{-3/4}|| = 0\) and \({\lim _{R \rightarrow \infty } ||\chi _R(1+|x|)V(-\Delta +1)^{-3/4}|| =0}\) (where \(\chi _R\) is the characteristic function of \(\{x\,|\, |x|>R\})\), then \((-\Delta +V)\varphi =\lambda \varphi \) and \(e^{\alpha |x|}\varphi \in L^2\) for all \(\alpha >0 \Rightarrow \varphi =0\). This provides a proof of a variant of Kato’s theorem without a need for pointwise bounds on V.

A very interesting alternate proof of a theorem very close to Kato’s is due to Vakulenko [660]. While Vakulenko and Yafaev [704] (who has a clear exposition of Vakulenko’s work) say that he recovers Kato’s result, instead he has a condition for a class of V’s with lots of overlap with, but distinct from, Kato’s condition (12.3). A Vakulenko bounding function, \(\eta (r)\), is a function on \((0,\infty )\) obeying:

$$\begin{aligned} \forall _{r \in (0,\infty )} \eta (r) >0; \qquad \lim _{r \downarrow 0} r\eta (r) = 0; \qquad \int _{0}^{\infty } \eta (r) dr < \infty \end{aligned}$$
(12.6)

A Vakulenko potential, V(x), on \({\mathbb {R}}^\nu \) is a measurable function for which there exists a Vakulenko bounding function, \(\eta (r)\), with

$$\begin{aligned} |V(x)| \le \eta (|x|) \end{aligned}$$
(12.7)

If \(\eta (r) = (1+r)^{-1-\epsilon }\) and V obeys (12.7), then V obeys both Vakulenko’s condition and Kato’s (12.3). If we consider \(V(x) = (1+|x|)^{-1} {[}\log (2+|x|)]^{-\alpha }\), then V obeys (12.3) if \(\alpha > 0\) but is only a Vakulenko potential if \(\alpha >1\). On the other hand, if

$$\begin{aligned} V(x) = \left\{ \begin{array}{ll} |x|^{-\beta }, &{} \hbox { if for some }n = 1,2,\ldots \; n^2<|x|<n^2+1\\ 0, &{} \hbox { otherwise} \end{array} \right. \end{aligned}$$
(12.8)

then V(x) obeys Kato’s (12.3) only if \(\beta >1\) but is a Vakulenko potential if \(\beta > 1/2\). So neither class is contained in the other, although they are very close. There is, of course, a connection between Vakulenko’s condition and the fact that in one dimension, it has been long known that if the potential is in \(L^1\), then the positive spectrum is purely absolutely continuous (as mentioned in Remark 1 after Theorem 12.1).
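The two claims about the example (12.8) can be checked directly. The following is a sketch: \(n_r\) denotes the smallest integer with \(n_r^2 \ge r\), V(s) is written for the value of V at \(|x|=s\), and the particular bounding function \(\eta \) at the end is one illustrative choice, not taken from [660].

$$\begin{aligned} r\,n_r^{-2\beta } \le r\sup _{|y|>r}|V(y)| \le r^{1-\beta }, \qquad \sum _{n=1}^{\infty }(n^2+1)^{-\beta } \le \int _0^\infty V(s)\,ds \le \sum _{n=1}^{\infty } n^{-2\beta } \end{aligned}$$

Since \(n_r^2 \le (\sqrt{r}+1)^2\), the first chain shows that (12.3) holds if and only if \(\beta >1\), while the second shows that the radial profile is integrable if and only if \(\beta >1/2\). In the latter case \(\eta (r)=V(r)+(1+r)^{-2}\) is strictly positive, integrable and bounded near \(r=0\), so it satisfies (12.6) and (12.7); if \(\beta \le 1/2\), no integrable \(\eta \) can dominate |V|.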

Theorem 12.2

(Vakulenko [660]) Let V(x) be a Vakulenko potential with (12.7) for some \(\eta \). Let \(H = -\Delta +V\) and let B be multiplication by \(\sqrt{\eta }\). Then for any \(0< a<b < \infty \), there is a relatively H-bounded operator, A, so that for all \(\lambda \in [a,b]\) and all \(\varphi \in D(H)\), we have that

$$\begin{aligned} \text {Re} \langle (H-\lambda )\varphi ,A\varphi \rangle \ge ||B\varphi ||^2 \end{aligned}$$
(12.9)

In Sect. 15, we’ll see that (12.9) has implications for local smoothness of B and implies strong spectral properties of H. We’ll also prove the theorem when \(\nu =1\) and say something about the proof for general \(\nu \). For now, we note that

Corollary 12.3

(Vakulenko [660]) If V is a Vakulenko potential and \(H=-\Delta +V\), then H has no positive eigenvalues.

Proof

Let \(\lambda > 0\). Pick a, b with \(0<a<\lambda<b<\infty \). If \(H\varphi =\lambda \varphi \) for \(\varphi \in D(H)\), then the left side of (12.9) vanishes, so we have that \(||B\varphi ||=0\). Since \(\eta \) is everywhere non-vanishing, we conclude that \(\varphi =0\). \(\square \)

The Wigner–von Neumann example has oscillations and one expects that if such oscillations are absent, then there should also be no positive eigenvalues. For example, if V(x) looks like \(r^{-\alpha }, \, 0<\alpha \le 1\), one expects that there should also be no positive eigenvalues. Odeh [468] proved that if \(\varvec{x}\cdot \varvec{\nabla } V \le 0\) for all large x, then Kato’s method could be modified to show there are no positive eigenvalues. Shortly thereafter, Agmon [2] and Simon [567], using Kato’s methods, independently proved (with enough local regularity to apply a unique continuation theorem) that there are no positive eigenvalues if \(V(x) = V_1(x)+V_2(x)\) so long as when \(x \rightarrow \infty \), one has that \(|x||V_1(x)| \rightarrow 0\), \(V_2(x) \rightarrow 0\) and \(\varvec{x}\cdot \varvec{\nabla }V_2(x) \rightarrow 0\). Most later works and, in particular, both Froese et al. [178] and Vakulenko [660], also considered such sums. Khosrovshahi et al. [367] and Kalf–Krishna Kumar [301] allow a third highly oscillatory piece and prove no positive eigenvalues (so for example, they allow \(r^{-1} \sin (r^\beta )\) for \(\beta > 1\) and Agmon–Simon allow \(\beta < 1\)).

Another way of extending Odeh’s result proves absence of positive eigenvalues using the Virial theorem as discussed below (see also the discussion of Lavine’s work in Sect. 15).

Before discussing more results on the absence of positive energy eigenvalues, we pause for some other examples, motivated by the Wigner–von Neumann example, where there are positive energy eigenvalues. By taking suitable sums of \(b_j \sin (\alpha _j r)/r\) (cutoff away from infinity), Naboko [451] and Simon [608] constructed, for each \(\delta >0\), V(x), bounded by \(r^{-1+\delta }\) near infinity with dense point spectrum. Here is one such result (taken from [608]):

Theorem 12.4

For any countable subset \(\{E_k\}_{k=1}^\infty \) of \((0,\infty )\) and any \(\epsilon ,\delta > 0\), there is V(x) on \((0,\infty )\) so that \(-\tfrac{d^2}{dx^2}+V(x)\) on \(L^2(0,\infty ;dx)\) with \(\varphi (0)=0\) boundary conditions has \(\varphi _k \in L^2\cap C^2(0,\infty )\) with \(\varphi _k(0)=0\) and \(-\varphi _k''+V\varphi _k = E_k\varphi _k\) and so that

$$\begin{aligned} |V(x)| \le \epsilon (1+|x|)^{-1+\delta } \end{aligned}$$
(12.10)

Remark

If \(0< \delta < 1/2\), it is known [93, 108, 369, 512] that \(-\tfrac{d^2}{dx^2}+V(x)\) has a.c. spectrum on all of \([0,\infty )\) so this is point spectrum embedded in continuous spectrum. As noted already, if \(\delta > 1/2\), one can find V’s with only point spectrum.

The Wigner–von Neumann and Naboko–Simon examples are spherically symmetric. Ionescu–Jerison [270] found examples where the slow \(\text {O}(r^{-1})\) decay is only in a parabolic tube about a single direction:

Theorem 12.5

(Ionescu–Jerison [270]) Fix \(\nu \ge 2\). There exists \(C>0\) and for each \(n=1,2,\ldots \) a potential obeying

$$\begin{aligned} |V(x_1,\ldots ,x_\nu )| \le \frac{C}{n+|x_1|+|x_2|^2+\cdots +|x_\nu |^2} \end{aligned}$$
(12.11)

and so that \((-\Delta +V)\varphi =\varphi \) has a non-zero \(L^2\) solution.

Frank–Simon [165] have simplified the Ionescu–Jerison construction by hewing more closely to the Wigner–von Neumann method. They use the wave function

$$\begin{aligned} \psi _n(x) = \sin x_1 (n^2+g(x_1)^2+(x_2^2+\cdots +x_\nu ^2)^2)^{-\alpha } \end{aligned}$$
(12.12)

where \(\alpha > \nu /4\) (which implies that \(\psi _n \in L^2\)) and g is given by (12.1). \(V_n\) is then defined by

$$\begin{aligned} V_n(x) = \frac{\Delta \psi _n+\psi _n}{\psi _n} \end{aligned}$$
(12.13)

which is seen to obey (12.11). [165] also has versions of the central Wigner–von Neumann potentials for dimensions different from 1 and 3.

Notice that (12.11) implies that \(V_n \in L^p({\mathbb {R}}^\nu )\) for any \(p > \tfrac{1}{2}(\nu +1)\). That says that the value of p in the following is optimal:

Theorem 12.6

(Koch–Tataru [384]) Let \(\nu \ge 2\). If \(V \in L^{p_1}({\mathbb {R}}^\nu )+L^{p_2}({\mathbb {R}}^\nu )\) where \(p_1=\tfrac{1}{2}\nu < p_2=\tfrac{1}{2}(\nu +1)\) (if \(\nu =2\), one needs to take \(p_1>1\)), then \(-\Delta +V\) has no eigenvalues in \((0,\infty )\).

Remarks

  1.

    Earlier Ionescu–Jerison [270] proved the weaker result where \(p_2=\tfrac{1}{2}(\nu +1)\) is replaced by \(p_2=\tfrac{1}{2}\nu \).

  2.

    As we noted above, by Theorem 12.5, \(p_2=\tfrac{1}{2}(\nu +1)\) is optimal. The lower bound on p is needed to assure esa–\(\nu \).

  3.

    The proof relies on \(L^p\) Carleman estimates and the machinery of [383].
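The \(L^p\) claim made just before Theorem 12.6 follows from an explicit integration of the right side of (12.11). As a sketch, write \(t=|x_1|\) and \(\rho =(x_2^2+\cdots +x_\nu ^2)^{1/2}\), and let \(c_\nu \) be the area of the unit sphere in \({\mathbb {R}}^{\nu -1}\) (with \(c_2=2\)); this notation is introduced only for this check.

$$\begin{aligned} \int _{{\mathbb {R}}^\nu } \frac{d^\nu x}{(n+|x_1|+\rho ^2)^{p}} = 2c_\nu \int _0^\infty \!\!\int _0^\infty \frac{\rho ^{\nu -2}\,dt\,d\rho }{(n+t+\rho ^2)^{p}} = \frac{2c_\nu }{p-1}\int _0^\infty \frac{\rho ^{\nu -2}\,d\rho }{(n+\rho ^2)^{p-1}} \end{aligned}$$

The t integration requires \(p>1\), and the remaining integral converges at infinity exactly when \(2(p-1)-(\nu -2)>1\), i.e. when \(p>\tfrac{1}{2}(\nu +1)\); since \(n \ge 1\) there is no singularity at finite distance.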

In many ways, the most subtle results on the absence of positive eigenvalues concern N-body systems. After all, we saw in Sects. 3 and 4 in Part 1 (Examples 3.2 and 3.2 revisited) that N-body systems can have eigenvalues embedded in negative continua without carefully tuned potentials due to either non-interacting clusters or due to an eigenvalue of one symmetry embedded in a continuum of another symmetry. The earliest N-body results involve the Virial Theorem and showed no positive eigenvalues under specialized circumstances, for example repulsive potentials and also V’s homogeneous of degree \(\beta \) (i.e. \(V(\lambda \overrightarrow{x}) = \lambda ^\beta V(\overrightarrow{x}), \, 0>\beta > -2\)) which includes the physically important Coulomb case. This is discussed in Weidmann [674], Albeverio [8] and Kalf [298] (or [497, Theorems XIII.59 and XIII.60]).

Undoubtedly, the deepest results on lack of positive eigenvalues for N-body systems are in Froese–Herbst [176]. They assume that \(V_{ij}(r) = v_{ij}(r_i-r_j)\) where the \(v_{ij}\), as functions on \({\mathbb {R}}^\nu \), are such that \(v_{ij}(-\Delta +1)^{-1}\) and \((-\Delta +1)^{-1}(y\cdot \nabla _y v_{ij})(y) (-\Delta +1)^{-1}\) are compact (here \(\Delta \) is the Laplacian and all operators act on \({\mathbb {R}}^\nu \)). These hypotheses are made so that Mourre theory applies (see Mourre [448], Perry et al. [482], Froese–Herbst [177], Amrein et al. [12] and Sahbani [533]).

One takes N particles \((x_1,\ldots ,x_N), x_j \in {\mathbb {R}}^\nu \) and defines

$$\begin{aligned} |x| = \left( 2\sum _{j=1}^{N} m_j |x_j-R|^2\right) ^{1/2} \end{aligned}$$
(12.14)

where \(R = \left( \sum _{j=1}^{N} m_j\right) ^{-1}\left( \sum _{j=1}^{N} m_j x_j\right) \). If we are looking at a Hamiltonian on \(L^2({\mathbb {R}}^{\nu (N-1)})\) with center of mass motion removed or if we have some \(v_j\) representing interactions with infinite mass particles, then we act on \(L^2({\mathbb {R}}^{\nu N})\), and set \(R=0\). What Froese–Herbst found is

Theorem 12.7

(Froese–Herbst [176]) Under the above hypotheses, if \(H\psi = \lambda \psi ,\,\psi \in L^2({\mathbb {R}}^\kappa )\), then

$$\begin{aligned} \beta \equiv \sup _{\alpha \ge 0} \{\alpha ^2+\lambda \,|\, e^{\alpha |x|}\psi \in L^2\} \in {\mathcal T}\cup \{\infty \} \end{aligned}$$
(12.15)

where \({\mathcal T}\) is the set of thresholds of the system (see Sect. 11 for a discussion of thresholds).

If there are no positive thresholds (which one can prove inductively if there is a way to prove no positive eigenvalues), then if \(\lambda >0\), the \(\beta \) in (12.15) must be \(\infty \). For suitable two body systems, we saw above that eigenfunctions can’t obey \(e^{\alpha |x|}\psi \in L^2\) for all \(\alpha >0\). Froese et al. [179] proved the same for suitable N-body systems (see the paper for precise conditions); see also [600, Theorem C.3.8]. In this way, one proves certain N-body systems have no positive eigenvalues.

The above touched on \(L^2\) isotropic exponential bounds (and as we’ll see in Sect. 19, that implies pointwise exponential bounds). There is a huge and beautiful literature on this subject and on non-isotropic bounds. We refer the reader to the book of Agmon [5] and the review article of Simon [600] which contains many references.

3 Scattering and spectral theory, I: trace class perturbations

This is the first of four sections on spectral and scattering theory. For the 15 years between 1957 and 1972, this area was a major focus of Kato. When Kato was invited to give a plenary lecture at the 1970 International Congress of Mathematicians, his talk [339] was entitled “Scattering Theory and Perturbation of Continuous Spectra” (interestingly enough, Agmon and Kuroda gave invited talks at the same congress and spoke on closely related subjects). This section and the next two have brief introductory remarks introducing this subject. This section’s introduction has much of the background we’ll give on scattering theory, the next section discusses the basics of spectral theory and something about the connection between time-independent and time-dependent scattering theory and Sect. 15 will say more about the background behind the time-independent approach.

figure a

Birmingham, AL, Meeting on Differential Equations, 1983.

Back row: Fröhlich, Yajima, Simon, Temam, Enss, Kato, Schechter, Brezis, Carroll, Rabinowitz.

Front row: Crandall, Ekeland, Agmon, Morawetz, Smoller, Lieb, Lax.

Starting with Rutherford’s 1911 discovery of the atomic nucleus, scattering has been a central tool in fundamental physics, so it isn’t surprising that one of the first papers in the new quantum theory was by Born [67] on scattering. At its root, scattering is a time-dependent phenomenon: something comes in, interacts and moves off. But since it relied on eigenfunctions, Born’s work used time-independent objects. He assumed that one could construct non-\(L^2\) eigenfunctions, \({(-\Delta +V)\varphi =k^2\varphi ,}\,(\overrightarrow{k} \in {\mathbb {R}}^3, k=|\overrightarrow{k}|)\) which as \(r \rightarrow \infty \) looks like

$$\begin{aligned} \varphi (\overrightarrow{x}) \sim e^{i\overrightarrow{k}\cdot \overrightarrow{x}}+f(\theta )\frac{e^{ikr}}{r};\quad r=|\overrightarrow{x}|\quad \overrightarrow{k}\cdot \overrightarrow{x} = kr \cos (\theta ) \end{aligned}$$
(13.1)

The time dependence gives \(e^{-itH}\varphi (x)\) a \(e^{i\overrightarrow{k}\cdot (\overrightarrow{x}-\overrightarrow{k}t)}\) term which is a usual plane wave with velocity \(\overrightarrow{k}\) and a scattered wave \(f(\theta )r^{-1}e^{ik(r-kt)}\). One expects such a term to live near points where \(r=kt\). So if \(t<0\) that term should not contribute (since \(r>0\)) while for t positive and large we have an outgoing spherical wave representing the scattering. We’ll say a little more about making mathematical sense of this formal argument in Sect. 15. \(|f(\theta )|^2\) was then interpreted as a scattering differential cross section. Born also found a leading order perturbation formula for \(f(\theta )\):

$$\begin{aligned} f(\theta ) = -(4\pi )^{-1} \int e^{i(\overrightarrow{k'}-\overrightarrow{k})\cdot \overrightarrow{x}}V(\overrightarrow{x})\,d\overrightarrow{x} \end{aligned}$$
(13.2)

where \(k'=k\) and \(\overrightarrow{k'}\cdot \overrightarrow{k}=k^2\cos \theta \). This Born approximation turns out to be leading order not only in V but also, for V fixed, as \(k \rightarrow \infty \).

In the early 1940s, the theoretical physics community first considered time dependent approaches to scattering. Wheeler [682] and Heisenberg [226, 227] defined the S-matrix and Møller [446] introduced wave operators as limits (with no precision as to what kind of limit).

It was Friedrichs in a prescient 1948 paper [172] who first considered the invariance of the absolutely continuous spectrum under sufficiently regular perturbations. Friedrichs was Rellich’s slightly older contemporary. Both were students of Courant at Göttingen in the late 1920s (in 1925 and 1929 respectively). By 1948, Friedrichs was a professor at Courant’s institute at NYU. Friedrichs considered two classes of examples in this paper. One was the model mentioned in Example 3.1 of a perturbation of an embedded point eigenvalue. The other was \(H=H_0+\lambda K\) where \(H_0\) is multiplication by x on \(L^2([0,1],dx)\) and K is a Hermitian integral operator with an integral kernel \(K(x,y)\) assumed to vanish on the boundary (i.e. if x or y is 0 or 1) and to be Hölder continuous in x and y. Using what we’d call time-independent methods, Friedrichs constructed unitary operators, \(U_\lambda \), for \(\lambda \) sufficiently small, so that

$$\begin{aligned} H_0+\lambda K = U_\lambda H_0 U_\lambda ^{-1} \end{aligned}$$
(13.3)

While Friedrichs neither quoted Møller nor ever wrote down the explicit formulae

$$\begin{aligned} \Omega ^{\pm }(H,H_0) = {\text {s}-\lim }_{t \rightarrow {\mp } \infty } e^{itH}e^{-itH_0} \end{aligned}$$
(13.4)

(we remind the reader that the strange \({\pm }\) versus \({{\mp }}\) convention that we use is universal in the theoretical physics community and uncommon among mathematicians and is not the convention that Kato used), he did prove something equivalent to showing that the limit \(\Omega ^+\) existed and was \(U_\lambda \) and that the limit \(\Omega ^-\) existed and was equal to \(S_\lambda \Omega ^+\). Here \(S_\lambda \) was an operator he constructed and identified with the S-matrix (although it differs slightly from what is currently called the S-matrix).

Motivated in part by Friedrichs, in 1957, Kato published two papers [326, 327] that set out the basics of the theory we will discuss in this section. In the first, he had the important idea of defining

$$\begin{aligned} \Omega ^{\pm }(A,B) = {\text {s}-\lim }_{t \rightarrow {\mp } \infty } e^{itA}e^{-itB} P_{ac}(B) \end{aligned}$$
(13.5)

where \(P_{ac}(B)\) is the projection onto \({\mathcal H}_{ac}(B)\), the set of all \(\varphi \in {\mathcal H}\) for which the spectral measure of B and \(\varphi \) is absolutely continuous with respect to Lebesgue measure (see [616, Sect. 5.1] or the discussion at the start of Sect. 14). If these strong limits exist, we say that the wave operators \(\Omega ^{\pm }(A,B)\) exist.

By replacing t by \(t+s\), one sees that if \(\Omega ^{\pm }(A,B)\) exist then \(e^{isA}\Omega ^{\pm }=\Omega ^{\pm } e^{isB}\). Since \(\Omega ^{\pm }\) are unitary maps, \(U^{\pm }\), of \({\mathcal H}_{ac}(B)\) to their ranges, we see that \(A\restriction {\text {ran}}\,\Omega ^{\pm } = U^{\pm }\left( B\restriction {\mathcal H}_{ac}(B)\right) (U^{\pm })^{-1}\). In particular, \({\text {ran}}\, \Omega ^{\pm }\) are invariant subspaces for A and lie in \({\mathcal H}_{ac}(A)\). It is thus natural to define: \(\Omega ^{\pm }(A,B)\) are said to be complete if

$$\begin{aligned} {\text {ran}}\, \Omega ^+(A,B) = {\text {ran}}\, \Omega ^-(A,B) = {\mathcal H}_{ac}(A) \end{aligned}$$
(13.6)
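
The intertwining relation \(e^{isA}\Omega ^{\pm }=\Omega ^{\pm } e^{isB}\) stated before (13.6) can be checked in one line. The following is only a sketch: it uses the definition (13.5), the boundedness of \(e^{isA}\), and the fact that \(P_{ac}(B)\) commutes with \(e^{isB}\). For fixed s and \(\varphi \in {\mathcal H}\), substituting \(u=t+s\) gives

$$\begin{aligned} e^{isA}\Omega ^{\pm }(A,B)\varphi&= \lim _{t \rightarrow {\mp }\infty } e^{i(t+s)A}e^{-itB}P_{ac}(B)\varphi \\&= \lim _{u \rightarrow {\mp }\infty } e^{iuA}e^{-iuB}P_{ac}(B)\,e^{isB}\varphi = \Omega ^{\pm }(A,B)\,e^{isB}\varphi \end{aligned}$$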

Remarks

  1.

    Kato also noted the relation

    $$\begin{aligned} \Omega ^{\pm }(A,B)\Omega ^{\pm }(B,C)=\Omega ^{\pm }(A,C) \end{aligned}$$
    (13.7)

    in that if both wave operators on the left exist, so does the one on the right and one has the equality.

  2.

    The wisdom of taking \(P_{ac}(B)\) in the definition of wave operator is shown by the fact that it follows from results of Aronszajn [18] and Donoghue [123] (see also Simon [592, 593]) that if \(A-B=\langle \varphi ,\cdot \rangle \varphi \) with \(\varphi \) a cyclic vector for B then \(e^{itA}e^{-itB}\psi \) has a limit if and only if \(\psi \in {\mathcal H}_{ac}(B)\).

In [326], Kato proved the following

Theorem 13.1

(Kato [326]) Let \(\Omega ^{\pm }(A,B)\) exist. Then they are complete if and only if \(\Omega ^{\pm }(B,A)\) exist.

The proof is almost trivial. It depends on noting that

$$\begin{aligned} \psi = \lim _{t \rightarrow \infty } e^{iAt}e^{-itB}\varphi \iff \varphi = \lim _{t \rightarrow \infty } e^{itB}e^{-itA}\psi \end{aligned}$$
(13.8)

since

$$\begin{aligned} ||\psi -e^{iAt}e^{-itB}\varphi || = ||e^{itB}e^{-itA}\psi -\varphi || \end{aligned}$$
(13.9)

That said, it is a critical realization because it reduces a completeness result to an existence theorem. In particular, it implies that symmetric conditions which imply existence also imply completeness. We’ll say more about this below.
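
For the reader who wants the reason behind (13.9) spelled out, here is a sketch that uses only the fact that \(e^{itB}e^{-itA}\) is unitary and so preserves norms:

$$\begin{aligned} ||\psi -e^{itA}e^{-itB}\varphi || = ||e^{itB}e^{-itA}\left( \psi -e^{itA}e^{-itB}\varphi \right) || = ||e^{itB}e^{-itA}\psi -\varphi || \end{aligned}$$

Thus one side of (13.8) tends to zero exactly when the other does. Taking \(\varphi \in {\mathcal H}_{ac}(B)\), this says that the limit defining \(\Omega ^{\pm }(B,A)\psi \) exists for every \(\psi \in {\text {ran}}\,\Omega ^{\pm }(A,B)\), which is why completeness (the range being all of \({\mathcal H}_{ac}(A)\)) and existence of \(\Omega ^{\pm }(B,A)\) go together.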

To show the importance of this idea: motivated by it, Deift and Simon [110] proved that completeness of multichannel scattering for N-body systems was equivalent to the existence (using the N-body language of Sect. 11) of \(\text {s}-\lim _{t \rightarrow {\pm }\infty } e^{itH({\mathcal {C}})}J_{\mathcal {C}}e^{-itH}P_{ac}(H)\) for the partition of unity \(\{J_{\mathcal {C}}\}_{{\mathcal {C}}\ne {\mathcal {C}}_{min}}\) discussed in Remark 4 after Theorem 11.8. All proofs of asymptotic completeness for N-body systems prove it by showing the existence of these Deift–Simon wave operators in support of Kato’s Theorem 13.1.

In [326], Kato proved

Theorem 13.2

(Kato [326]) Let \(H_0\) be a self-adjoint operator and V a (bounded) self-adjoint, finite rank operator. Then \(H=H_0+V\) is a self-adjoint operator and the wave operators \(\Omega ^{\pm }(H,H_0)\) exist and are complete.

This implies the unitary equivalence of \(H\restriction {\mathcal H}_{ac}(H)\) and \(H_0\restriction {\mathcal H}_{ac}(H_0)\). Remarkably, in the same year Aronszajn [18] proved that this invariance holds for finite rank perturbations of boundary conditions for Sturm–Liouville operators (extended later using similar ideas by Donoghue [123] to general finite rank perturbations). Their methods are totally different from Kato’s and do not involve wave operators.

Later in 1957, Kato [327] proved

Theorem 13.3

(Kato–Rosenblum Theorem) The conclusions of Theorem 13.2 remain true if V is a (bounded) trace class operator.

In a sense this theorem is optimal. It is a result of Weyl–von Neumann [665, 679] (see [616, Theorem 5.9.2]) that if A is a self-adjoint operator, one can find a Hilbert–Schmidt operator, C, so that \(B=A+C\) has only pure point spectrum. Kato’s student, Kuroda [394], shortly after Kato proved Theorem 13.3, extended this result of Weyl–von Neumann to any trace ideal strictly bigger than trace class. So within trace ideal perturbations, one cannot do better than Theorem 13.3.

The name given to this theorem comes from the fact that before Kato proved Theorem 13.3, Rosenblum [528] proved a special case that motivated Kato: namely, if A and B have purely a.c. spectrum and \(A-B\) is trace class, then \(\Omega ^{\pm }(A,B)\) exist and are unitary (so complete).

I’d always assumed that Rosenblum’s paper was a rapid reaction to Kato’s finite rank paper which, in turn, motivated Kato’s trace class paper. But I recently learned that this assumption is not correct. Rosenblum was a graduate student of Wolf at Berkeley who submitted his thesis in March 1955. It contained his trace class result with some additional technical hypotheses; a Dec. 1955 Berkeley technical report had the result as eventually published without the extra technical assumption. Rosenblum submitted a paper to the American Journal of Mathematics which took a long time refereeing it before rejecting it. In April 1956, Rosenblum submitted a revised paper to the Pacific Journal in which it eventually appeared (this version dropped the technical condition; I’ve no idea what the original journal submission had).

Kato’s finite rank paper was submitted to J. Math. Soc. Japan on March 15, 1957 and was published in the issue dated April, 1957(!). The full trace class result was submitted to Proc. Japan Acad. on May 15, 1957. Kato’s first paper quotes an abstract of a talk Rosenblum gave to an A.M.S. meeting but I don’t think that abstract contained many details. This finite rank paper has a note added in proof thanking Rosenblum for sending the technical report to Kato, quoting its main result and saying that Kato had found the full trace class results (“Details will be published elsewhere”.). That second paper used some technical ideas from Rosenblum’s paper.

I’ve heard that Rosenblum always felt that he’d not received sufficient credit for his trace class paper. There is some justice to this. The realization that trace class is the natural class is important. As I’ve discussed, trace class is maximal in a certain sense. Kato was at Berkeley in 1954 when Rosenblum was a student (albeit some time before his thesis was completed) and Kato was in contact with Wolf. However, there is no indication that Kato knew anything about Rosenblum’s work until shortly before he wrote up his finite rank paper when he became aware of Rosenblum’s abstract. My surmise is that both, motivated by Friedrichs, independently became interested in scattering.

It should be emphasized that 1956–1957 was a year that (time-dependent) scattering theory seemed to be in the air. Cook [96] found a simple, later often used, method for proving that \(\Omega ^{\pm }(A,B)\) exists: if \(\int _{-\infty }^{\infty } ||(A-B)e^{-iuB}\varphi ||\, du < \infty \), then by integrating a derivative

$$\begin{aligned} \limsup _{\begin{array}{c} t,s \rightarrow \infty \\ \text {or }t,s \rightarrow -\infty \end{array}} ||e^{itA}e^{-itB}\varphi -{e^{isA}e^{-isB}\varphi }|| \le \lim \int _{s}^{t} ||(A-B)e^{-iuB}\varphi ||\, du = 0\nonumber \\ \end{aligned}$$
(13.10)

so it suffices that

$$\begin{aligned} \int _{-\infty }^{\infty } ||(A-B)e^{-iuB}\varphi ||\, du < \infty \end{aligned}$$
(13.11)

for a dense set of \(\varphi \) for \(\Omega ^{\pm }(A,B)\) to exist. Cook applied this to \(B=-\Delta ;\,A=-\Delta +V;\, V \in L^2({\mathbb {R}}^3)\) (which translates to \(\text {O}(|x|^{-3/2-\epsilon })\) decay). Hack [213] and Kuroda [395] extended this to allow \(\text {O}(|x|^{-1-\epsilon })\) decay.
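
The phrase “integrating a derivative” can be unpacked as follows. This is only a sketch, under the assumption (made here for illustration) that \(e^{-iuB}\varphi \in D(A)\cap D(B)\) for all u, so that the differentiation is legitimate; we take \(s \le t\).

$$\begin{aligned} \frac{d}{du}\left( e^{iuA}e^{-iuB}\varphi \right)&= ie^{iuA}(A-B)e^{-iuB}\varphi \\ e^{itA}e^{-itB}\varphi -e^{isA}e^{-isB}\varphi&= i\int _{s}^{t} e^{iuA}(A-B)e^{-iuB}\varphi \, du \\ ||e^{itA}e^{-itB}\varphi -e^{isA}e^{-isB}\varphi ||&\le \int _{s}^{t} ||(A-B)e^{-iuB}\varphi ||\, du \end{aligned}$$

When (13.11) holds, the right side tends to zero as \(s,t \rightarrow \infty \) (or \(-\infty \)), so \(e^{itA}e^{-itB}\varphi \) is Cauchy; a density argument, using that the operators \(e^{itA}e^{-itB}\) are uniformly bounded, then gives the limit for all vectors in the closure of the dense set.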

Since, for the free dynamics, \(x \sim ct\), one expects and can prove that if \(\alpha \le 1\), then \(\int _{-\infty }^{\infty } ||(1+|x|)^{-\alpha } e^{iu\Delta }\varphi ||\, du = \infty \) for all \(\varphi \). Indeed, Dollard [121] showed that one needs modified wave operators for Coulomb potentials (again, there is a large literature on the subject of Coulomb or slower decay of which we mention Christ–Kiselev [93] and Dereziński and Gérard [116]).

Extensions of Cook’s ideas and other scattering theory notions to quadratic form perturbations can be found in Kuroda [397], Schechter [537], Simon [584] and Kato [350]. Kato states his results in a two Hilbert space setting (see below). J is a bounded linear operator from \({\mathcal H}_1\) to \({\mathcal H}_2\) and \(H_j\) are self-adjoint operators on \({\mathcal H}_j; \, j=1,2\). For \(z \in {\mathbb {C}}{\setminus }{\mathbb {R}}\), let \(C(z) = (H_2-z)^{-1}J-J(H_1-z)^{-1}\). Kato proves that if for some z and \(\varphi \in {\mathcal H}_1\), one has that

$$\begin{aligned} \int _{0}^{\infty } ||C(z)e^{-itH_1}\varphi ||_2\, dt < \infty \end{aligned}$$
(13.12)

then

$$\begin{aligned} \lim _{t \rightarrow \infty } e^{itH_2}Je^{-itH_1}\varphi \text { exists} \end{aligned}$$
(13.13)

He then shows that this allows some cases where \(H_2\) is only defined as a quadratic form, e.g. \(H_1=-\Delta , H_2=-\Delta +V\) with \(V \ge 0,\,V \in L^1({\mathbb {R}}^3,(1+|x|)^{1-\epsilon }\, dx)\).

For many years, it was thought that this simple idea of Cook was limited to existence but not useful for completeness or spectral theory. This was overturned by a brilliant paper of Enss [139] (see also Perry [481], Reed–Simon [496, Sect. XI.17] or Simon [589]), a subject we will not pursue here.

In 1958–1959, there were also several influential papers by Jauch [282] and Jauch and Zinnes [283] that discussed scattering in a general framework.

In considering extensions of the Kato–Rosenblum theorem, I begin with four issues that involve work by Kato himself. First, we discuss proofs. Like Friedrichs, both Kato and Rosenblum proved that a time-dependent limit exists by first constructing objects with time-independent methods which they then prove are the required limits. The first fully time-dependent proof of Theorem 13.3 is in a Japanese language paper by Kato [328] also published in 1957. His argument was repeated with permission in a paper by his student Kuroda [396]. The slickest version of this time-dependent proof is in Kato’s 1966 book [345]. It is a variant of this argument that Pearson used in his proof of Theorem 13.4 below.

The second concerns Kato’s paper [334] on what is called the invariance principle: for suitable functions \(\Phi \), one shows that \(A-B\) trace class \(\Rightarrow \Omega ^{\pm }(\Phi (A),\Phi (B))\) exist and are complete. In case \(\Phi \) is strictly monotone increasing (respectively decreasing), one has that \(\Omega ^{\pm }(\Phi (A),\Phi (B))=\Omega ^{\pm }(A,B)\) (resp. \(\Omega ^{\mp }(A,B)\)). The first examples of this phenomenon are due to Birman [57, 58]. Kato focused on the general form of the principle. There is a considerable literature on non-trace class versions of an invariance principle; see [496, Notes to Section XI.3] for references.

The third involves two Hilbert space scattering theory [336]. This came out of a set of concrete problems. In Sect. 8 in Part 1 (see the discussion beginning with (8.6)), we saw that the equation \(\tfrac{\partial ^2 u}{\partial t^2} = (\Delta -V)u\) had a unitary propagation in the norm \(\left[ ||\dot{u}||_2^2+\langle u,(-\Delta +V)u \rangle \right] ^{1/2}\). This means that to compare solutions of this equation to, say, the one with \(V=0\), one needs to consider two different Hilbert space norms. If for some \(0< \alpha<\beta < \infty \) one has for all x that \(\alpha \le V(x) \le \beta \), then there is a natural map, J, between the two spaces so that J is bounded with bounded inverse which takes \(\varphi \) viewed as an element of one Hilbert space into itself but viewed in the other Hilbert space. One is interested in the limit in (13.13) (and also the limit as \(t \rightarrow -\infty \)). A similar setup applies to other hyperbolic systems, especially to the physically significant Maxwell’s equations. Long after Kato’s work on the subject, Isozaki–Kitada [274] discovered one could use a J operator to discuss long range scattering where ordinary wave operators do not exist. Before [336], several authors (Schmidt [538], Shenk [552], Thoe [644], Wilcox [685]) discussed scattering theory for some concrete examples of such systems. Kato [336] looked at the theory systematically, focusing, for example, on J’s with \(\text {s--}\lim _{t \rightarrow {\pm }\infty } (J^*J-{\varvec{1}})e^{-itH_1}=0\) which implies that the wave operators are isometries if they exist. Under certain invertibility hypotheses on J, Kato could carry over the usual trace class scattering theory to get some two Hilbert space results. Stronger results were subsequently obtained by Belopol’skii–Birman [46], Birman [60] and then Pearson [476] who proved

Theorem 13.4

(Pearson’s Theorem [476]) Let A and B be self-adjoint operators on Hilbert spaces \({\mathcal H}_2\) and \({\mathcal H}_1\) respectively. Let J be a bounded operator from \({\mathcal H}_1\) to \({\mathcal H}_2\) so that \(C=AJ-JB\) is trace class (in the sense that there is a bounded operator C from \({\mathcal H}_1\) to \({\mathcal H}_2\) with \(\sqrt{C^*C}\) trace class and for \(\varphi \in D(B)\) and \(\psi \in D(A)\) we have that \(\langle A\psi ,J\varphi \rangle -\langle \psi ,JB\varphi \rangle =\langle \psi ,C\varphi \rangle \)). Then

$$\begin{aligned} \Omega ^{\pm }(A,B;J)={\text {s--}\lim }_{t \rightarrow {\mp }\infty } e^{itA}Je^{-itB}P_{ac}(B) \end{aligned}$$
(13.14)

exist.

No completeness is claimed (e.g., consider \(J=0\)) but one can sometimes get completeness. For example, if \({\mathcal H}_1={\mathcal H}_2={\mathcal H}\) and \(A, B \ge 0\) are two positive operators on \({\mathcal H}\) so that \((A+1)^{-1}-(B+1)^{-1}\) is trace class, then one can pick \(J=(A+1)^{-1}(B+1)^{-1}\). C is trace class, so \(\Omega ^{\pm }(A,B;J)\) exist. Apply this to \((B+1)\varphi \) to see that \(\Omega ^{\pm }(A,B;(A+1)^{-1})\) exists. Since \((A+1)^{-1}-(B+1)^{-1}\) is compact, the Riemann–Lebesgue lemma shows that \(\Omega ^{\pm }(A,B;(A+1)^{-1}-(B+1)^{-1})=0\). It follows that \(\Omega ^{\pm }(A,B;(B+1)^{-1})\) exists. Applying this to \((B+1)\varphi \), we see that \(\Omega ^{\pm }(A,B)\) exists. By symmetry, it is complete. We thus recover Birman’s result (see below) that \((A+1)^{-1}-(B+1)^{-1}\) trace class implies that \(\Omega ^{\pm }(A,B)\) exists and is complete. Pearson’s proof is a clever variant of Kato’s time-dependent proof from [345]; see [496, pp. 33–38] for details and further applications.
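
The claim that C is trace class for this choice of J can be verified directly. As a sketch, write \(R_A=(A+1)^{-1}\) and \(R_B=(B+1)^{-1}\) (shorthand introduced here for illustration only) and use \(AR_A={\varvec{1}}-R_A\) and, on \(D(B)\), \(R_BB={\varvec{1}}-R_B\):

$$\begin{aligned} C = AJ-JB&= AR_AR_B-R_AR_BB \\&= ({\varvec{1}}-R_A)R_B-R_A({\varvec{1}}-R_B) = R_B-R_A \end{aligned}$$

which is trace class by hypothesis. The step “apply this to \((B+1)\varphi \)” then rests on the fact that \(R_B\) commutes with \(e^{-itB}\), so that for \(\varphi \in D(B)\) one has \(e^{itA}Je^{-itB}(B+1)\varphi = e^{itA}R_Ae^{-itB}\varphi \).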

Example 13.5

The fourth of Kato’s applications/extensions of the trace class theory is an example in a joint paper with Kuroda [361]. They consider three Hamiltonians on \(L^2({\mathbb {R}}^2,d^2 x)\):

$$\begin{aligned} H_0=-\frac{\partial ^2}{\partial x_1^2}-\frac{\partial ^2}{\partial x_2^2}; \quad H_1=H_0+V(x_2);\quad H=H_1+K \end{aligned}$$
(13.15)

where \(V \in L^1({\mathbb {R}})\cap L^2({\mathbb {R}})\) and K is a rank 1 operator, \(Ku = c\langle \varphi ,u \rangle \varphi \) with \(\varphi \) a norm 1 function in \(L^2({\mathbb {R}}^2)\) and c is a constant. Moreover, they pick V so that \(h_1 = -\frac{d^2}{dx^2}+V(x)\), as an operator on \(L^2({\mathbb {R}})\), has exactly one eigenvalue in \((-\infty ,0]\).

Let \(h_0=-\frac{d^2}{dx^2}\). By results of Kuroda [395], using the trace class theory, \(\Omega ^{\pm }(h_1,h_0)\) exist and are complete. Since \(H_0, H_1\) are of the form \(H_j={\varvec{1}}\otimes h_j+h_0\otimes {\varvec{1}}\), one sees that \(\Omega ^\pm (H_1,H_0)\) exist with \({\text {ran}}\,\Omega ^+(H_1,H_0)={\text {ran}}\,\Omega ^-(H_1,H_0)\). But they are not complete because \({\mathcal H}_{ac}(H_1)\) has vectors of the form \(\psi \otimes \varphi _0\) where \(\psi \in L^2({\mathbb {R}})\) and \(\varphi _0\) is the bound state of \(h_1\).

Since K is rank 1, \(\Omega ^{\pm }(H,H_1)\) exist and so by the chain rule \(\Omega ^{\pm }(H,H_0)\) exist. But by a calculation, K links the two parts of the a.c. spectrum of \(H_1\), at least for c small. Thus they claim that, for c small, \({\text {ran}}\,\Omega ^+(H,H_0) \ne {\text {ran}}\,\Omega ^-(H,H_0)\) and the S-matrix is non-unitary. Hence the title of their paper “A Remark on the Unitarity Property of the Scattering Operator”.

However, as Kuroda [393] subsequently noted, this analysis leaves something out. The S-matrix is unitary if one looks at the right S-matrix! This is a multichannel system and if one includes also the channel for \(\{\psi \otimes \varphi _0\}\), the arguments do imply unitarity. So rather than find a non-unitary S-matrix, they found the first example of a multichannel scattering system with asymptotic completeness!

We conclude this section with some brief remarks on developments in the trace class scattering theory subsequent to Kato’s original work. Many of the significant results are due to M. S. Birman, so much so that the theory has taken the name Kato–Birman theory.

  (1)

    A first key issue was making the theory apply to Schrödinger operators, \(H_0=-\Delta , H=-\Delta +V\) on \(L^2({\mathbb {R}}^\nu )\). The pioneer was Kato’s student, Kuroda, who first proved an extension of the Kato–Rosenblum theorem. If V is \(H_0\)-bounded with relative bound less than 1 and \(|V|^{1/2}(H_0+1)^{-1}\) is Hilbert–Schmidt, then Kuroda proved that \(\Omega ^{\pm }(H,H_0)\) exist and are complete. He used this to prove existence and completeness if \(\nu \le 3\) and \(V \in L^1({\mathbb {R}}^\nu )\cap L^2({\mathbb {R}}^\nu )\). In terms of V’s with

    $$\begin{aligned} |V(x)| \le C(1+|x|)^{-\alpha } \end{aligned}$$
    (13.16)

    this requires \(\alpha >\nu \) whereas existence by Cook’s method only needs \(\alpha >1\), so for \(\nu \ge 2\), there is a gap that we’ll discuss much more in the next two sections. Kuroda also noted that if \(V(\overrightarrow{x})=V(|\overrightarrow{x}|)\) is a central potential, then, for any \(\nu \), one can do a partial wave expansion (see [615, Theorem 3.5.8]) and reduce the problem to half-line problems. Since it is known that, when (13.16) holds for any \(\alpha >0\), the essential spectrum for the half-line problem is \([0,\infty )\) and the spectrum is simple, one can see that existence implies completeness without needing the trace class theory.

  (2)

    Birman is responsible for a wide variety of extensions and applications of the trace class theory. First, he proved with Krein [61] an extension to the situation where U and V are two unitaries for which \(V-U\) is trace class. In that case, \(\text {s--}\lim _{n \rightarrow {\pm }\infty }(V^*)^nU^n P_{ac}(U)\) exists, has range \({\text {ran}}\,P_{ac}(V)\) and is a unitary equivalence of the a.c. parts of U and V. Secondly [57, 58], he proved that if A, B are self-adjoint and \((A-z)^{-1}-(B-z)^{-1}\) is trace class for some \(z \notin \sigma (A)\cup \sigma (B)\), then \(\Omega ^{\pm }(A,B)\) exist and are complete (de Branges [107] proved the same result). Kuroda’s result on \(|V|^{1/2}(H_0+1)^{-1}\) Hilbert–Schmidt follows from this. Later Birman [59] proved that if \(P_I(A)(A-B)P_I(B)\) is trace class for all bounded intervals, I, and if a technical condition called mutual subordinancy holds, then \(\Omega ^{\pm }(A,B)\) exist and are complete. His proof was involved but using Pearson’s Theorem (Theorem 13.4), one can easily prove this result of Birman (see [496, Theorem XI.10]). With this result, one can prove existence and completeness of \(\Omega ^{\pm }(H,H_0)\) for \(H_0=-\Delta ,\,H=-\Delta +V\) on \(L^2({\mathbb {R}}^\nu )\) if \(V \in L^{\nu /2}({\mathbb {R}}^\nu )\cap L^1({\mathbb {R}}^\nu )\), so \(\alpha >\nu \) in (13.16) leaving quite a gap from the expected \(\alpha >1\) (see the next two sections).

  (3)

    One can apply the trace class theory to changes of boundary condition. The pioneer here is Birman [55, 56]; see also Deift–Simon [109, Appendix].

  (4)

    When A and B are bounded and \(A-B\) is trace class, one can define an \(L^1({\mathbb {R}},dx)\) function, \(\xi (x)\), called the Krein spectral shift so that for f a \(C^2\) function of compact support, one has that \(f(A)-f(B)\) is trace class and

    $$\begin{aligned} {\text {Tr}}(f(A)-f(B)) = -\int f'(x)\xi (x)\,dx \end{aligned}$$
    (13.17)

    (see Simon [592, Sect. 11.4] [593] or Yafaev [699, Chap. 8] for more on the spectral shift function). Birman–Krein [61] prove the beautiful Birman–Krein formula:

    $$\begin{aligned} {{\mathrm{det}}}(S(\lambda ))=e^{-2\pi i\xi (\lambda )} \end{aligned}$$
    (13.18)

    when \(A-B\) is trace class. Here \(S=\Omega ^-(A,B)^*\Omega ^+(A,B)\) is a unitary operator on \({\mathcal H}_{ac}(B)\) which commutes with B, so according to the spectral multiplicity theory ([616, Sect. 5.4]), B has a direct integral decomposition \({\mathcal H}_{ac}(B) = \int ^\oplus _{\sigma _{ac}(B)} {\mathcal H}_\lambda \,d\lambda , \, B\restriction {\mathcal H}_{ac}(B)=\int ^\oplus _{\sigma _{ac}(B)}\lambda \,d\lambda \) and \(S=\int _{\sigma _{ac}(B)}^{\oplus } S(\lambda )\,d\lambda \) where \(S(\lambda )\) is a unitary operator on \({\mathcal H}_\lambda \). Birman–Krein prove that \(S(\lambda )-{\varvec{1}}\) is a trace class operator on \({\mathcal H}_\lambda \) and (13.18) holds where \({{\mathrm{det}}}\) is the Fredholm determinant [616, Sect. 3.10].

4 Scattering and spectral theory, II: Kato smoothness

This is the second section on spectral and scattering theory. We begin with a quick primer on spectral theory that will assume familiarity with the spectral theorem and spectral measures (see [616, Sects. 5.1 and 7.2]). For a self-adjoint operator, H, on a (complex, separable) Hilbert space, \({\mathcal H}\), the most basic questions are connected to the Lebesgue decomposition theorem [612, Theorem 4.7.3] that says that any measure, \(d\mu \) on \({\mathbb {R}}\) can be uniquely decomposed \(d\mu =d\mu _{ac}+d\mu _{sc}+d\mu _{pp}\) where \(d\mu _{pp}\) is pure point, \(d\mu _{ac}\) is dx-absolutely continuous and \(d\mu _{sc}\) has no pure points and is singular with respect to dx (so “singular continuous”). There is a corresponding decomposition \({\mathcal H}= {\mathcal H}_{ac}(H)\oplus {\mathcal H}_{sc}(H)\oplus {\mathcal H}_{pp}(H)\) where \({\mathcal H}_{y}\) is the set of those vectors, \(\varphi \), whose H-spectral measure is purely of type y.

In simple quantum mechanical systems, \({\mathcal H}_{ac}\) spectrum is often associated with scattering theory as we’ve seen, and \({\mathcal H}_{pp}\) is associated with bound states. As my advisor, Arthur Wightman, told me there is no reasonable interpretation for states in \({\mathcal H}_{sc}\) so he called the idea that \({\mathcal H}_{sc}=\{0\}\) the “no goo hypothesis”. A major concern of quantum theoretic spectral theorists in the period from 1960 to 1985, and, in particular, of Kato, was the proof that \({\mathcal H}_{sc}=\{0\}\) for two- (and N-)body quantum systems whose potentials obey (13.16) for \(\alpha > 1\).

Ironically, after Kato became less active in NRQM, it was discovered that, in some ways, singular continuous spectrum is ubiquitous. As I’ve remarked: “I seem to have spent the first part of my career proving that singular continuous spectrum never occurs and the second proving that it always does”. A key breakthrough was the discovery by Pearson [477] that sparse potentials with slow decay have purely s.c. spectrum. I explored this in a series of papers [112, 113, 292, 605–607, 629] of which a typical result concerns h on \(\ell ^2({\mathbb {Z}})\) given by \((hu)_n=u_{n+1}+u_{n-1}+b_n u_n\). Fix \(\alpha > 0\) and let \(Q_\alpha \) be the Banach space of b’s with \(\sup _n\left[ (1+|n|)^\alpha |b_n|\right] \equiv ||b||_\alpha < \infty \) and with \(|n|^\alpha |b_n| \rightarrow 0\) as \(|n| \rightarrow \infty \). Then (see [605]), if \(\alpha < 1/2\), for a dense \(G_\delta \) in \(Q_\alpha \), the associated h has purely s.c. spectrum (i.e. \({\mathcal H}_{sc}(h)={\mathcal H}\)).

A main tool in the quest to prove that \({\mathcal H}_{sc}=\{0\}\) is the fact that Stone’s formula [616, Eq. (5.7.30)]

$$\begin{aligned} \lim _{\epsilon \downarrow 0} \frac{1}{\pi }\int _{a}^{b} \text {Im}\langle \varphi ,R(x+i\epsilon )\varphi \rangle \,dx = \langle \varphi ,\tfrac{1}{2}\left[ P_{(a,b)}(H)+P_{[a,b]}(H)\right] \varphi \rangle \end{aligned}$$
(14.1)

(where \(R(z) = (H-z)^{-1}\) for \(z \in {\mathbb {C}}{\setminus }{\mathbb {R}}\)) immediately implies that for any \(p >1\), we have that

$$\begin{aligned} \sup _{0< \epsilon< 1} \int _{a}^{b} |\text {Im}\langle \varphi ,R(x+i\epsilon )\varphi \rangle |^p\,dx < \infty \Rightarrow P_{(a,b)}(H)\varphi \in {\mathcal H}_{ac}(H) \end{aligned}$$
(14.2)

Thus, the most common way of proving that \({\mathcal H}_{sing}=\{0\}\) is showing that for a dense set of \(\varphi \), and enough intervals (a, b), we have that

$$\begin{aligned} \sup _{\begin{array}{c} \epsilon >0 \\ a<x<b \end{array}} |\langle \varphi ,R(x+i\epsilon )\varphi \rangle | < \infty \end{aligned}$$

(stronger than needed, but what one often gets).

We’ll say a lot more about time-independent scattering in the next section, but we note that in some sense, the key notion of that theory is that control of \(\langle \varphi ,R(x+i\epsilon )\varphi \rangle \) as \(\epsilon \downarrow 0\) also says something about long time behavior of dynamics as seen in

$$\begin{aligned} \int _{0}^{\infty } e^{-\epsilon t}e^{it\lambda }e^{-itH}\varphi \, dt = -iR(\lambda +i\epsilon )\varphi \end{aligned}$$
(14.3)

for any \(\varphi \in {\mathcal H}\) because \(\int _{0}^{\infty } e^{-\epsilon t}e^{i(\lambda -x)t}\,dt=-i(x-\lambda -i\epsilon )^{-1}\).

We turn now to the theory of Kato smoothness which is based primarily on two papers of Kato [335, 337]. The first is the basic one with four important results: the equivalence of many conditions giving the definition, the connection to spectral analysis, the implications for existence and completeness of wave operators and, finally, a perturbation result. The second paper concerns the Putnam–Kato theorem on positive commutators.

To me, the 1951 self-adjointness paper is Kato’s most significant work (with the adiabatic theorem paper a close second), Kato’s inequality his deepest and the subject of this section his most beautiful. One of the things that is so beautiful is that there isn’t just a relation between the time-independent and time-dependent objects—there is an equivalence! Here is the set of equivalent definitions:

Theorem 14.1

(Kato [335]) Let H be a self-adjoint operator and A a closed operator. The following are all equal (\(R(\mu )=(H-\mu )^{-1}\)):

$$\begin{aligned}&\sup _{\begin{array}{c} ||\varphi ||=1\\ \epsilon >0 \end{array}} \frac{1}{4\pi ^2} \int _{-\infty }^{\infty } \left( ||AR(\lambda +i\epsilon )\varphi ||^2+||AR(\lambda -i\epsilon )\varphi ||^2\right) \,d\lambda \end{aligned}$$
(14.4)
$$\begin{aligned}&\sup _{||\varphi ||=1} \frac{1}{2\pi } \int _{-\infty }^{\infty } ||Ae^{-itH}\varphi ||^2\,dt \end{aligned}$$
(14.5)
$$\begin{aligned}&\sup _{\begin{array}{c} ||\varphi ||=1,\,\varphi \in D(A^*)\\ -\infty<a< b < \infty \end{array}} \frac{||P_{(a,b)}(H)A^*\varphi ||^2}{b-a} \end{aligned}$$
(14.6)
$$\begin{aligned}&\sup _{\begin{array}{c} \mu \notin {\mathbb {R}},\,\varphi \in D(A^*) \\ ||\varphi ||=1 \end{array}} \frac{1}{2\pi } |\langle A^*\varphi ,[R(\mu )-R(\bar{\mu })]A^*\varphi \rangle | \end{aligned}$$
(14.7)
$$\begin{aligned}&\sup _{\begin{array}{c} \mu \notin {\mathbb {R}},\,\varphi \in D(A^*) \\ ||\varphi ||=1 \end{array}} \frac{1}{\pi } ||R(\mu )A^*\varphi ||^2 \, |\text {Im}\mu | \end{aligned}$$
(14.8)

In particular, if one is finite (resp. infinite), then all are.

Remarks

  1.

    In (14.4)/(14.5), we set \(||A\psi ||=\infty \) if \(\psi \notin D(A)\), so, for example, to say that (14.5) is finite implies that for each \(\varphi \), we have that \(e^{-itH}\varphi \in D(A)\) for Lebesgue a.e. \(t \in {\mathbb {R}}\).

  2.

    If one and so all of the above quantities are finite we say that A is H-smooth. The common value of these quantities is called \(||A||_H^2\).

  3.

    The proof is not hard. If the integral in (14.5) has a factor of \(e^{-2\epsilon t}\) put inside it, the equality of the integrals in (14.4) and (14.5) follows from (14.3) and the Plancherel theorem. By monotone convergence, the \(\sup \) of the time integral with the \(e^{-2\epsilon t}\) factor is the integral without that factor.

  4.

    The equivalence of (14.7) and (14.8) is just \(R(\mu )-R(\bar{\mu })={(\mu -\bar{\mu })R(\mu )R(\bar{\mu })}\).

  5.

    If \(d\nu _{A^*\varphi }\) is the H-spectral measure for \(A^*\varphi \) (so \(\int f(\lambda )d\nu _{A^*\varphi }(\lambda ) = \langle A^*\varphi ,f(H)A^*\varphi \rangle \)), then the equivalence of (14.6) and (14.7) involves the relation of \(\frac{\epsilon }{\pi } \int \frac{d\nu (\lambda )}{(\lambda -x)^2+\epsilon ^2}\) and \(\frac{\nu ((a,b))}{b-a}\). A bound like (14.6) implies a.e. in \(d\lambda \) a bound on \(\frac{d\nu (\lambda )}{d\lambda }\). Since \( \frac{\epsilon }{\pi }\int \frac{d\lambda }{(\lambda -x)^2+\epsilon ^2} =1\), we get (14.7). Conversely (14.7) implies (14.6) via Stone’s formula.

  6.

    To see that (14.6) \(\le \) (14.4), it suffices by taking limits to consider the case where a and b are not eigenvalues of H. One writes \(P_{(a,b)}(H)\) by Stone’s formula to see that

    $$\begin{aligned} |\langle A^*\varphi ,P_{(a,b)}(H)\psi \rangle |&\le \frac{1}{2\pi }||\varphi ||\limsup _{\epsilon \downarrow 0} \int _{a}^{b} ||A[R(\lambda +i\epsilon )-R(\lambda -i\epsilon )]\psi ||\,d\lambda \\&\le ||\varphi ||\left( \int _{a}^{b} 1\, d\lambda \right) ^{1/2}\left( \frac{1}{4\pi ^2}\int _{a}^{b} \text {Integrand in (14.4)}\,d\lambda \right) ^{1/2} \end{aligned}$$

    proving that \(||P_{(a,b)}(H)A^*\varphi ||\le ||\varphi ||[\)RHS of (14.4)\(]^{1/2}|b-a|^{1/2}\).

  7.

    To see that (14.4) \(\le \) (14.7), thereby completing the proof of all the equivalences, let \(\alpha \) be the \(\sup \) in (14.7). For \(z \in {\mathbb {C}}_+\), let K(z) be the positive square root of \((2\pi i)^{-1}(R(z)-R(\bar{z}))\). Then \(||AK(z)||^2 \le \alpha \), so

    $$\begin{aligned} \text {Quantity whose }\sup \text { is taken in (14.4)}&= \int _{-\infty }^{\infty } ||AK(\lambda +i\epsilon )^2\varphi ||^2\,d\lambda \\&\le \alpha \int _{-\infty }^{\infty } ||K(\lambda +i\epsilon )\varphi ||^2 \, d\lambda \\&= \alpha ||\varphi ||^2 \end{aligned}$$

  8.

    By (14.3), if A is H-smooth, then

    $$\begin{aligned} ||AR(\lambda +i\mu )\varphi ||&\le \int _{0}^{\infty } e^{-\mu t}||Ae^{-itH}\varphi ||\,dt \\&\le \left( \int _{0}^{\infty } e^{-2\mu t}\,dt\right) ^{1/2} \left( \int _{0}^{\infty }||Ae^{-itH}\varphi ||^2\,dt\right) ^{1/2} \\&\le (2\mu )^{-1/2}(2\pi )^{1/2} ||A||_H\, ||\varphi || \end{aligned}$$

    so A H-smooth \(\Rightarrow \) A is H-bounded with relative bound zero.

  9.

    In [335], Kato states this equivalence in stages since, as the title of the paper indicates, his focus is on controlling certain non-self-adjoint operators (we focus on the self-adjoint case of greatest interest in NRQM). He first considers general H with \(\sigma (H) \subset {\mathbb {R}}\) and proves a version of Theorem 14.6 below and then (following Friedrichs [172]) constructs similarity operators using a stationary replacement for wave operators. He next adds to H a condition that it generate a group \(\{U(t)\}_{t \in {\mathbb {R}}}\) of bounded operators with \(||U(t)||=\text {O}(e^{\epsilon t})\) for all \(\epsilon >0\). Then (14.3) holds with \(e^{-itH}\) replaced by U(t) and Kato proves the equality of (14.4) and (14.5) in that case. Finally, he proves the full Theorem 14.1 when H is self-adjoint.
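
As a supplement to Remark 4, the passage from (14.7) to (14.8) can be written out. This is only a sketch; it uses \(R(\mu )^*=R(\bar{\mu })\) and the normality of \(R(\mu )\), so that \(||R(\mu )\psi ||=||R(\bar{\mu })\psi ||\). With \(\psi =A^*\varphi \),

$$\begin{aligned} \frac{1}{2\pi } |\langle \psi ,[R(\mu )-R(\bar{\mu })]\psi \rangle |&= \frac{1}{2\pi }\, |\mu -\bar{\mu }|\, |\langle \psi ,R(\mu )R(\bar{\mu })\psi \rangle | \\&= \frac{1}{2\pi }\, 2|\text {Im}\mu |\, \langle R(\bar{\mu })\psi ,R(\bar{\mu })\psi \rangle \\&= \frac{1}{\pi }\, ||R(\mu )\psi ||^2\, |\text {Im}\mu | \end{aligned}$$

so the two quantities agree for each fixed \(\mu \) and \(\varphi \), and hence so do their suprema.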

Example 14.2

Let \(H = -i\frac{d}{dx}\) on \(L^2({\mathbb {R}})\) and let A be multiplication by f(x). Since \(e^{-itH}\varphi (x) = \varphi (x-t)\), we compute

$$\begin{aligned} \int _{-\infty }^{\infty } ||Ae^{-itH}\varphi ||^2\,dt&= \int _{{\mathbb {R}}^2} |f(x)|^2|\varphi (x-t)|^2 \,dx\,dt \\&= ||f||_2^2||\varphi ||_2^2 \end{aligned}$$

so, if \(f \in L^2({\mathbb {R}})\), then A is H-smooth.
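
The computation in this example is Tonelli’s theorem plus translation invariance of Lebesgue measure. As a sketch (for complex-valued f and \(\varphi \), the squares in the integrand are read as squared absolute values):

$$\begin{aligned} \int _{{\mathbb {R}}^2} |f(x)|^2|\varphi (x-t)|^2 \,dx\,dt = \int _{{\mathbb {R}}} |f(x)|^2 \left( \int _{{\mathbb {R}}} |\varphi (x-t)|^2\,dt\right) dx = ||f||_2^2||\varphi ||_2^2 \end{aligned}$$

In the normalization of (14.5), this gives \(||A||_H^2=(2\pi )^{-1}||f||_2^2\).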

Example 14.3

If \(H_0\) is \(-\Delta \) on \(L^2({\mathbb {R}}^3)\), it is known [612, (6.9.48)] that \((H_0+\kappa ^2)^{-1}\) with \(\text {Re}\,\kappa >0\) has integral kernel \(\frac{1}{4\pi |x-y|}e^{-\kappa |x-y|}\). Suppose that

$$\begin{aligned} \frac{1}{4\pi }\int \frac{|V(x)|\,|V(y)|}{|x-y|^2}\,d^3x\,d^3y \equiv ||V||_R^2 < \infty \end{aligned}$$

called the Rollnik class in Simon [570] after Rollnik [526]. Then the Hilbert–Schmidt norm \(|||V|^{1/2}(H_0+\kappa ^2)^{-1}|V|^{1/2}||_{HS} \le ||V||_R\), so, by (14.7) \(|V|^{1/2}\) is \(H_0\)-smooth with \(|||V|^{1/2}||_{H_0} \le \pi ^{-1}||V||_R^{1/2}\). If \(V \in L^{3/2}({\mathbb {R}}^3)\), the HLS inequality ([615, Thm 6.2.1], [161, 426]) implies that V is Rollnik.

Smoothness has an immediate consequence for the spectral type of H:

Theorem 14.4

(Kato [335]) Let H be a self-adjoint operator and let A be H-smooth. Then \({\text {ran}}(A^*) \subset {\mathcal H}_{ac}(H)\). In particular, if \(\ker (A)=\{0\}\), then H has purely a.c. spectrum.

The proof is very easy. If \(d\nu \) is the H-spectral measure for \(A^*\varphi \), then (14.6) says that

$$\begin{aligned} \nu (I) \le ||A||_H ||\varphi ||^2 |I| \end{aligned}$$
(14.9)

(where \(|\cdot |\) is Lebesgue measure) for open intervals, I. By taking unions and using outer regularity, (14.9) holds for all sets, so \(\nu \) is absolutely continuous.

Smoothness also implies existence and completeness of wave operators.

Theorem 14.5

(Kato [335]) Let \(H, H_0\) be two self-adjoint operators. Let A, B be closed operators so that A is H-smooth and B is \(H_0\)-smooth and so that

$$\begin{aligned} H-H_0 = A^*B \end{aligned}$$
(14.10)

in the sense that for \(\psi \in D(H)\) and \(\varphi \in D(H_0)\), we have that

$$\begin{aligned} \langle H\psi ,\varphi \rangle - \langle \psi ,H_0\varphi \rangle = \langle A\psi ,B\varphi \rangle \end{aligned}$$
(14.11)

Then \(\Omega ^{\pm }(H,H_0)\) exist and are complete.

Remarks

  1.

    Since smoothness implies relative boundedness, if \(\psi \in D(H)\) and \(\varphi \in D(H_0)\), then the right side of (14.11) makes sense.

  2.

    In some applications, one assumes that \(H-H_0=\sum _{j=1}^{n}A_j^*B_j\) with each \(A_j\) H-smooth and each \(B_j\) \(H_0\)-smooth. The proof in Remark 3 extends to this case or, alternatively, one can define smoothness for closed operators, A, from \({\mathcal H}\), the space on which H is defined, to \({\mathcal K}\), a perhaps distinct Hilbert space, and then pick \({\mathcal K}=\oplus _{j=1}^n {\mathcal H}, \, B = \oplus _{j=1}^n B_j,\, A=\oplus _{j=1}^n A_j\) so \(A^*B = \sum _{j=1}^{n} A_j^* B_j\).

  3.

    The proof is again easy (indeed, one of the beauties of Kato smoothness theory is how much one gets with simple proofs). If \(\psi \in D(H)\) and \(\varphi \in D(H_0)\), \(W(t) = e^{+itH}e^{-itH_0}\), then for \(s < t\),

    $$\begin{aligned} |\langle \psi ,(W(t)-W(s))\varphi \rangle |&= \left| \int _{s}^{t} \langle Ae^{-iuH}\psi ,Be^{-iuH_0}\varphi \rangle \, du \right| \\&\le \left( \int _{-\infty }^{\infty } ||Ae^{-iuH}\psi ||^2\,du\right) ^{1/2} \left( \int _{s}^{t} ||Be^{-iuH_0}\varphi ||^2\,du\right) ^{1/2} \\&\le \sqrt{2\pi } ||A||_H ||\psi || \left( \int _{s}^{t} ||Be^{-iuH_0}\varphi ||^2\,du\right) ^{1/2} \end{aligned}$$

    so

    $$\begin{aligned} ||(W(t)-W(s))\varphi || \le \sqrt{2\pi } ||A||_H \left( \int _{s}^{t} ||Be^{-iuH_0}\varphi ||^2\,du\right) ^{1/2} \end{aligned}$$
    (14.12)

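    and, since B is \(H_0\)-smooth, the right side goes to zero as \(s,t \rightarrow \infty \) or as \(s,t \rightarrow -\infty \), so \(W(t)\varphi \) is Cauchy. Therefore, \(\Omega ^{\pm }(H,H_0)\) exist. Since \(H_0-H = -B^*A\), we conclude that they are also complete by Theorem 13.1.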

We say that a closed operator, A, is H-supersmooth if and only if

$$\begin{aligned} ||A||_{H,SS}^2 \equiv \sup _{z \in {\mathbb {C}}{\setminus }{\mathbb {R}}} ||A(H-z)^{-1}A^*|| < \infty \end{aligned}$$
(14.13)

The notion is in Kato [335] and the name is from Kato–Yajima [365] in 1989. The name hasn’t stuck but I like it, so I’ll use it. The fourth important result in Kato [335] is

Theorem 14.6

(Kato [335]) Let \(H_0\) be a self-adjoint operator. Let A be \(H_0\)-supersmooth and C a bounded self-adjoint operator so that

$$\begin{aligned} \alpha \equiv ||C||||A||_{H_0,SS}^2 < 1 \end{aligned}$$
(14.14)

Let \(B=A^*CA\). Then B is relatively form bounded with relative form bound at most \(\alpha \). If \(H=H_0+B\), then A is also H-supersmooth with

$$\begin{aligned} ||A||_{H,SS} \le ||A||_{H_0,SS}(1-\alpha )^{-1/2} \end{aligned}$$
(14.15)

In particular, \(\Omega ^{\pm }(H,H_0)\) exist and are complete.

Remarks

  1.

    Once again, the proofs are simple. The key is a formal geometric series:

    $$\begin{aligned} A(H-z)^{-1}A^*&= A(H_0-z)^{-1}A^* \nonumber \\&\quad + \sum _{j=0}^{\infty } (-1)^{j+1} A(H_0-z)^{-1}A^* \left[ C A(H_0-z)^{-1}A^*\right] ^j C A(H_0-z)^{-1}A^* \end{aligned}$$
    (14.16)

    One proves the form boundedness and uses that to justify a formula like (14.16) but with an error term. Since \(||C A(H_0-z)^{-1}A^* || \le \alpha \), the error goes to zero and the series converges. The final assertion then comes from Theorem 14.5.

  2.

    By the same analysis, the analog of Remark 2 after Theorem 14.5 holds. If \(H = H_0+\sum _{j=1}^{n} A_j^*B_j\) and \(\gamma _{jk} = \sup _{z \in {\mathbb {C}}{\setminus }{\mathbb {R}}} ||B_j(H_0-z)^{-1}A^*_k||\) is finite and \(\Gamma = \{\gamma _{jk}\}_{1 \le j,k \le n}\) is a matrix of norm \(\alpha < 1\), and if each \(A_j\) and \(B_j\) is supersmooth, then \(\Omega ^{\pm }(H,H_0)\) exist and are complete.

  3.

    We repeat that in [335], Kato considers cases where \(H_0\) and C need not be self-adjoint. He assumes that \(\sigma (H_0) \subset {\mathbb {R}}\) and \({||C||\sup _z || A(H_0-z)^{-1}A^* || < 1}\) and then defines an operator H which is formally \(H_0+A^*CA\) with a resolvent that obeys (14.16). He then uses ideas going back to Friedrichs [172] to define (in terms of resolvents, not time limits) invertible operators \(W^{\pm }\) so that \(W^{\pm } H_0 (W^{\pm })^{-1} = H\).
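For the reader's convenience, here is a sketch of how the bound (14.15) comes out of the series (14.16), taking the convergence discussed in Remark 1 for granted. The abbreviation \(F(z)\) is ours and is not notation from [335].

$$\begin{aligned} F(z)&\equiv A(H_0-z)^{-1}A^*, \qquad ||CF(z)|| \le ||C||\,||A||_{H_0,SS}^2 = \alpha \\ ||A(H-z)^{-1}A^*||&\le ||F(z)||\left( 1+\sum _{j=0}^{\infty } \alpha ^{j+1}\right) = \frac{||F(z)||}{1-\alpha } \le \frac{||A||_{H_0,SS}^2}{1-\alpha } \end{aligned}$$

Taking the sup over \(z \in {\mathbb {C}}{\setminus }{\mathbb {R}}\) and then square roots gives (14.15).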

That completes our discussion of [335]. The main result of [337] is

Theorem 14.7

(Putnam–Kato Theorem [337, 487]) Let A and B be bounded self-adjoint operators so that \(D \equiv i[A,B]\) is strictly positive in the sense that for all \(\varphi \ne 0\), we have that

$$\begin{aligned} \langle \varphi ,D\varphi \rangle > 0 \end{aligned}$$
(14.17)

Then A and B have purely a.c. spectrum.

Remarks

  1.

    The result is due to Putnam. Kato found the really simple proof in the next remark.

  2.

    The proof is easy. Let C be the positive square root of \(D=i[A,B]\). Then \(\frac{d}{dt}\langle e^{-itA}\varphi ,Be^{-itA}\varphi \rangle = ||Ce^{-itA}\varphi ||^2\), so the integral of \(||Ce^{-itA}\varphi ||^2\) from s to t is bounded by \(2||B||\,||\varphi ||^2\). Thus C is A-smooth and, by Theorem 14.4, A has only a.c. spectrum on the closure of \({\text {ran}}(C)\), which is all of \({\mathcal H}\) since (14.17) implies that \(\ker (C)=\{0\}\). Interchanging the roles of A and B (and replacing t by \(-t\)) gives the result for B.
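The derivative identity behind Kato's proof can be written out in one line. In this sketch we use the shorthand \(\psi _t = e^{-itA}\varphi \) (our notation) and the convention that the inner product is antilinear in its first slot:

$$\begin{aligned} \frac{d}{dt}\langle \psi _t,B\psi _t\rangle&= \langle -iA\psi _t,B\psi _t\rangle + \langle \psi _t,B(-iA)\psi _t\rangle \\&= i\langle \psi _t,(AB-BA)\psi _t\rangle = \langle \psi _t,C^2\psi _t\rangle = ||C\psi _t||^2 \end{aligned}$$

Integrating from s to t and using \(|\langle \psi _u,B\psi _u\rangle | \le ||B||\,||\varphi ||^2\) at both endpoints gives a bound on \(\int _s^t ||C\psi _u||^2\,du\) that is uniform in s and t.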

Example 14.8

(Weak coupling 2-body) In [335], Kato applied smoothness ideas to Schrödinger operators. If \(\nu =3\) and \(V \in L^{3/2}\) (indeed, if V is Rollnik), then, as we’ve seen in Example 14.3, \(|V|^{1/2}\) is \(-\Delta \)-supersmooth, so for small real \(\lambda \), the wave operators, \(\Omega ^{\pm }(-\Delta +\lambda V, -\Delta )\) exist and are unitary. On \((0,\infty )\), if \(h_0=-\tfrac{d^2}{dx^2}\) with \(u(0)=0\), then \((h_0-z)^{-1}\) has an integral kernel dominated by \(\min (x,y)\) (see [616, (7.9.53)]) for all \(z \in {\mathbb {C}}{\setminus } {\mathbb {R}}\), so if \(\int _{0}^{\infty } x|V(x)| dx < \infty \), then \(|V|^{1/2}\) is \(h_0\)-supersmooth and one knows that, for \(\lambda \) small, \(\Omega ^{\pm }(h_0+\lambda V,h_0)\) exist and are unitary.

One knows that if \(\nu =1\) or 2 and \(V \in C_0^\infty ({\mathbb {R}}^\nu ); V \not \equiv 0\), then for all \(\lambda \ne 0\), either \(-\Delta +\lambda V\) or \(-\Delta -\lambda V\) (or both) have a negative energy bound state [581] so there cannot be \(-\Delta \)-supersmoothness.

By interpolating between \(||e^{it\Delta }\varphi ||_\infty \le (4\pi t)^{-\nu /2}||\varphi ||_1\) and \(||e^{it\Delta }\varphi ||_2 = ||\varphi ||_2\), Kato [335] showed that if \(\nu \ge 4\) and \(V \in L^{\nu /2+\epsilon } \cap L^{\nu /2-\epsilon }\), then \(|V|^{1/2}\) is \(-\Delta \)-supersmooth and he conjectured that this held for \(\epsilon =0\). Indeed, the next theorem is true.

Theorem 14.9

Let \(\nu \ge 3\) and \(V \in L^{\nu /2}({\mathbb {R}}^\nu )\). Then \(|V|^{1/2}\) is \(-\Delta \)-supersmooth. In particular, for \(|\lambda |\) small and \(H = -\Delta +\lambda V, H_0=-\Delta \), we have that \(\Omega ^{\pm }(H,H_0)\) exist and are unitary so that H has purely a.c. spectrum.

Remarks

  1.

    This result appeared in Kato–Yajima [365]. As they added in a “Note added in proof”, shortly before their paper, Kenig et al. [366] proved estimates that imply Theorem 14.9.

  2.

    In [272], Iorio–O’Carroll used supersmoothness to show N-body systems with weak coupling (and \(\nu \ge 3\)) have unitary wave operators (so no bound states, no non-trivial scattering channels and purely a.c. spectrum). They required that the two body potentials lie in \(L^{\nu /2+\epsilon } \cap L^{\nu /2-\epsilon }\), but given Theorem 14.9, their method works for two body potentials in \(L^{\nu /2}\).

Kato–Yajima [365] also proved that \((1+|x|^2)^{-1/2}(1-\Delta )^{1/4}\) is \(-\Delta \)-supersmooth (which says something about \(V(x) = |x|^{-2}\) on \(L^2({\mathbb {R}}^\nu ); \nu \ge 3\)). Further developments are due to Ben-Artzi–Klainerman [47] and Simon [604]. In particular, Simon obtained optimal constants in the associated smoothness estimates; for \(\nu \ge 3\)

$$\begin{aligned}&\int _{-\infty }^{\infty } ||(x^2+1)^{-1/2}(-\Delta )^{1/4}e^{it\Delta }\varphi ||^2\,dt \le \frac{\pi }{2} ||\varphi ||^2 \end{aligned}$$
(14.18)
$$\begin{aligned}&\int _{-\infty }^{\infty } |||x|^{-1} e^{it\Delta }\varphi ||^2\,dt \le \frac{\pi }{\nu -2} ||\varphi ||^2 \end{aligned}$$
(14.19)

Next, having completed our discussion of Kato’s contributions to smoothness, we turn to some applications beginning with repulsive potentials. In this (and other) regards, it is useful to have the notion of local smoothness due to Lavine [415]. Let \(\Omega \subset {\mathbb {R}}\) be a bounded Borel set. We say that A is locally H-smooth on \(\Omega \) if \(AP_\Omega (H)\) is H-smooth (where \(P_X(H)\) is a spectral projection for H and set X [616, Sect. 5.1]). It is easy to see [497, Theorem XIII.30] that if A is an operator with \(D(H) \subset D(A)\) and either \(\sup _{0< {\pm }\epsilon< 1; \lambda \in \Omega } |\epsilon | \, ||AR(\lambda +i\epsilon )||^2< \infty \) or \(\sup _{0< \epsilon< 1; \lambda \in \Omega } ||AR(\lambda +i\epsilon )A^*|| < \infty \), then A is locally H-smooth on \(\Omega \). It is also obvious that if \({\text {ran}}(A^*)\) is dense and A is locally H-smooth on \(\Omega \), then the spectrum of H on \(\Omega \) is purely absolutely continuous. The following is what makes local H-smoothness so useful:

Theorem 14.10

(Lavine [415]) Let H and \(H_0\) be self-adjoint and \(\Omega \subset {\mathbb {R}}\) a bounded open set. Suppose that \(H=H_0+A^*B\) where B is \(H_0\)-bounded and locally \(H_0\)-smooth on \(\Omega \) and A is H-bounded and locally H-smooth on \(\Omega \). Then

$$\begin{aligned} \Omega ^{\pm }(H,H_0;P_\Omega (H_0)) = {\text {s--}\lim }_{t \rightarrow {\mp } \infty } e^{itH}e^{-itH_0} P_\Omega (H_0) \end{aligned}$$
(14.20)

exist and have range \(P_\Omega (H)\).

Remarks

  1.

    For complete proofs, see [415] or [497, Theorem XIII.31].

  2.

    The same proof as Theorem 14.5 shows that \({\text {s--}\lim }_{t \rightarrow {\mp } \infty } P_\Omega (H) e^{itH}e^{-itH_0} P_\Omega (H_0)\) exists.

  3.

    Since \(B e^{-itH_0} P_\Omega (H_0)(H_0-z)^{-1}\varphi \) is in \(L^2\) with an \(L^2\) derivative, we conclude that for any \(z \in {\mathbb {C}}{\setminus }{\mathbb {R}}\)

    $$\begin{aligned} {\text {s--}\lim }_{t \rightarrow {\mp } \infty } Be^{-itH_0} P_\Omega (H_0)(H_0-z)^{-1} = 0 \end{aligned}$$
  4.

    Writing \((H-z)^{-1}-(H_0-z)^{-1} = -\left[ A(H-\bar{z})^{-1}\right] ^*B(H_0-z)^{-1}\) and using the assumed boundedness of \(A(H-\bar{z})^{-1}\), we conclude by Remark 3 that \({\text {s--}\lim }_{t \rightarrow {\mp } \infty } \left[ (H-z)^{-1}-(H_0-z)^{-1}\right] e^{-itH_0} P_\Omega (H_0)=0\) and then by the Stone–Weierstrass gavotte [101, Appendix to Chapter 3] that \({\text {s--}\lim }_{t \rightarrow {\mp } \infty } \left[ f(H)-f(H_0)\right] e^{-itH_0} P_\Omega (H_0)=0\) for any continuous function, f, so that \(1-f\) has compact support. Using this, one sees that if \(I \subset \Omega \) is a compact set with \({\text {dist}}(I,{\mathbb {R}}{\setminus }\Omega ) > 0\), then \(\text {s--}\lim _{t \rightarrow {\mp } \infty } P_{{\mathbb {R}}{\setminus }\Omega }(H) e^{itH}e^{-itH_0} P_I(H_0) = 0\). This implies that the limits in (14.20) exist and that \({\text {ran}}\,\Omega ^{\pm }(H,H_0;P_\Omega (H_0)) \subset {\text {ran}}\, P_\Omega (H)\). This plus symmetry between H and \(H_0\) plus the idea behind Theorem 13.1 imply that \({\text {ran}}\,\Omega ^{\pm }(H,H_0;P_\Omega (H_0)) = {\text {ran}}\, P_\Omega (H)\).

A potential, V, on \({\mathbb {R}}^\nu \) is called repulsive if and only if \({\mathbf {x}}\cdot \mathbf {\nabla }V \le 0\) (e.g. \(V(x) = (1+|x|)^{-\alpha }\), any \(\alpha >0\)). If \(V(x) \rightarrow 0\) at infinity, then \(V(x) \ge 0\). If \({A=\tfrac{i}{2}({\mathbf {x}}\cdot \mathbf {\nabla }+\mathbf {\nabla }\cdot {\mathbf {x}})}\) is the generator of dilations and V is repulsive, then \(i[A,H_0+V]=2H_0- {\mathbf {x}}\cdot \mathbf {\nabla }V \ge 0\). One cannot use the Putnam–Kato theorem since neither A nor H is bounded. If you look at the above proof of the Putnam–Kato theorem, that H is unbounded isn’t a problem if our goal is to find a C which is H-smooth. But the unbounded A is. Lavine’s idea was to cutoff \({\mathbf {x}}\) in the definition of A and get an \(\tilde{A}\) which is H-bounded and so that \(i[\tilde{A},H] \ge c(1+|x|^2)^{-\beta }\) for suitable \(\beta \) and as in the Putnam–Kato argument, get that \({(1+|x|^2)^{-\beta /2}(H+1)^{-1}}\) is H-smooth. In this way (he used local smoothness to get wave operators), Lavine proved

Theorem 14.11

(Lavine [413,414,415,416]) Let H be an N-body Hamiltonian with center of mass removed on \(L^2({\mathbb {R}}^{(N-1)\nu })\) whose two body potentials \(V_{ij}\) lie in \(L^p({\mathbb {R}}^\nu )+L^\infty ({\mathbb {R}}^\nu )\) (with p \(\nu \)-canonical) and are repulsive. Then H has purely absolutely continuous spectrum. If moreover, for some \(\beta > 5/2\), we have that \(|V_{ij}(x)| \le C(1+|x|)^{-\beta }\), then \(\Omega ^{\pm }(H,H_0)\) exist and are complete.

Remark

The value 5/2 is an artifact of the proof and when the \(V_{ij}\) are spherically symmetric, it has been improved to \(\beta > 1\) in Lavine [416].

Our final major topic concerns ideas of Vakulenko [660]; the reader should first look at the discussion around Eq. (12.6) for definitions of Vakulenko bounding function and Vakulenko potential.

Lemma 14.12

(Vakulenko [660]) Let H be self-adjoint and A a closed H-bounded operator. Let [a, b] be a bounded closed interval in \({\mathbb {R}}\) and B a closed operator with \(D(H) \subset D(B)\) so that for all \(\varphi \in D(H)\) and \(\lambda \in [a,b]\), we have that

$$\begin{aligned} \text {Re} \langle (H-\lambda )\varphi ,A\varphi \rangle \ge ||B\varphi ||^2 \end{aligned}$$
(14.21)

Then B is locally H-smooth on [a, b].

Remarks

  1.

    As a preliminary, we note that since \(\frac{|x-\lambda |}{|x-(\lambda +i\epsilon )|} \le 1\), we have that

    $$\begin{aligned} ||(H-\lambda )R(\lambda +i\epsilon )|| \le 1 \end{aligned}$$
    (14.22)
  2.

    As a second preliminary, if

    $$\begin{aligned} ||A\varphi || \le \alpha ||H\varphi || + \beta ||\varphi || \end{aligned}$$
    (14.23)

    then

    $$\begin{aligned} ||AR(\lambda +i\epsilon )\psi ||&\le \alpha ||[(H-\lambda )+\lambda ]R(\lambda +i\epsilon )\psi || + \beta ||R(\lambda +i\epsilon )\psi || \nonumber \\&\le (\alpha +\alpha |\lambda |\epsilon ^{-1}+\beta \epsilon ^{-1})||\psi || \end{aligned}$$
    (14.24)
  3.

    Letting \(\varphi =R(\lambda +i\epsilon )\psi \) in (14.21), we see that

    $$\begin{aligned} ||BR(\lambda +i\epsilon )\psi ||^2&\le ||(H-\lambda )R(\lambda +i\epsilon )||||AR(\lambda +i\epsilon )||||\psi ||^2 \nonumber \\&\le C\epsilon ^{-1} ||\psi ||^2 \end{aligned}$$
    (14.25)

    (by (14.22)/(14.24)) for \(0< \epsilon < 1\) and all \(\lambda \in [a,b]\) where C is a constant depending on \(\alpha , \beta , a\) and b. This implies local smoothness by the discussion prior to Theorem 14.10.

  4.

    Vakulenko’s A is close to i times a cutoff dilation generator, so the left side of (14.21) is like an expectation of a commutator and thus this is a variant of a Mourre estimate but unlike the Mourre estimate, there is no (compact) error term.

In Theorem 12.2, we stated a bound of the form (14.21) which immediately implies (given the lemma)

Theorem 14.13

(Vakulenko [660]) Let V(x) be a Vakulenko potential with (12.7) for some Vakulenko bounding function \(\eta \). Then \(\sqrt{\eta }\) is \(-\Delta +V\) locally smooth on \((0,\infty )\). In particular, the spectrum of \(-\Delta +V\) is purely absolutely continuous on \((0,\infty )\) and the wave operators exist and are complete.

Remarks

  1.

    Since \(\eta \) is everywhere non-vanishing, \({\text {ran}}\, \sqrt{\eta }\) is dense and this implies the absolute continuity on \((0,\infty )\).

  2.

    \(\sqrt{\eta }\) is locally smooth for both \(-\Delta +V\) and \(-\Delta \) (since the zero potential is a Vakulenko potential with bounding function \(\eta \)). Since \(|V|^{1/2} \le \sqrt{\eta }\), we see that \(|V|^{1/2}\) is locally smooth which implies that wave operators exist and are complete.

  3.

    The proof of Theorem 12.2 is particularly easy when \(\nu =1\). Fix \(\lambda _0 > 0\) and let

    $$\begin{aligned} \omega (x) = \exp \left[ \frac{2}{\sqrt{\lambda _0}}\int _{-\infty }^{x} \eta (y) \, dy\right] \end{aligned}$$
    (14.26)

    and

    $$\begin{aligned} A = 2\omega \frac{d}{dx} \end{aligned}$$
    (14.27)

    Since \(\eta \in L^1({\mathbb {R}})\), \(\omega \) is bounded so since \(\frac{d}{dx}(-\Delta +V+i)^{-1}\) is bounded, we see that A is H-bounded. It is easy to see (since \(\eta \) and V are real) that it suffices to prove (14.21) when \(\varphi \) is real in which case:

    $$\begin{aligned} \langle (H-\lambda )\varphi ,A\varphi \rangle = \int _{-\infty }^{\infty } \left[ \omega '\left[ (\varphi ')^2+\lambda (\varphi )^2\right] +2\omega V\varphi \varphi '\right] \, dx \end{aligned}$$
    (14.28)

    which we get by integration by parts in

    $$\begin{aligned} 2\int _{-\infty }^{\infty } (-\varphi ''-\lambda \varphi )\omega \varphi '\,dx = -\int _{-\infty }^{\infty } \omega \left[ (\varphi ')^2+\lambda (\varphi )^2\right] '\,dx \end{aligned}$$

    Since \(0 \le (|\varphi '|-\sqrt{\lambda }|\varphi |)^2 = (\varphi ')^2+\lambda (\varphi )^2-2\sqrt{\lambda }|\varphi '||\varphi |\), we have that \(2|\varphi \varphi '| \le \lambda ^{-1/2}\left[ (\varphi ')^2+\lambda (\varphi )^2\right] \), so we see that

    $$\begin{aligned} \text {RHS of (14.28)} \ge \int _{-\infty }^{\infty } \left( \omega '-\frac{|V(x)|}{\sqrt{\lambda }} \omega \right) \left[ (\varphi ')^2+\lambda (\varphi )^2\right] \,dx \end{aligned}$$
    (14.29)

    By construction of \(\omega \), \(|V| \le \eta \), \(\omega \ge 1\) and \(\lambda > \lambda _0\), we have that

    $$\begin{aligned} \omega '-\frac{|V(x)|}{\sqrt{\lambda }} \omega \ge \frac{1}{\sqrt{\lambda _0}}\omega \eta \ge \frac{\eta }{\sqrt{\lambda _0}} \end{aligned}$$
    (14.30)

    Thus

    $$\begin{aligned} \text {RHS of (14.29)}&\ge \int _{-\infty }^{\infty } \frac{\lambda }{\sqrt{\lambda _0}} \eta (x)(\varphi )^2\,dx \nonumber \\&\ge \sqrt{\lambda _0} ||\sqrt{\eta }\varphi ||^2 \end{aligned}$$
    (14.31)

    which is (14.21). The higher dimensional case needs a carefully constructed \(\omega \) but is along similar lines.

  4.

    Since \(\eta (x) = (1+|x|)^{-\alpha }, \alpha > 1\) is a Vakulenko bounding function, we get the Corollary below.

Corollary 14.14

If

$$\begin{aligned} |V(x)| \le C(1+|x|)^{-\alpha } \end{aligned}$$
(14.32)

for some \(\alpha > 1\), then \(H=-\Delta + V\) has purely a.c. spectrum on \((0,\infty )\) and with \(H_0=-\Delta \), \(\Omega ^{\pm }(H,H_0)\) exist and are complete.
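The pointwise inequality (14.30) used in the one-dimensional argument (Remark 3 after Theorem 14.13) can be sanity-checked numerically. This is only an illustrative sketch: the bounding function \(\eta (x) = (1+|x|)^{-2}\), the oscillating potential \(V = \eta \cos (3x)\), the value \(\lambda _0 = 1\) and the sample grid are all our own hypothetical choices, made so that the integral in (14.26) has a closed form.

```python
import math

LAMBDA0 = 1.0  # hypothetical threshold lambda_0 > 0

def eta(x):
    # bounding function (1 + |x|)^(-alpha) with the hypothetical choice alpha = 2
    return (1.0 + abs(x)) ** -2

def eta_integral(x):
    # closed form of the integral of eta over (-infinity, x]
    return 1.0 / (1.0 - x) if x <= 0 else 2.0 - 1.0 / (1.0 + x)

def omega(x):
    # the weight of (14.26)
    return math.exp(2.0 / math.sqrt(LAMBDA0) * eta_integral(x))

def omega_prime(x):
    # derivative of omega: (2 / sqrt(lambda_0)) * eta * omega
    return 2.0 / math.sqrt(LAMBDA0) * eta(x) * omega(x)

def V(x):
    # hypothetical potential with |V| <= eta
    return eta(x) * math.cos(3.0 * x)

def check_14_30(lam, xs):
    # verify both inequalities in (14.30) at each sample point
    ok = True
    for x in xs:
        left = omega_prime(x) - abs(V(x)) / math.sqrt(lam) * omega(x)
        middle = eta(x) * omega(x) / math.sqrt(LAMBDA0)
        right = eta(x) / math.sqrt(LAMBDA0)
        ok = ok and left >= middle - 1e-12 and middle >= right - 1e-12
    return ok

xs = [-50.0 + 0.1 * i for i in range(1001)]
results = {lam: check_14_30(lam, xs) for lam in (1.0, 2.5, 10.0)}
print(results)
```

The check uses only \(|V| \le \eta \), \(\omega \ge 1\) and \(\lambda \ge \lambda _0\), which is all that enters (14.30); it says nothing about the higher dimensional construction.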

Thus Vakulenko obtained a new and beautiful proof of an Agmon–Kato–Kuroda type theorem of the kind we discuss in the next section (albeit 15 years after their work). Unlike their method, this one seems to require pointwise bounds and doesn’t allow for local singularities.

Yafaev [701] has an approach to long range 2-body scattering that exploits some ideas from the theory of smooth perturbations.

We note that the earliest proofs of N-body asymptotic completeness for \(\text {O}(|x|^{-1-\epsilon })\) potentials (at least when \(N \ge 4\)) were by Sigal–Soffer [564, 565] and then by Graf [196] and Dereziński [115]. Dereziński [115] and Sigal and Soffer [565] have results on long range potentials. In [700], Yafaev found a proof that exploits smoothness ideas (as well as some of the tools of the earlier approaches: Mourre estimates [177, 448, 482], Deift–Simon wave operators [110], Enss type phase space analysis [139, 140]). Kato never considered N-body scattering, which is quite involved, so we refer the reader to Yafaev’s original paper [700] or lecture notes [703] for details.

5 Scattering and spectral theory, III: Kato–Kuroda theory

This is the third section on spectral and scattering theory; it focuses on stationary, aka time-independent, methods. As with the prior two sections, we’ll include an overview portion but we want to begin by describing the problem we’ll discuss and the contributions of Agmon, Kato and Kuroda. While it is significant that local singularities can be accommodated, we’ll mainly discuss the case (13.16), i.e.

$$\begin{aligned} |V(x)| \le C(1+|x|)^{-\alpha } \end{aligned}$$
(15.1)

We consider \(H_0=-\Delta , H=-\Delta +V(x)\) on \(L^2({\mathbb {R}}^\nu ,d^\nu x)\). These are the questions that will concern us:

  1. (A)

    Existence and Completeness of \(\Omega ^{\pm }(H,H_0)\)

  2. (B)

    Absence of singular continuous spectrum

As a sidelight of the methods, one also gets continuum eigenfunction expansions of a type I will discuss below. There is also the issue of positive eigenvalues which except for the work of Vakulenko (as discussed in Sects. 12 and 14) was studied using very different methods from those used in this section; see Sect. 12.

As we explained in Sect. 13, it follows from Cook’s method that \(\Omega ^{\pm }(H,H_0)\) exist if \(\alpha >1\) while they may not if \(\alpha \le 1\). It is known (see Sect. 20) that (B) can fail if \(\alpha < 1\) (although this was not known in the 1970s), so in the 15 years after 1957, a lot of effort was made on studying problems (A) and (B) when \(\alpha >1\). We’ll say a lot more about the detailed history later but start with the best results of Kato–Kuroda on the subject and on the optimal result.

In 1969, Kato [338] using, in part, ideas of Kato–Kuroda (of which we’ll say a lot more below) proved

Theorem 15.1

(Kato [338]) Let V obey (15.1) and \(H, H_0\) as above. Then

  1. (a)

    If \(\alpha >1\), the wave operators exist and are complete.

  2. (b)

    If \(\alpha > 5/4\), H has no singular continuous spectrum.

In 1970, Agmon [3] announced:

Theorem 15.2

(Agmon [3, 4]) Let V obey (15.1) and H as above. If \(\alpha > 1\), H has no singular continuous spectrum.

While Agmon did not discuss scattering in his announcement, Lavine [417] noted that his estimates and Lavine’s theory of local smoothness implied existence and completeness of wave operators (and later, both Agmon and Hörmander presented other approaches to get completeness). We also note that as discussed, for example, in [497, Sect. XIII.8], one can accommodate local singularities; in place of assuming \((1+|x|)^\alpha V(x)\) is bounded, one need only assume that it is a relatively compact perturbation of \(-\Delta \).

Agmon was able to go from 5/4 to 1 by an astute observation (Step 8 in the scheme at the end of the chapter). By using the same idea, Kuroda could extend the Kato–Kuroda argument to all \(\alpha >1\). Later we’ll say more about work of others on these problems.

Our goal in this section is to explain the machinery behind certain proofs of Theorems 15.1 and 15.2. We begin with some general overview of the stationary approach to scattering. The earliest mathematical approach to stationary scattering is in Friedrichs [172] but we will focus on a slightly later one of Povzner [485] in 1953 and Kato’s student, Ikebe [265], in 1960 that discusses eigenfunction expansion. Their expansions are to be distinguished from what [600] calls BGK expansions after Berezanskii, Browder, Gårding, Gel’fand and Kac (see references in [600]). The BGK expansion is essentially a variant of the spectral theorem when an operator A on \(L^2({\mathbb {R}}^\nu ,d^\nu x)\) has local trace class properties (i.e. \(f(x)P_{[a,b]}(A)f(x)\) is trace class for \(f \in C^\infty _0({\mathbb {R}}^\nu )\)). This expansion is stated in terms of the spectral measures and so has no implications for the spectral properties. The advantage of BGK expansions is that they are always applicable for Schrödinger operators (see [600]) while the Povzner–Ikebe expansion only works in special situations, but when it does, it provides a lot of additional information.

The IP expansion of Povzner–Ikebe involves not spectral measures but \(d^\nu k\) which is why it has important spectral consequences. The model is the Fourier transform for \(H_0=-\Delta \) which in this introductory discussion we’ll denote as \(\hat{f}_0\) (since we’ll use \(\hat{f}\) for something else) defined on \({\mathbb {R}}^\nu \) by

$$\begin{aligned} \hat{f}_0({\mathbf {k}})= & {} (2\pi )^{-\nu /2} \int \overline{\varphi _0({\mathbf {x}},{\mathbf {k}})} f({\mathbf {x}}) \, d^\nu x \end{aligned}$$
(15.2)
$$\begin{aligned} \varphi _0({\mathbf {x}},{\mathbf {k}})= & {} e^{i{\mathbf {k}}\cdot {\mathbf {x}}} \end{aligned}$$
(15.3)

(see [612, Sect. 6.5] for the meaning of (15.2) when f is only in \(L^2\) and not in \(L^1\)). This provides an eigenfunction expansion of \(H_0\) in that (except for places where we want to emphasize the vector nature of \({\mathbf {x}}\) and \({\mathbf {k}}\), we will start using non-boldface)

$$\begin{aligned} f(x)= & {} (2\pi )^{-\nu /2} \int \varphi _0(x,k) \hat{f}_0(k) \, d^\nu k \end{aligned}$$
(15.4)
$$\begin{aligned} \widehat{H_0f}_0(k)= & {} |k|^2 \hat{f}_0(k) \end{aligned}$$
(15.5)

so that formally (and much more), \(H_0\varphi _0(\cdot ,k)=|k|^2\varphi _0(\cdot ,k)\).
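As a concrete illustration of (15.2) and (15.5), one can check numerically in dimension \(\nu =1\) that the transform of \(H_0f\) is \(|k|^2\) times the transform of f. The sketch below is not from the paper: the Gaussian test function, the truncation of the integral to \([-12,12]\), the grid size and the sample values of k are all assumptions made for illustration.

```python
import cmath
import math

def f(x):
    # hypothetical test function, a Gaussian
    return math.exp(-x * x / 2.0)

def minus_f_second(x):
    # -f''(x) for f(x) = exp(-x^2/2), i.e. H_0 f with H_0 = -d^2/dx^2
    return (1.0 - x * x) * math.exp(-x * x / 2.0)

def transform(g, k, L=12.0, n=2400):
    # (2 pi)^(-1/2) * integral of conj(e^{ikx}) g(x) dx over [-L, L], midpoint rule
    h = 2 * L / n
    total = 0j
    for i in range(n):
        x = -L + (i + 0.5) * h
        total += cmath.exp(-1j * k * x) * g(x)
    return total * h / math.sqrt(2 * math.pi)

ks = [0.0, 0.7, 1.5, 3.0]
# discrepancy between the transform of H_0 f and k^2 times the transform of f
errors = [abs(transform(minus_f_second, k) - k * k * transform(f, k)) for k in ks]
print(errors)
```

With this normalization the transform of the Gaussian is again \(e^{-k^2/2}\), so the discrepancies printed should be at the level of rounding error.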

For suitable V and \(H=H_0+V\), what Povzner and Ikebe found are functions, \(\varphi ({\mathbf {x}},{\mathbf {k}})\), so that if \(\hat{f}\) is defined by

$$\begin{aligned} \hat{f}(k) = (2\pi )^{-\nu /2} \int \overline{\varphi (x,k)} f(x) \, d^\nu x \end{aligned}$$
(15.6)

and if \(\{\varphi _n(x)\}_{n=1}^N\) is an orthonormal basis of \(L^2\) eigenfunctions for \({\mathcal H}_{pp}(H)\) with \(H\varphi _n=E_n\varphi _n\), then

$$\begin{aligned} f(x) = \sum _{n=1}^{N} \langle \varphi _n,f \rangle \varphi _n(x) + (2\pi )^{-\nu /2} \int \varphi (x,k)\hat{f}(k) \, d^\nu k \end{aligned}$$
(15.7)

and

$$\begin{aligned} {\widehat{Hf}}(k) = |k|^2 \hat{f}(k) \end{aligned}$$
(15.8)

This implies that H has only point spectrum and a.c. spectrum, solving problem (B).

They also proved a connection to scattering

$$\begin{aligned} \widehat{\Omega ^+f}=\hat{f}_0 \end{aligned}$$
(15.9)

so that formally

$$\begin{aligned} \Omega ^+\varphi = \varphi _0 \end{aligned}$$
(15.10)

(we’ll say more about this shortly). This implies that \({\text {ran}}\,\Omega ^+={\mathcal H}_{ac}(H)\) and then, since \(\Omega ^+\bar{f}=\overline{\Omega ^-f}\) (where the bar denotes complex conjugation), we have that \({\text {ran}}\,\Omega ^-={\mathcal H}_{ac}(H)\), solving problem (A).

In the physics literature, Gell-Mann and Goldberger [185], appealing to stationary phase arguments [614, Sect. 15.3], considered the meaning of (15.10) and formally proved that it held pointwise if the limit in the definition of wave operator is an abelian limit (i.e. an \(e^{-\epsilon t}\) is added to the quantity in the limit and then one takes \(\epsilon \downarrow 0\)). Indeed, Ikebe proved (15.9) in terms of abelian limits and then used the existence of the ordinary limit proven by other means.

Of course, one has to find suitable continuum eigenfunctions, \(\varphi ({\mathbf {x}},{\mathbf {k}})\), so that (15.9) holds. Some thought about Born’s ideas suggests one wants \(\varphi \) to have the asymptotics (13.1) near \({\mathbf {x}}=\infty \). We’ll explain that \(\varphi \) obeys an integral equation called the Lippmann–Schwinger equation introduced by two physicists [432] in 1950. Following Lippmann–Schwinger, Povzner and Ikebe, we only consider \(\nu =3\) where the integral kernel for \((H_0-k^2)^{-1}\) is especially simple.

Since formally \((H_0+V-k^2)\varphi =0\), we might expect that \(\varphi \) obeys \(\varphi =-(H_0-k^2)^{-1}V\varphi \). There are two problems with this. First, since \(k^2\) is in the spectrum of \(H_0\), we can’t use \((H_0-k^2)^{-1}\) as a bounded operator on \(L^2\). If \(\text {Im}(k) > 0\) (so \(k^2 \notin {\mathbb {R}}\)), then \((H_0-k^2)^{-1}\) has an integral kernel, \(G_0({\mathbf {x}},\mathbf {y};k^2)\), given by

$$\begin{aligned} G_0(x,y;k^2)=\frac{e^{ik|x-y|}}{4\pi |x-y|} \end{aligned}$$
(15.11)

This has a pointwise limit as \(k^2 \rightarrow {\mathbb {R}}\), indeed two different limits if one takes \(\epsilon \downarrow 0\) for \(k^2{\pm } i\epsilon \). We thus define for \(k>0\)

$$\begin{aligned} G_0(x,y,k^2{\pm } i0) = \frac{e^{{\pm } ik|x-y|}}{4\pi |x-y|} \end{aligned}$$
(15.12)

As we’ll see, to get (15.9), we want to pick \(+i0\), not \(-i0\). It is the use of plus here that led physicists to use \(\Omega ^+\) for the limit as \(t \rightarrow -\infty \). This gives meaning to \(-(H_0-k^2)^{-1}V\varphi \).
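As a quick check on (15.12), one can verify directly that both boundary values solve the free equation away from the diagonal. In this sketch \(r=|x-y|\) and u is our shorthand for the kernel as a function of r; we use the standard formula for the Laplacian of a radial function on \({\mathbb {R}}^3\):

$$\begin{aligned} u(r)&= \frac{e^{{\pm } ikr}}{4\pi r}, \qquad \Delta u = \frac{1}{r}\frac{d^2}{dr^2}\bigl (ru(r)\bigr ) = \frac{1}{r}\frac{d^2}{dr^2}\left( \frac{e^{{\pm } ikr}}{4\pi }\right) = -k^2 u(r) \qquad (r>0) \end{aligned}$$

So \((-\Delta -k^2)G_0=0\) for \(x \ne y\) with either sign; the two choices differ only in whether the spherical wave is outgoing (\(+\)) or incoming (\(-\)).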

The second problem with \(\varphi =-(H_0-k^2)^{-1}V\varphi \) is that if V has rapid decay (e.g. V has compact support), it is not hard to see that \(-(H_0-k^2)^{-1}V\varphi \) looks like the second term in (13.1), so it is tempting to try \(\varphi =e^{ik\cdot x}-(H_0-k^2)^{-1}V\varphi \). Notice that since \((H_0-k^2)\) has a kernel (among “reasonable” functions), we are allowed to add elements of the kernel when inverting; put differently, \((H_0-k^2)[e^{ik\cdot x}-(H_0-k^2)^{-1}V\varphi ] = -V\varphi \) and thus our formal eigenfunctions will be solutions of the Lippmann–Schwinger equation

$$\begin{aligned} \varphi ({\mathbf {x}},{\mathbf {k}}) = e^{i{\mathbf {k}}\cdot {\mathbf {x}}}-\frac{1}{4\pi } \int \frac{e^{i|{\mathbf {k}}||{\mathbf {x}}-\mathbf {y}|}}{|{\mathbf {x}}-\mathbf {y}|}V(\mathbf {y})\varphi (\mathbf {y},{\mathbf {k}}) \, d^3y \end{aligned}$$
(15.13)

The pioneer in using the Lippmann–Schwinger equation to prove mathematical results about eigenfunction expansions was Povzner [485, 486]. In [485], published in 1953, he considered \(C^\infty \) potentials, V, obeying (15.1) for \(\nu =3, \alpha > 7/2\) and solved problem (B) affirmatively for such \(\alpha \). In 1955, in [486], for V’s of compact support, he solved problem (A) (when \(\nu =3\)). Bear in mind that the results of Cook, Hack and Kuroda on existence (via Cook’s method) didn’t exist when Povzner wrote [486]. As we’ll see, Ikebe’s approach to solving problem (A) uses these a priori existence results.

In 1960, Ikebe [265] used eigenfunction expansions to solve problems (A) and (B) when \(\nu =3\) and V obeys (15.1) near infinity for \(\alpha >2\) and moreover, V is Hölder continuous away from a finite number of points where it is locally \(L^2\). Let us sketch the ideas that he used:

  1. (i)

    Let B be the Banach space, \(C_\infty ({\mathbb {R}}^3)\), of bounded functions vanishing at \(\infty \) with \(||\cdot ||_\infty \). For \(\text {Im}(\kappa ) \ge 0\), define

    $$\begin{aligned} (T_\kappa g)(x) = -\frac{1}{4\pi } \int \frac{e^{i\kappa |x-y|}}{|x-y|} V(y) g(y)\, d^3y \end{aligned}$$
    (15.14)

    Then if V obeys (15.1) with \(\alpha >2\), \(T_\kappa \) is a bounded, indeed a compact, operator of B to B which is analytic in \(\kappa \) on \({\mathbb {C}}_+\) and Hölder continuous on \(\overline{{\mathbb {C}}_+}{\setminus }\{0\}\).

  2. (ii)

    One shows that \(T_\kappa \psi = \psi \) has no non-zero solution for \(\text {Im}(\kappa ) > 0\) (since \(\psi \) is then exponentially decreasing and so in \(L^2\) violating self-adjointness) and then also for \(\text {Im}(\kappa ) = 0, \kappa \ne 0\) since one can use Kato’s result mentioned in Remark 4 after Theorem 12.1. In this analysis, Ikebe shows that if \(\kappa \in {\mathbb {R}}{\setminus }\{0\}\) and \(\psi \) solves \(T_\kappa \psi =\psi \), then \(\varphi \equiv V\psi \in L^2({\mathbb {R}}^3)\) obeys

    $$\begin{aligned} \int _{|k|=\kappa } |\hat{\varphi }(k)|^2 \, d\omega = 0 \end{aligned}$$
    (15.15)

    suitably interpreted. This result, also found by Povzner, is important as we’ll see later.

  3. (iii)

    By Fredholm theory, since \(T_\kappa \psi =\psi \) has no non-zero solutions, \({\varvec{1}}-T_\kappa \) is invertible. One defines \(\varphi (\cdot ,{\mathbf {k}})\) to be \(({\varvec{1}}-T_{|k|})^{-1}\varphi _0(\cdot ,{\mathbf {k}})\) with \(\varphi _0\) given by (15.3) (\(\varphi _0 \notin B\) since it doesn’t vanish at infinity but if \(\eta = \varphi -\varphi _0\), then \(\varphi =\varphi _0+T_{|k|}\varphi \iff \eta =T_{|k|}\varphi _0+T_{|k|}\eta \). Note that \(\eta \) and \(T_{|k|}\varphi _0\) are in B). In this way, one gets solutions of the Lippmann–Schwinger equation.

  4. (iv)

    One also solves \(G=G_0+T_\kappa G\) [where \(G_0\) is the free Green’s function (15.12)] and uses this plus Stone’s theorem to verify the expansion (15.7).

  5. (v)

    By following arguments of Gell-Mann–Goldberger [185], one proves (15.9) where \(\Omega ^+\) is an abelian limit. By the results of Cook–Hack–Kuroda, this abelian limit is equal to the ordinary limit.

  6. (vi)

    Equation (15.7) solves problem (B) and (15.9) solves problem (A) as noted above.

  7. (vii)

    There is a gap in [265] found and filled in Simon [570] and also filled by Ikebe [266].

We should briefly mention two variants of Ikebe’s work. First, Thoe [645] extended the result to \({\mathbb {R}}^\nu \) for general \(\nu \). Secondly, for Rollnik potentials (any V obeying (15.1) for \(\alpha >2\) is in \(L^{3/2}\) and so Rollnik but Rollnik allows \(L^{3/2}\) local singularities), following Rollnik [526] and Grossman–Wu [208, 209], one can rewrite the Lippmann–Schwinger equation in an equivalent form:

$$\begin{aligned} \xi (x)&= \xi _0(x) - \frac{1}{4\pi } \int |V(x)|^{1/2}\frac{e^{ik|x-y|}}{|x-y|} V^{1/2}(y) \xi (y)\, d^3y \nonumber \\&\equiv \xi _0(x)+(W_{|k|}\xi )(x) \end{aligned}$$
(15.16)

where \(V^{1/2}(y) = |V(y)|^{1/2} \text {sgn}(V(y))\) and \(\xi (x) = |V(x)|^{1/2}\varphi (x)\). The point is that the integral kernel in (15.16) is Hilbert–Schmidt for \(\text {Im}(k) \ge 0\) if V is Rollnik. This was used by Simon [570] to carry over Ikebe’s arguments. One big difference is that there is no Kato argument to eliminate solutions of the homogeneous equations. But by Fredholm theory, in any event, the set of points where \(1-W_{|k|}\) is not invertible is the set of zeros of a function analytic on \({\mathbb {C}}_+\) and continuous on its closure, so a subset of \({\mathbb {R}}\) with real Lebesgue measure zero. This allows a proof of completeness but not a solution of problem (B). We’ll say more about this issue below. We note that this factorization idea is used in several of the approaches to the Agmon–Kato–Kuroda theory and, in particular, is an option in the work of Kato and Kuroda. We’ll not discuss this further.

After Ikebe solved problem (B) for \(\alpha > 2\), the general \(\alpha > 1\) result was reached in stages: Jäger [277] did it for \(\alpha > 3/2\), Rejto [499] for \(\alpha > 4/3\), Kato [338], using Kato–Kuroda theory, did \(\alpha > 5/4\) as we’ve seen, Rejto [503] did \(\alpha > 6/5\) and finally Agmon [4] (and shortly afterwards, independently, Saito [534]) handled \(\alpha > 1\). As we’ll explain, using one simple idea from Agmon, Kuroda and Rejto could extend their methods to handle \(\alpha > 1\). Howland [254] had earlier work on this problem and Schechter [536] used Kato–Kuroda theory to study higher order elliptic operators (as we’ll see, Agmon, Hörmander and Kuroda also did).

In two papers [362, 363], Kato and Kuroda developed what they called an abstract theory of scattering. As Kuroda told me “it was too abstract to become popular” (blaming himself for this). In recognition of the history, Reed–Simon dubbed the basic result for \(\alpha > 1\) the Agmon–Kato–Kuroda Theorem but it is Agmon’s approach that has stuck around. And this is due not only to the abstraction but also to the elegance and simplicity of Agmon’s approach and its flexibility. Moreover, two early, widely-used monograph presentations (Reed–Simon [497, Sect. XIII.8] and Hörmander [252, 253]) exposed the Agmon approach. All this said, while Agmon’s technicalities are distinct from Kato–Kuroda, the underlying conceptual framework is similar. We will describe this scheme using Agmon’s approach to explicitly implement the steps.

Agmon uses the spaces \(L^2_\beta ({\mathbb {R}}^\nu )\) defined by

$$\begin{aligned} ||\varphi ||_\beta ^2 = \int (1+|x|^2)^\beta |\varphi (x)|^2 \, d^\nu x < \infty \end{aligned}$$
(15.17)

These are Hilbert spaces. One suppresses the natural duality of Hilbert spaces and associates the dual of \(L^2_\beta \) with \(L^2_{-\beta }\) so that \(\psi \in L^2_{-\beta }\) is associated with the linear functional \(\varphi \mapsto \int \overline{\psi (x)} \varphi (x) \, d^\nu x\). Here are the basic facts about Fourier transform on \(L^2_\beta \) that we’ll need. For proofs, see [495, Sect. IX.9]; basically, one proves things for \(\nu = 1\) and uses spherical coordinates for the other variables. We return to using \(f \mapsto \hat{f}\) for the Fourier transform.

  1. (1)

    Let \(\beta > 1/2\). There is for each \(\lambda \in (0,\infty )\), a bounded map, \(T_\lambda :L^2_\beta ({\mathbb {R}}^\nu ) \rightarrow L^2(S^{\nu -1},d\omega )\) (where \(d\omega \) is unnormalized measure on the unit sphere in \({\mathbb {R}}^\nu \)), so that if \(f \in {\mathcal S}({\mathbb {R}}^\nu )\), then

    $$\begin{aligned} (T_\lambda f)(\omega ) = \widehat{f}(\lambda \omega ) \end{aligned}$$
    (15.18)
  2. (2)

    \(T_\lambda \) is norm Hölder continuous in \(\lambda \) of order \(\beta -1/2\) if \(1/2<\beta <3/2\).

  3. (3)

    Fix \(\beta > 1/2\). As maps of \(L^2_\beta \) to \(L^2_{-\beta }\), \((-\Delta -\kappa ^2)^{-1}\) defined initially for \(\text {Im}\kappa > 0\) has a continuous extension to \(\kappa \in {\mathbb {R}}{\setminus }\{0\}\).

  4. (4)

    If \(\varphi \in L^2_\beta ,\, \beta >1/2\) and \(\kappa >0\), then

    $$\begin{aligned} \lim _{\epsilon \downarrow 0} \text {Im} \langle \varphi ,(-\Delta -(\kappa ^2+i\epsilon ))^{-1}\varphi \rangle = \frac{\pi \kappa ^{\nu -2}}{2} ||T_\kappa \varphi ||_2^2 \end{aligned}$$
    (15.19)

    (This is just a version of \(\lim _{\epsilon \downarrow 0} \frac{1}{x-i\epsilon }={\mathcal P}\left( \frac{1}{x}\right) +i\pi \delta (x)\)).

  5. (5)

    Let \(\beta >1/2\). Fix \(\kappa >0\) and suppose that \(\varphi \in L^2_\beta \) with \(T_\kappa \varphi =0\). Define \(Q_\kappa \varphi \) by

    $$\begin{aligned} \widehat{Q_\kappa \varphi }({\mathbf {k}}) = (k^2-\kappa ^2)^{-1}\hat{\varphi }({\mathbf {k}}) \end{aligned}$$
    (15.20)

    Then for each \(\epsilon > 0\), \(Q_\kappa \varphi \in L^2_{\beta -1-\epsilon }\) and

    $$\begin{aligned} ||Q_\kappa \varphi ||_{\beta -1-\epsilon } \le C_{\epsilon ,\kappa ,\nu ,\beta }||\varphi ||_\beta \end{aligned}$$
    (15.21)

    where C depends continuously on its parameters in the region \(\epsilon , \kappa> 0, \, \beta >1/2\). The point here is that without \(T_\kappa \varphi =0\), we can define the limit as \(\epsilon \downarrow 0\) of \((k^2-\kappa ^2-i\epsilon )^{-1}\hat{\varphi }({\mathbf {k}})\) which for \(\varphi \in L^2_\beta \) with \(\beta >1/2\) lies in \(L^2_{-\beta }\) but we can never get better than \(L^2_{-1/2}\). When \(T_\kappa \varphi =0\), by having \(\beta \) large we can get \(Q_\kappa \varphi \) into a suitable \(L^2_{\gamma }\) and, in particular, into \(L^2\).
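
Fact (4) rests on the one-variable limit \(\text {Im}\,(x-i\epsilon )^{-1} = \epsilon /(x^2+\epsilon ^2) \rightarrow \pi \delta (x)\). As a minimal numerical sketch (not part of the cited proofs), the following checks this against a Gaussian test function. The function names, the quadrature rule and the grid parameters are illustrative assumptions; the closed form \(\pi e^{\epsilon ^2}\text {erfc}(\epsilon )\) used for comparison is the standard value of this particular integral.

```python
import math

def im_pairing(f, eps, half_width=10.0, n=100_000):
    """Midpoint-rule value of Im of the integral of f(x)/(x - i*eps) over the line,
    i.e. of the integral of f(x) * eps / (x^2 + eps^2)."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        total += f(x) * eps / (x * x + eps * eps)
    return total * h

def gaussian(x):
    """Test function with f(0) = 1, so the eps -> 0 limit should be pi."""
    return math.exp(-x * x)

if __name__ == "__main__":
    for eps in (0.1, 0.01):
        approx = im_pairing(gaussian, eps)
        exact = math.pi * math.exp(eps * eps) * math.erfc(eps)
        # Both columns approach pi * f(0) = pi as eps decreases.
        print(eps, approx, exact)
```

The same mechanism, applied in the radial variable of \(\hat{\varphi }\), is what produces the sphere restriction \(T_\kappa \varphi \) on the right side of (15.19).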

We can now describe the basic strategy of solving problems (A) and (B) for any \(\alpha > 1\).

  • Step 1 Find a triple of spaces \(X \subset L^2({\mathbb {R}}^\nu ,d^\nu x) \subset X^*\) where X is a dense subspace of \(L^2\) and which is a Banach space in a norm, \(||\cdot ||_X\), so that for \(\psi \in X\), we have that \(||\psi ||_2 \le ||\psi ||_X\). Any \(\varphi \in L^2\) acts as a bounded linear functional on X via \(\ell _\varphi (\psi ) = \langle \bar{\varphi },\psi \rangle \) so \(L^2 \subset X^*\) which can be shown to be dense. Note that when \(\varphi \in L^2\), we have that \(||\varphi ||_{X^*} \le ||\varphi ||_2\). In the Agmon approach, \(X = L^2_\beta ({\mathbb {R}}^\nu )\) for some \(\beta > 1/2\) and \(X^*=L^2_{-\beta }({\mathbb {R}}^\nu )\). Let \(H_0\) be a self-adjoint operator which in the Agmon setup is a constant coefficient elliptic partial differential operator although we’ll mainly be interested in the case \(H_0=-\Delta \). By the norm inequalities, for any \(z \in {\mathbb {C}}{\setminus } [E_0,\infty )\) (where \(E_0\) is the bottom of the spectrum of \(H_0\)), \((H_0-z)^{-1}\) is bounded from X to \(X^*\). One must pick X so that \((H_0-z)^{-1}\), as bounded maps from X to \(X^*\) has a continuous extension to \([E_0,\infty )\) with a finite set of points removed. The extension is from above or below the real axis and the two limits need not be equal. In our case where \(E_0=0\), the finite set is only \(E_0\). In the general elliptic case, it is the set of critical points of the defining symbol. As explained above, in the Agmon setup, where \(X=L^2_\beta , \beta > 1/2\), we have the required continuity of the boundary values. In the Kato–Kuroda theory, X is an abstract space which can be chosen in various ways.

  • Step 2 Restrict acceptable potentials, V, to functions \(V: X^* \rightarrow X\) or, more generally so that \(V(H_0-E_0+1)^{-1}\) is bounded from X to itself. In fact, we require this to be a compact operator from X to itself. In the Agmon \(L^2_\beta \) case, for \(H_0=-\Delta \), one needs that \((1+|x|^2)^\beta V(-\Delta +1)^{-1}\) is compact as an operator on \(L^2\). In particular, if (15.1) holds, we need that \(\alpha > 2\beta \), so if \(\alpha > 1\) we can pick \(\beta \) with \(1/2< \beta <\alpha /2\). Thus, the results below will solve problems (A) and (B) when \(\alpha > 1\).

  • Step 3 For simplicity, we henceforth suppose \(E_0=0\) and that \(E_0\) is the only critical point as happens for the Schrödinger case. Under these assumptions, \(B(z) = -(H_0-\kappa ^2)^{-1}V\) for \(z=\kappa ^2; \, \text {Im}\kappa \ge 0, \kappa \ne 0\) is a compact operator on \(X^*\), continuous in \(\kappa \) and analytic for \(\kappa \in {\mathbb {C}}_+\). By a version of the analytic Fredholm theorem (see [494, Theorem VI.14]), there is a set \({\mathcal {E}}\subset (0,\infty )\), so that \({\mathcal {E}}\) is a closed set (i.e. its only limit points are in \({\mathcal {E}}\) or are 0 or \(\infty \)) of (real) Lebesgue measure 0 and so that if \(z \notin {\mathcal {E}}\), then \(({\varvec{1}}-B(z))^{-1}\) exists and is continuous in z there. One proves that \((H-z)^{-1} = ({\varvec{1}}-B(z))^{-1}(H_0-z)^{-1}\) originally for \(\text {Im} z \ne 0\) and then as maps from X to \(X^*\) for \(z \notin {\mathcal {E}}\).

  • Step 4 This suffices to get existence and completeness of wave operators. Kato–Kuroda [362, 363] have arguments to get this. In his original announcement, Agmon [3] didn’t mention scattering. If one can decompose \(V=A^*B\) so that \(A, B: X^* \rightarrow L^2\) (perhaps after multiplication by \((H_0+1)^{-1/2}\)), then one can show that A and B are locally smooth for both H and \(H_0\) on \((0,\infty ){\setminus }{\mathcal {E}}\) and so by Theorem 14.10, one gets existence and completeness (ideas due to Lavine [415, 417]). In later publications, Agmon and Hörmander have other ways of proving existence and completeness by exploiting a radiation condition.

  • Step 5 In general, from this, one gets purely a.c. spectrum on \((0,\infty ){\setminus }{\mathcal {E}}\) so any singular spectrum on \((0,\infty )\) lies in \({\mathcal {E}}\).

  • Step 6 Suppose we show that any \(\lambda _0 \in {\mathcal {E}}\) is an \(L^2\) eigenvalue of H. Then \({\mathcal {E}}\cup \{0,\infty \}\) is a countable closed subset of \({\mathbb {R}}\) which cannot support a singular continuous measure. In this way, one solves problem (B).

  • Step 7 If \(\varphi \in L^2_{-\beta }\) and \(B(\lambda _0+i0)\varphi =\varphi , \, \lambda _0=\kappa ^2\), then

    $$\begin{aligned} 0=\text {Im}\langle V\varphi ,\varphi \rangle&= -\text {Im}\langle V\varphi ,(H_0-\lambda _0-i0)^{-1}V\varphi \rangle \\&= -\frac{\pi \kappa ^{\nu -2}}{2} ||T_\kappa V\varphi ||^2 \end{aligned}$$

    so \(T_\kappa V\varphi =0\). Therefore by (15.21), \(Q_\kappa V\varphi =B(\kappa )\varphi \in L^2_{\alpha -\beta -1-\epsilon }\) for all \(\epsilon >0\). For example, if \(\alpha >3/2\), we can pick \(\beta > 1/2\) but close to it and \(\epsilon \) small so that \(\alpha -\beta -1-\epsilon \ge 0\). Thus \(\varphi \in L^2\) and is an eigenfunction. By invoking Step 6, we see that when \(\alpha > 3/2\), we can solve problem (B). The restriction \(\alpha > 5/4\) in Theorem 15.1 comes from a consideration like this—what is needed to deduce that \(\varphi \in L^2\).

  • Step 8 Agmon had the idea of iterating the argument in Step 7! If we know that \(\varphi \in L^2_\gamma \), since \(T_\kappa V\varphi = 0\), we have that \(\varphi \in L^2_{\alpha +\gamma -1-\epsilon }\), so if \(\alpha >1\), we can increase \(\gamma \) by an arbitrary amount less than \(\alpha -1\). If \(\alpha - 1 > 1/(2m)\), starting in \(L^2_{-\beta }\) with \(\beta \) very close to 1/2, we see by iterating m times that \(\varphi \) is an \(L^2\) eigenfunction. In this way, one solves problem (B) for all \(\alpha > 1\).

  • Step 9 Once one controls the resolvent, one can obtain eigenfunctions via the Lippmann–Schwinger equation. Knowing that \({\mathcal {E}}\) is countable shows the expansion only has a.c. spectrum and point spectrum.
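
The bookkeeping in Steps 7 and 8 can be sketched in a few lines. This only tracks the weight exponents \(\gamma \) in \(\varphi \in L^2_\gamma \), starting from \(\gamma =-\beta \) and gaining \(\alpha -1-\epsilon \) per application of (15.21), and it proves nothing about the operators involved. The function name and the default value of \(\epsilon \) are illustrative assumptions.

```python
def bootstrap_weights(alpha, beta, eps=1e-3):
    """Weight exponents gamma_0 = -beta, gamma_{n+1} = gamma_n + (alpha - 1 - eps),
    iterated until gamma >= 0 (i.e. until the solution is known to lie in L^2)."""
    if not (0.5 < beta < alpha / 2.0):
        # Step 2 of the scheme needs 1/2 < beta < alpha/2.
        raise ValueError("need 1/2 < beta < alpha/2")
    gain = alpha - 1.0 - eps
    if gain <= 0.0:
        raise ValueError("need alpha - 1 - eps > 0")
    weights = [-beta]
    while weights[-1] < 0.0:
        weights.append(weights[-1] + gain)
    return weights

if __name__ == "__main__":
    # alpha > 3/2: a single application suffices, as in Step 7.
    print(bootstrap_weights(1.6, 0.55))
    # alpha = 1.2: alpha - 1 > 1/(2m) first holds at m = 3, as in Step 8.
    print(bootstrap_weights(1.2, 0.55))
```

The number of iterations is one less than the length of the returned list; it grows without bound as \(\alpha \downarrow 1\), but is finite for every \(\alpha > 1\).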

This completes our sketch of the scheme behind the work of Kato–Kuroda and Agmon; see Reed–Simon [497, Sect. XIII.8] for more details. After Agmon’s argument appeared, various authors realized that the iteration idea in Step 8 could improve their results. In particular, Kuroda [398, 399] was able to extend the proof of Theorem 15.1 to \(\alpha > 1\). He extended this work to fairly general elliptic operators.

The ideas in the Agmon–Kato–Kuroda work have been extended to long range potentials [where (15.1) holds for suitable \(\alpha \in (0,1]\) but we also have \((1+|x|)^{-1-\alpha }\) decay of \(\mathbf {\nabla }V\)]. One needs to use modified wave operators following Dollard [121]. There is a vast literature and we will not try to summarize it—see the books of Dereziński–Gérard [116] and Yafaev [699, 704].

The above approach uses the fact that for \(L^2_\beta , \, \beta >1/2\), there is a map restricting \(\hat{\varphi }\) to the sphere. One proves this by essentially flattening the sphere. If we replace \(L^2_\beta \) by \(L^p\), we cannot restrict to hyperplanes but remarkably, one can sometimes restrict to curved hypersurfaces like the spheres we needed above. The associated bounds are known as the Tomas–Stein Theorem (see [615, Sect. 6.8]). Ionescu–Schlag [271] have developed a theory of scattering and spectral theory under suitable \(L^p\) conditions on V using the Tomas–Stein bounds.

6 Scattering and spectral theory, IV: Jensen–Kato theory

This is the last section on “scattering and spectral” theory although it involves something closer to diffusion than scattering and the connection to spectral theory is weak. Still, since it involves large time behavior of \(e^{-itH}\), it belongs in this set of ideas. In any event, we’ll discuss a lovely paper of Jensen and Kato [288] involving Schrödinger operators, \(H=-\Delta +V\), on \({\mathbb {R}}^3\).

One issue that they discuss is the large time behavior of \(e^{-itH}\) and its rate of decay. At first sight, speaking of decay seems puzzling since for \(\varphi \in L^2\), we have that \(||e^{-itH}\varphi ||_2=||\varphi ||_2\) has no decay. But consider the integral kernel when \(V=0\) on \({\mathbb {R}}^\nu \)

$$\begin{aligned} e^{it\Delta }(x,y) = (4\pi it)^{-\nu /2}e^{i|x-y|^2/4t} \end{aligned}$$
(16.1)

which shows that

$$\begin{aligned} \sup _{x,y} |e^{it\Delta }(x,y)| = (4\pi |t|)^{-\nu /2} \end{aligned}$$
(16.2)

so

$$\begin{aligned} ||e^{it\Delta }\varphi ||_\infty \le (4\pi |t|)^{-\nu /2} ||\varphi ||_1 \end{aligned}$$
(16.3)

(16.3) is, in fact, equivalent to (16.2). Since Jensen and Kato use Hilbert space methods, instead of maps from \(L^1\) to \(L^\infty \), they consider maps between weighted \(L^2\) spaces, specifically from \(L^2_s\) to \(L^2_{-s}\) where \(L^2_s\) is given by (15.17). For example, (16.3) immediately implies that \(||e^{it\Delta }\varphi ||_{2,-s} \le C_{\nu ,s} |t|^{-\nu /2}||\varphi ||_{2,s}\) so long as \(s > \nu /2\).
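
The passage from (16.3) to the weighted bound is two applications of the Cauchy–Schwarz inequality. As a sketch, write \(\langle x\rangle = (1+|x|^2)^{1/2}\) (our shorthand, not notation used elsewhere in this paper), so that \(||\varphi ||_{2,s}=||\langle x\rangle ^s\varphi ||_2\). Then

$$\begin{aligned} ||e^{it\Delta }\varphi ||_{2,-s}&\le ||\langle x\rangle ^{-s}||_2\, ||e^{it\Delta }\varphi ||_\infty \le ||\langle x\rangle ^{-s}||_2\,(4\pi |t|)^{-\nu /2}\, ||\varphi ||_1 \\&\le ||\langle x\rangle ^{-s}||_2^2\,(4\pi |t|)^{-\nu /2}\, ||\varphi ||_{2,s} \end{aligned}$$

where the last step uses \(||\varphi ||_1 = ||\langle x\rangle ^{-s}\langle x\rangle ^{s}\varphi ||_1 \le ||\langle x\rangle ^{-s}||_2\,||\varphi ||_{2,s}\). The constant \(||\langle x\rangle ^{-s}||_2^2 = \int (1+|x|^2)^{-s}\,d^\nu x\) is finite exactly when \(2s>\nu \), which is the source of the restriction on s.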

If \(H_0=-\Delta \) is replaced by \(H=H_0+V\), there is a new issue that arises. If \(H\varphi =E\varphi \) for \(\varphi \in L^2\), then \(e^{-itH}\varphi =e^{-itE}\varphi \) has no decay in any norm. Thus one must only try to prove decay of \(e^{-itH}P_c(H)\) where \(P_c(H)\) (“c” is for continuous spectrum; if there is no singular continuous spectrum, it is the same as \(P_{ac}\)) is the projection onto the orthogonal complement of the eigenvectors. Jensen–Kato don’t use \(e^{-itH}P_c(H)\) but the equivalent

$$\begin{aligned} e^{-itH} - \sum _{j=1}^{N}e^{-iE_jt}P_j \end{aligned}$$
(16.4)

where \(\{E_j\}_{j=1}^N\) are the eigenvalues and \(P_j\) the projections onto the associated eigenspace \(\ker (H-E_j)\).

In the free case, we note that it is easy to see [496, Corollary to Theorem XI.14] that if \(0 \notin {\text {supp}}(\widehat{\varphi })\) for \(\varphi \in {\mathcal S}({\mathbb {R}}^\nu )\), then \(\sup _{|x|\le R} |e^{it\Delta }\varphi (x)|\) is \(\text {O}(t^{-N})\) for all N. That is, the diffusive term \(t^{-\nu /2}\) is connected to low energies. A critical realization of Jensen–Kato is that large t asymptotics as maps of \(L^2_{s}\) to \(L^2_{-s}\) is connected to the behavior of the resolvent \((H-z)^{-1}\) near \(z=0\).

For a while now we return to \(\nu =3\), the only case considered by Jensen–Kato. As we’ll see, \(\nu =3\) is perhaps the simplest case with a rich structure. Roughly speaking, Jensen–Kato consider V’s obeying

$$\begin{aligned} |V(x)| \le C(1+|x|)^{-\beta } \end{aligned}$$
(16.5)

They always require \(\beta > 2\) and often need \(\beta > 3\) or even larger. In fact, for some of their results, they only need \((1+|x|)^\beta V \in L^{3/2}_{unif}\), but for simplicity we’ll only quote results below where the pointwise bound (16.5) holds. Prior to their paper, there was work of Rauch [489] which motivated them. He supposed \(|V(x)| \le C_1 e^{-C_2|x|}\) and instead of \(L^2\)-operator norms of \((1+|x|)^{-s}e^{-itH}P_c(H)(1+|x|)^{-s}\), he considered norms of \(e^{-\epsilon |x|}e^{-itH}P_c(H)e^{-\epsilon |x|}\). He found for all but a discrete set of \(\xi \in {\mathbb {R}}\), with \(H(\xi ) = -\Delta +\xi V\), one has \(t^{-3/2}\) decay for the relevant norms of \(e^{-itH(\xi )}\) and, for a discrete set of \(\xi \)’s, \(t^{-1/2}\) decay. Jensen–Kato extended this result for \(L^2_s\) to \(L^2_{-s}\) with \(s > 5/2\) and \(\beta > 3\). Several years earlier, Yafaev [698], in connection with his work on the Efimov effect [695], had studied low energy behavior of the resolvent (but not large time asymptotics of the unitary group) in the case of a zero energy resonance [case (1) in the language of Jensen–Kato].

It is natural to restrict at least to \(\beta > 2\) for small energy behavior. The Birman–Schwinger kernel [616, Sect. 7.9], \(|V(x)|^{1/2}V(y)^{1/2}/4\pi |x-y|\), is Hilbert–Schmidt if (16.5) holds for \(\beta >2\) and, in general may not even be a bounded operator if \(\beta < 2\) (and if \(\beta =2\), can be bounded but not compact). Thus, \(\beta > 2\) implies that \(-\Delta +V\) has only finitely many negative eigenvalues, each of finite multiplicity.

As we’ve mentioned, the key input for the Jensen–Kato large time results is an analysis of the resolvent, \(R(z)=(H-z)^{-1}\) for z near zero. The free resolvent \(R_0(z)=(H_0-z)^{-1}\) has integral kernel

$$\begin{aligned} G_0(x,y;z) = \frac{e^{i\kappa |x-y|}}{4\pi |x-y|} \end{aligned}$$
(16.6)

where \(\kappa \) obeys \(\kappa ^2=z\) with \(\text {Im}(\kappa )>0\) for \(z \in {\mathbb {C}}{\setminus } [0,\infty )\) (with obvious limits if z approaches \({\mathbb {R}}\) from either \({\mathbb {C}}_+\) or \({\mathbb {C}}_-\)). It is only in dimension 3 (and 1) that \(G_0\) is so simple; in other dimensions, it is a more complicated Bessel function. For \(z \in {\mathbb {C}}{\setminus } [0,\infty )\), one has that

$$\begin{aligned} R(z)=(1+R_0(z)V)^{-1}R_0(z) \end{aligned}$$
(16.7)
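
Formula (16.7) is a rearrangement of the second resolvent identity. A short formal derivation (ignoring domain questions and assuming that \(1+R_0(z)V\) is invertible) might run:

$$\begin{aligned} H-z&= (H_0-z)+V = (H_0-z)\left( 1+R_0(z)V\right) \\ \Rightarrow \quad R(z)&= \left( 1+R_0(z)V\right) ^{-1}(H_0-z)^{-1} = \left( 1+R_0(z)V\right) ^{-1}R_0(z) \end{aligned}$$

Thus the behavior of R(z) for small z is governed by the invertibility of \(1+R_0(z)V\) as \(z \rightarrow 0\).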

Following Agmon and Kuroda (see Sect. 15), Jensen–Kato use the weighted Sobolev spaces, \(H^{m,s}({\mathbb {R}}^3)\) of those \(\varphi \) which obey

$$\begin{aligned} ||\varphi ||_{m,s}=||(1+|x|^2)^{s/2}(1-\Delta )^{m/2}\varphi ||_2 < \infty \end{aligned}$$
(16.8)

For example, we can take the completion of \({\mathcal S}({\mathbb {R}}^3)\) in this norm or, since \((1+|x|^2)^{s/2}(1-\Delta )^{m/2}\) is a map of tempered distributions to themselves, we can take those tempered distributions for which the quantity in the norm on the right of (16.8) is in \(L^2\).

Let \(K_0\) be the operator with integral kernel \((4\pi |x-y|)^{-1}\), i.e. \(G_0(x,y;0)\). Jensen–Kato prove that if V obeys (16.5) with \(\beta > 2\), then \(K_0V\) is a compact operator on \(L^2_{-s}\) if \(1/2<s<\beta -1/2\), indeed it is compact on \(H^{1,-s}\). It is also true that, extended from \(\kappa \in {\mathbb {C}}_+\) to \(\kappa \in {\mathbb {C}}_+\cup {\mathbb {R}}\), \(VR_0(\kappa ^2)\) is Hölder continuous (and compact). While Jensen–Kato don’t prove it that way, we note that this follows from the generalized Stein–Weiss inequalities [615, Theorem 6.2.5].

Thus, to understand the small z behavior of R(z), we need to know about \((1+K_0V)^{-1}\). By compactness, this inverse exists if and only if

$$\begin{aligned} (1+K_0V)\varphi =0 \end{aligned}$$
(16.9)

has no non-zero solutions, \(\varphi \in H^{1,-s}\). If \(\varphi \) obeys (16.9), it is a distributional solution of \((-\Delta +V)\varphi =0\). Let \({\mathcal M}\) be the set of all solutions of (16.9) in \(H^{1,-s}\); Jensen–Kato prove that it is independent of which s is chosen in \((1/2,\beta -1/2)\). By compactness \(\dim {\mathcal M}< \infty \). It is important to know if \(\varphi \in L^2\). (16.9) says that

$$\begin{aligned} \varphi (x)=-\frac{1}{4\pi }\int \frac{1}{|x-y|} V(y)\varphi (y)\,d^3y \end{aligned}$$
(16.10)

so that

$$\begin{aligned} \varphi (x) = -\frac{1}{4\pi |x|}\int V(y)\varphi (y) \, d^3y+ \text {o}\left( \frac{1}{|x|}\right) \end{aligned}$$
(16.11)

Thus, if \(\int V(y)\varphi (y)\,d^3y \ne 0\), then \(\varphi \notin L^2\). One can show that if \(\int V(y)\varphi (y)\,d^3y = 0\), then \(\varphi \in L^2\). Thus, in \({\mathcal M}\), the set of \(L^2\) solutions is either all of \({\mathcal M}\) or a space of codimension 1. If \({\mathcal M}\) has non-\(L^2\)-solutions, we say that there is a zero energy resonance. Jensen–Kato thus consider four cases:

  1. (0)

    (regular case) \({\mathcal M}=\{0\}\) so \((1+K_0V)^{-1}\) exists. Since \(K_0V\) is compact, the set of \(\xi \in {\mathbb {R}}\) for which \(\xi V\) is not regular is a discrete set.

  2. (1)

    (pure resonant case) \({\mathcal M}\ne \{0\}\) but there are no \(L^2\) functions in \({\mathcal M}\). This implies that \(\dim {\mathcal M}= 1\).

  3. (2)

    (pure eigenvalue case) \({\mathcal M}\ne \{0\}\) and \({\mathcal M}\subset L^2\). Thus 0 is an eigenvalue but there is no resonance.

  4. (3)

    (mixed case) \({\mathcal M}\ne \{0\}\) and \({\mathcal M}\) contains both \(L^2\) and non-\(L^2\) functions. Then \(\dim {\mathcal M}\ge 2\) and the set of \(L^2\) solutions has codimension 1.

Later, we’ll see that in a sense, case (1) is generic among the singular cases. We’ll see similar qualitative behavior in the three singular cases but the detailed expressions for the coefficients depend on the case.

Jensen–Kato start by noting the expansion in \(\kappa = \sqrt{z}\) when \(V=0\). Given (16.6), we see that

$$\begin{aligned} R_0(\kappa ^2) = \sum _{j=0}^{\infty } (i\kappa )^j K_j \end{aligned}$$
(16.12)

where \(K_j\) has the integral kernel

$$\begin{aligned} K_j(x,y) = |x-y|^{j-1}/4\pi j! \end{aligned}$$
(16.13)

Then, for \(j \ge 1\), \(K_j\) is bounded from \(H^{-1,s}\) to \(H^{1,-s}\) if and only if \(s > j+1/2\). That means if we fix s, we have an asymptotic series only to any order \(J < s-1/2\). Since V obeying (16.5) maps \(L^2_{-s}\) to \(L^2_{s}\) if and only if \(s < \beta /2\), we see that for fixed \(\beta \), we can only expect to get an expansion including \(\kappa ^j\) terms if \(j <\tfrac{1}{2}(\beta -1)\). This explains the conditions on \(\beta \) in the theorems below. Jensen–Kato prove, with explicit formulae for \(B^{(0)}_j,\,j=0,1\),

Theorem 16.1

(Jensen–Kato [288]) Assume that V is regular at \(\kappa =0\), \(\beta > 3\) and \(s > 3/2\). Then for explicit operators \(B^{(0)}_0 \ne 0\) and \(B^{(0)}_1\) from \(L^2_s\) to \(L^2_{-s}\) as operators between those spaces and \(\text {Im}\kappa \ge 0\)

$$\begin{aligned} R(\kappa ^2)= B^{(0)}_0 + i\kappa B^{(0)}_1+\text {o}(\kappa ) \end{aligned}$$
(16.14)

If \(\beta > 5\) and \(s > 5/2\), then \(\text {o}(\kappa )\) can be replaced by \(\text {O}(\kappa ^2)\).
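
The free expansion (16.12), (16.13) that underlies this theorem is just the Taylor series of \(e^{i\kappa r}\) divided by \(4\pi r\), with \(r=|x-y|\), and is easy to test pointwise. The sketch below compares the kernel (16.6) with its truncation at order J for real \(\kappa \) and checks the elementary remainder bound \(|\kappa |^{J+1}r^{J}/4\pi (J+1)!\). The function names are illustrative assumptions. The growth of this bound in r is a pointwise reflection of why higher order terms need larger weights s.

```python
import cmath
import math

def free_kernel(kappa, r):
    """Kernel (16.6) of the free resolvent in three dimensions, r = |x - y| > 0."""
    return cmath.exp(1j * kappa * r) / (4.0 * math.pi * r)

def truncated_series(kappa, r, J):
    """Sum over j = 0..J of (i*kappa)^j K_j, with K_j = r^(j-1) / (4*pi*j!) as in (16.13)."""
    return sum((1j * kappa) ** j * r ** (j - 1) / (4.0 * math.pi * math.factorial(j))
               for j in range(J + 1))

def remainder_bound(kappa, r, J):
    """Taylor remainder bound for real kappa: |kappa|^(J+1) r^J / (4*pi*(J+1)!)."""
    return abs(kappa) ** (J + 1) * r ** J / (4.0 * math.pi * math.factorial(J + 1))

if __name__ == "__main__":
    kappa = 0.3
    for r in (0.5, 2.0, 10.0):
        for J in (0, 1, 3):
            err = abs(free_kernel(kappa, r) - truncated_series(kappa, r, J))
            # The error stays below the bound, and the bound grows like r^J.
            print(r, J, err, remainder_bound(kappa, r, J))
```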

They also prove (with explicit formulae for \(B^{(k)}_j\)) that

Theorem 16.2

(Jensen–Kato) Assume that V is not regular at \(\kappa =0\), \(\beta > 5\) and \(s > 5/2\). Then for explicit operators \(B^{(k)}_{-2}\) and \(B^{(k)}_{-1}, \, k=1,2,3\) from \(L^2_s\) to \(L^2_{-s}\) as operators between those spaces and \(\text {Im}\kappa \ge 0\), one has that

$$\begin{aligned} R(\kappa ^2)= -\kappa ^{-2} B^{(k)}_{-2} - i\kappa ^{-1} B^{(k)}_{-1}+\text {O}(1) \end{aligned}$$
(16.15)

if the singular point is of type k. Moreover, if \(k=1\), \(B^{(1)}_{-2} = 0,\,B^{(1)}_{-1}\ne 0\) and if \(k=2, 3\), then \(B_{-2}^{(k)} \ne 0\).

Remarks

  1. 1.

    The explicit formulae have \(B^{(k)}_j\) of finite rank for \(j=-2,-1\). If \(\beta \) and s are large enough, there should be asymptotic series of any prescribed order and the coefficients are all finite rank [450, 289, Prop. 7.1].

  2. 2.

    Rauch [489] says that \(B^{(k)}_{-1} \ne 0\) for all k but Jensen–Kato have an explicit example where \(B^{(2)}_{-1} = 0\).

  3. 3.

    Using ideas from Klaus–Simon [375] (discussed further below), one can prove not only that regular V’s are generic but among the irregular V’s, type (1) is generic and among those not of type (1), type (3) is generic. For example, one can prove that for any \(\beta > 5\), if \(X_\beta =\{V \,|\, ||V||_\beta = \sup _x (1+|x|)^\beta |V(x)| < \infty \}\), then the regular V’s are a dense open set and, in the set, \(\widetilde{X}_\beta \), of not regular V’s (which is closed and so a complete metric space), the set of type (1) V’s is a dense open set. Klaus–Simon only discuss \(V \in C^\infty _0({\mathbb {R}}^3)\) but that is for simplicity and their ideas work in this broader context. These genericity results are not true for spherically symmetric V’s. In that case, the space \({\mathcal M}\), if non-zero, generically has a single angular momentum, \(\ell \), and always has a finite number of them. For each \(\ell \), the set of V’s with only that \(\ell \) is a relatively open subset of the closed subset of spherically symmetric elements of \(\widetilde{X}_\beta \), so none is generic in the singular V’s. \(\ell =0\) is type (1), \(\ell \ne 0\) is of type (2). Cases of more than one \(\ell \) are of type (3) or (2) depending only on whether one of the \(\ell \) values is 0.

Jensen–Kato also studied low energy asymptotics of the S-matrix, and, importantly for the study of asymptotics of \(e^{-itH}\), the low energy behavior of the derivative of the spectral measure

$$\begin{aligned} \frac{d}{d\lambda } P_{(-\infty ,\lambda )}(H) \equiv P'_H(\lambda ) \end{aligned}$$
(16.16)

A little thought about Stone’s formula shows that if R(z) has a limit \(R(\lambda +i0)\) uniformly for \(\lambda \in (a,b) \subset {\mathbb {R}}\), then

$$\begin{aligned} P'_H(\lambda ) = \pi ^{-1} \text {Im}\,R(\lambda +i0) \end{aligned}$$
(16.17)

where, for an operator, A, one writes \(\text {Im}\,A = (A-A^*)/2i\).

Since \(\kappa (z)\), defined by \(z=\kappa (z)^2\) with \(\text {Im}\kappa > 0\), obeys \(\kappa (\bar{z})=-\overline{\kappa (z)}\), we see by (16.17) that if

$$\begin{aligned} R(\kappa ^2) = \sum _{j=-2}^{J} (i\kappa )^jQ_j+ \text {o}(|\kappa |^J) \end{aligned}$$
(16.18)

then \(Q_j^*=Q_j\) and so, with \(L=\left[ \frac{J-1}{2}\right] \),

$$\begin{aligned} P'(\lambda )=\pi ^{-1}\sum _{\ell =-1}^{L} (-1)^\ell \sqrt{\lambda }^{2\ell +1}Q_{2\ell +1}+\text {o}(\sqrt{\lambda }^J) \end{aligned}$$
(16.19)

In particular, if (16.14) holds (with an \(\text {O}(\kappa ^2)\) term), then

$$\begin{aligned} P'(\lambda ) = \pi ^{-1} B^{(0)}_1 \lambda ^{1/2} + \text {O}(\lambda ) \end{aligned}$$
(16.20)

and if (16.15) holds, then

$$\begin{aligned} P'(\lambda ) = \pi ^{-1} B^{(k)}_{-1} \lambda ^{-1/2} + \text {O}(1) \end{aligned}$$
(16.21)

In this way Jensen–Kato control \(P'(\lambda )\) for small \(\lambda \).
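
The passage from (16.18) to (16.19) is a one-line computation. As a sketch, take \(\kappa >0\) (so that \(\lambda =\kappa ^2\) is approached from \({\mathbb {C}}_+\)) and use \(Q_j^*=Q_j\) together with (16.17):

$$\begin{aligned} \text {Im}\left[ (i\kappa )^jQ_j\right] = \kappa ^j\,\text {Im}(i^j)\,Q_j, \qquad \text {Im}(i^{2\ell }) = 0, \qquad \text {Im}(i^{2\ell +1}) = (-1)^\ell \end{aligned}$$

Thus the even powers of \(\kappa \) drop out of \(P'(\lambda )\) and the term with \(j=2\ell +1\) contributes \(\pi ^{-1}(-1)^\ell \lambda ^{\ell +1/2}Q_{2\ell +1}\).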

They also find a large \(\lambda \) result. They prove that if \(k=1,2,\ldots \) and \(s> k+1/2, \, \beta > 2k+1\), then, as maps from \(L^2_s\) to \(L^2_{-s}\), one has that

$$\begin{aligned} \left( \frac{d}{d\lambda }\right) ^kP'(\lambda ) = \text {O}(\lambda ^{-(k+1)/2}) \end{aligned}$$
(16.22)

as \(\lambda \rightarrow \infty \).

With these in hand they can estimate

$$\begin{aligned} e^{-itH}P_c(H) = \int _{0}^{\infty } e^{-it\lambda }P'(\lambda )\,d\lambda \end{aligned}$$
(16.23)

The large \(\lambda \) contribution as \(t \rightarrow \infty \) can be controlled using repeated integration by parts and the decay estimates in (16.22) on derivatives of \(P'(\lambda )\). One sees that the integral on the right side of (16.23) is dominated by the small \(\lambda \) contributions. Using the fact that the Fourier transform of \(\lambda ^{(j-1)/2} \chi _{(0,\infty )}(\lambda )\) is the distribution \((-it)^{-(j+1)/2}\) regularized at \(t=0\), one sees that

Theorem 16.3

(Jensen–Kato [288]) Let V obey (16.5) with \(\beta >3\), \(s>5/2\). Suppose that V is regular at zero energy. As a map from \(L^2_s\) to \(L^2_{-s}\), we have that as \(t \rightarrow \infty \), (16.4) is asymptotic in norm to

$$\begin{aligned} -(4\pi i)^{-1}B^{(0)}_1 t^{-3/2} + \text {o}(t^{-3/2}) \end{aligned}$$
(16.24)

Theorem 16.4

(Jensen–Kato [288]) Let V obey (16.5) with \(\beta >3\), \(s>5/2\). Suppose that V has an exceptional point of type (1) at zero energy. Then, for a suitably normalized solution \(\psi \in {\mathcal M}\), we have that as a map from \(L^2_s\) to \(L^2_{-s}\), as \(t \rightarrow \infty \), (16.4) is asymptotic in norm to

$$\begin{aligned} (\pi i)^{1/2} t^{-1/2} \langle \psi ,\cdot \rangle \psi + \text {o}(t^{-1/2}) \end{aligned}$$
(16.25)

Remark

\(\psi \) is normalized by \(\int V(x)\psi (x)\,d^3x=\sqrt{4\pi }\).
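
The powers of t in Theorems 16.3 and 16.4 can be read off from the small \(\lambda \) behavior of \(P'(\lambda )\) through an elementary Abel-regularized integral. As a sketch (the regulator \(\delta \) and the exponent a are our notation; only the power of t, not the constant, is being tracked), for \(a>-1\) and \(t>0\):

$$\begin{aligned} \int _0^\infty e^{-it\lambda }e^{-\delta \lambda }\lambda ^{a}\,d\lambda = \Gamma (a+1)\,(\delta +it)^{-(a+1)} \rightarrow \Gamma (a+1)\,(it)^{-(a+1)} \quad \text {as } \delta \downarrow 0 \end{aligned}$$

So a leading \(\lambda ^{1/2}\) term, as in (16.20), produces \(t^{-3/2}\) decay, while a leading \(\lambda ^{-1/2}\) term, as in (16.21), produces \(t^{-1/2}\) decay.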

That completes our discussion of the Jensen–Kato paper. One obvious question left open by this work is what happens when \(\nu \ne 3\). This was answered for \(\nu \ge 5\) by Jensen [286] and for \(\nu =4\) by Jensen [287] and Murata [450] (who also had results for more general elliptic operators); see also Albeverio et al. [9, 10]. The case \(\nu =2\) with \(\int V(x) \, d^2x \ne 0\) was treated by Bollé et al. [62] and the general case by Jensen–Nenciu [289]. For \(\nu =1\) with exponentially decaying potentials, the behavior was analyzed by Bollé et al. [63, 64] and, in general, by Jensen–Nenciu [289]. Ito–Jensen [275] discuss Jacobi matrices (discrete \(\nu =1\)).

For \(\nu \ge 5\), an important observation is that there are no resonances at zero energy. This is because functions \(\varphi \in {\mathcal M}\) obey

$$\begin{aligned} \varphi (x) = - c_\nu \int |x-y|^{-(\nu -2)}V(y)\varphi (y)\,d^\nu y \end{aligned}$$
(16.26)

and so are \(\text {O}(|x|^{-(\nu -2)})\) at infinity and thus are in \(L^2\) if \(\nu \ge 5\).

There is a difference between odd \(\nu \) and even \(\nu \), so we begin with \(\nu \ge 5\), odd. In that case, for there to be \(t^{-\nu /2}\) decay for \(e^{-itH_0}\) from \(L^2_s\) to \(L^2_{-s}\), we need that \(P_0'(\lambda ) \sim \lambda ^{(\nu -2)/2}\) for small \(\lambda \) (by the Fourier transform fact used before Theorem 16.3, a power \(\lambda ^{(j-1)/2}\) gives \(t^{-(j+1)/2}\), so \(j=\nu -1\)). At first sight, this seems surprising since \(R_0(\kappa ^2)\) has \(\text {O}(1)\) terms, so one might guess that it also has \(\text {O}(\kappa )\) terms. In fact, only even powers of \(\kappa \) occur until \(\kappa ^{\nu -2}\). This can be seen either by analyzing the integral kernel \(G_0(x,y;\kappa ^2)\), which is a modified Bessel function of the second kind (see [612, discussion following (6.9.35)]), as Jensen [286] does, or by looking at (15.19). [It is an interesting exercise to write \(T_\kappa \varphi \) in terms of Taylor coefficients of \(\widehat{\varphi }\) at \(k=0\) and so recover the kernels \(K_j\) of (16.13) for j odd.]

If 0 is not an eigenvalue of H, it is easy to prove that as maps from \(L^2_s\) to \(L^2_{-s}\), for suitable s and \(\beta \), one has an asymptotic series for \(R(\kappa ^2)\) whose first odd term is \((i\kappa )^{\nu -2}\) and then that \(e^{-itH}P_c(H)\) as a map between suitable \(L^2_r\) spaces is \(\text {O}(t^{-\nu /2})\). If zero is an eigenvalue and \(\beta \) and s are large enough, one can have any of \(\text {O}(t^{-\tfrac{1}{2}\nu + 2})\), \(\text {O}(t^{-\tfrac{1}{2}\nu + 1})\) or \(\text {O}(t^{-\tfrac{1}{2}\nu })\) and all three possibilities can occur.

Looking at the odd \(\nu \) situation, it seems surprising that one can have \(\text {O}(t^{-m})\) for \(m \in {\mathbb {Z}}\) but it happens when \(\nu \) is even for the free case. In fact, if there were an asymptotic series in powers of \(\kappa \), the imaginary part cannot have even powers of \(\kappa \) as we’ve seen. The point is that in even dimensions the Bessel functions have log terms and for \(m \in {\mathbb {Z}}\), we have that \(\text {Im}\left[ \lambda ^m\log (-\lambda +i0)\right] =\pi \lambda ^m\). Because of this, all the above odd \(\nu \ge 5\) results extend to even \(\nu \ge 6\).

For \(\nu =4\), there can be a resonance and/or bound state as when \(\nu =3\) so there are three types of singular points. In the regular case, the leading term is \(\text {O}(t^{-2})\), but when \(\beta \) and s are sufficiently large, the next term is \(\text {O}(t^{-3}\log (t))\) [unlike \(\nu =3\) where the term after \(\text {O}(t^{-3/2})\) is \(\text {O}(t^{-5/2})\)]. If there is a singular point with only bound states, the leading term is \(\text {O}(t^{-1})\) but when there are resonances there is only a bound by \(\text {O}(1/\log t)\).

Jensen–Nenciu [289] analyze \(\nu =1,2\) with a new method that also works in general dimension. These dimensions are special in that there is a zero energy resonance for \(H_0=-\Delta \)—this is especially clear in the coupling constant threshold point of view discussed soon. For \(\nu =1\), if \(\int _{-\infty }^{\infty } |x|\,|V(x)|\,dx < \infty \), it is known that every non-zero solution of \(-\varphi ''+V\varphi =0\) is either asymptotic to \(a_{\pm } x+\text {o}(x)\) as \(x \rightarrow {\pm } \infty \) with \(a_{\pm } \ne 0\) or is asymptotic to \(b_{\pm } + \text{ o }(1)\) with \(b_{\pm } \ne 0\) (in which case we say that \(a_{\pm }=0\)). Thus, 0 is never an eigenvalue and is a resonance if and only if there is \(\varphi \) with \(a_+=a_-=0\). For suitable s and \(\beta \) in the right norm \(e^{-itH}\) is \(\text {O}(t^{-3/2})\) in the regular case, while in the resonance case, one can have \(\text {O}(t^{-1/2})\) behavior. \(\nu =2\) is very involved. The resonant subspace can be of dimension up to 3 and the small \(\kappa \) expansion is jointly in \(\kappa \) and \(\log (\kappa )\).

Next, we want to mention the connection between resonances and coupling constant behavior. Simon [585] considered \(A+\xi B\) for general self-adjoint operators, A and B, where \(A \ge 0\), \(|B|^{1/2}(A+1)^{-1/2}\) is compact and \(0 \in \sigma _{ess}(A)\) so that \(N(\xi ) \equiv \dim {\text {ran}}\,P_{(-\infty ,0)}(A+\xi B) < \infty \) for all \(\xi \in (0,\Xi )\). Then N is increasing and there is a discrete set \(0 \le \xi _1 \le \xi _2 \le \cdots \) so that \(N(\xi ) \ge j \iff \xi > \xi _j\). That is, the \(\xi _j\) are coupling constant thresholds, where, depending on whether you think of \(\xi \) as increasing or decreasing, new eigenvalues are born out of 0 or old ones are absorbed. Simon proves that

$$\begin{aligned} \lim _{\xi \downarrow \xi _j} -\frac{E_j(\xi )}{\xi -\xi _j} \end{aligned}$$

always exists and is non-zero if and only if 0 is an eigenvalue of \(A+\xi _j B\) (with a more complicated statement if \(\xi _k = \xi _j\) for some \(k \ne j\)).

This links up to the Jensen–Kato work in that the \(\xi \)’s where \({\mathcal M}(H_0+\xi V) \ne \{0\}\) are exactly the coupling constant thresholds. If there are eigenvalues \(E_j(\xi )\) for \(\xi >\xi _j\) with \(E_j(\xi _j)=0\) and \(E_j(\xi ) \le -c(\xi -\xi _j);\, c>0\), then \(H_0+\xi _j V\) has a zero eigenvalue. If instead \(E_j = \text {o}(\xi -\xi _j)\), then there is a resonance. For Schrödinger operators, this was explored by Rauch [490] and by Klaus–Simon [375]. In particular, Klaus–Simon show that, in the resonance case and for sufficiently large \(\beta \), \(-E_j(\xi ) = \text {O}((\xi -\xi _j)^2)\) and, in that case, if V has compact support, \(E_j(\xi )\) is analytic at \(\xi =\xi _j\). In the bound state case, they prove that \(E_j(\xi )\) is not analytic at \(\xi _j\) (as we’ll discuss below, typically, \(E_j\) has a non-zero imaginary part for real \(\xi < \xi _j\)). These ideas also explain why, if \(\nu =1\) or \(\nu =2\), \(H_0\) has a resonance at zero energy, since it is known (Simon [581]) that if V obeys (16.5) for \(\nu =1,2\) and \(\beta > 3\) and \(\int V(x) \, d^\nu x \le 0\), then for all \(\xi >0\), \(H_0+\xi V\) has a bound state.

Simon [597, 598] discusses large time behavior of the \(L^\infty \) to \(L^\infty \) norm of \(e^{-tH}\) (note \(-t\), not \(-it\)) when there is and when there is not a zero energy resonance.

If there is a zero energy eigenvalue at a threshold \(\xi _j\), then it turns into a negative eigenvalue for \(\xi > \xi _j\). If \(\xi < \xi _j\), on the basis of the discussion in Sect. 4 in Part 1, one expects that this half-embedded eigenvalue turns into a resonance (in the sense discussed in that Section, not the notion earlier in this section). Its imaginary part is not \(\text {O}((\xi -\xi _j)^{2})\) as it is in the normal Fermi golden rule situation discussed in Sect. 4; rather, as shown in Jensen–Nenciu [290], one typically has that it is \(\text {O}(|\xi -\xi _j|^{3/2})\). For related results, see Dinu et al. [118, 119].

Jensen–Kato discussed dispersive decay in terms of \(L^2_s\) spaces but there has been considerable interest in \(L^p\) estimates, where for \(-\Delta +V\) on \(L^2({\mathbb {R}}^\nu )\), one hopes, based on the case \(V=0\), that for \(1 \le p \le 2\)

$$\begin{aligned} ||e^{-itH}P_c(H)\varphi ||_{L^{p'}({\mathbb {R}}^\nu )} \le C |t|^{-\nu \left( \frac{1}{p}-\frac{1}{2}\right) } ||\varphi ||_{L^p({\mathbb {R}}^\nu )} \end{aligned}$$
(16.27)

where \(p'=p/(p-1)\) is the dual index to p. \(L^p\) norms are translation invariant, making (16.27) much more suitable for use in the theory of non-linear evolution equations, so there is a large literature on such estimates.

The first estimates of the type (16.27) were found by Schonbek [542] in 1979 who considered \(\nu =3,p=1\) and V small. The first general result for \(\nu \ge 3\) and V so that H has neither an eigenvalue nor resonance at zero energy was in a classic paper of Journé et al. [294] (see also Schonbek–Zhou [543]).

An interesting approach to (16.27) is due to Yajima [705,706,707,708] who asked about when the wave operators are bounded from \(L^p\) to \(L^p\). You might think that this has nothing to do with (16.27) but since \((\Omega ^{\pm })^*\) are then bounded from \(L^{p'}\) to \(L^{p'}\) and \(e^{-itH}P_{ac}(H) = (\Omega ^{\pm })^*e^{-itH_0}(\Omega ^{\pm })\), \(L^p\) estimates on \(\Omega ^{\pm }\) and (16.27) for \(H_0\) imply it for H.

There is a considerable literature on \(L^p\) dispersive estimates when 0 is an eigenvalue or resonance. We refer the reader to Yajima [709] which includes many references.

Finally, we note that Fournais–Skibsted [158] and Skibsted–Wang [620] have results on low energy behavior of the resolvent of \(-\Delta +V\) when asymptotically \(V(x) \sim -c|x|^{-\beta }\) with \(c > 0\) and \(\beta \le 2\); [158] also discusses long time asymptotics of \(e^{-itH}\).

7 The Adiabatic theorem

In 1950, Kato published a paper in a physics journal (denoted as based on a presentation in 1948) on the quantum adiabatic theorem. It is his only paper on the subject but has strongly impacted virtually all the huge literature on the subject and related subjects ever since (there are more Google Scholar citations of this paper than of [314]). We will begin by describing his theorem and its proof which introduced what he called adiabatic dynamics and I’ll call the Kato dynamics. We’ll see that the Kato dynamics defines a notion of parallel transport on the natural vector bundle over the manifold of all k-dimensional subspaces of a Hilbert space, \({\mathcal H}\), and so a connection. This connection is called the Berry connection and its holonomy is the Berry phase (when \(k=1\)). All this Berry stuff was certainly not even hinted at in Kato’s work but it is implicit in the framework. Then I’ll say something about the history before Kato and finally a few brief words about some of the other later developments.

To start, we need a basic result about linear ODEs on Banach spaces:

Proposition 17.1

Let X be a Banach space and \(\{{\mathcal M}_t\}_{0 \le t \le T}\) a family of norm continuous (in t) linear maps on X.

  1. (a)

    For each \(x_0 \in X\), there is a function \(t \mapsto x(t;x_0);\, 0 \le t \le T\) which is \(C^1\) in t which is the unique solution of

    $$\begin{aligned} \frac{d}{dt}x(t) = {\mathcal M}_t(x(t)); \qquad x(0) = x_0 \end{aligned}$$
    (17.1)

    Moreover, for each t, the map \(W(t):x_0 \mapsto x(t;x_0)\) is a bounded linear map on X and \(t \mapsto W(t)\) is \(C^1\) and is the unique solution of (17.1) when \({\mathcal M}_t\) acts on the bounded operators, \({\mathcal L}(X)\), by left operator multiplication, with initial condition \(W(0) = {\varvec{1}}\).

  2. (b)

    Let \({\mathcal H}\) be a (separable, complex) Hilbert space and take either \(X={\mathcal H}\) or \(X={\mathcal L}({\mathcal H})\) and suppose that

    $$\begin{aligned} {\mathcal M}_t(x) = iA(t)x \end{aligned}$$
    (17.2)

    where A(t) is a norm continuous map to the bounded self-adjoint operators on \({\mathcal H}\). Then there is a \(C^1\) family of unitary maps, U(t), with \(U(0)={\varvec{1}}\) so the solution of (17.1) is

    $$\begin{aligned} t \mapsto U(t)x_0 \end{aligned}$$
    (17.3)

Remarks

  1. 1.

    In (17.2), A(t)x is either interpreted as applying A(t) to a vector \(x \in {\mathcal H}\) or as left multiplication if \(x \in {\mathcal L}({\mathcal H})\).

  2. 2.

    The U(t) in (17.3) depend only on \(\{A(s)\}_{0 \le s \le T}\) (indeed only on \(s \le t\)) and not on \(x_0\).

  3. 3.

    The proof is elementary. For (a), one shows that the differential equation with initial condition (17.1) is equivalent to the integral equation

    $$\begin{aligned} x(t) = x_0 + \int _{0}^{t} {\mathcal M}_s(x(s)) \,ds \end{aligned}$$
    (17.4)

    on C([0, T]; X), the X-valued norm continuous functions on [0, T]. One then either uses a contraction mapping theorem (if necessary shrinking T to get a contraction and piecing together unique solutions on several intervals) or else one iterates the integral equation, proving that the nth new term in the iteration is bounded by \(T^n \left[ \sup _{0 \le t \le T} ||{\mathcal M}_t||\right] ^n ||x_0||/n!\), so that the iteration converges to an absolutely convergent sum.

  4. 4.

    For (b), one sees that if U(t) solves the equation on \({\mathcal L}({\mathcal H})\) for \(x_0 = {\varvec{1}}\), then \(U(t)x_0\) solves the equation in general. Moreover, by a simple calculation

    $$\begin{aligned} \frac{d}{dt} U^*(t)U(t) = 0; \qquad \frac{d}{dt} U(t)U^*(t) = i [A, U(t)U^*(t)] \end{aligned}$$
    (17.5)

The first equation and \(U(0)={\varvec{1}}\) implies immediately that \(U^*(t)U(t) = {\varvec{1}}\). The second equation with initial condition \(U(0)U^*(0)={\varvec{1}}\) is clearly solved by \(U(t)U^*(t)={\varvec{1}}\) so by uniqueness of solutions we see that \(U(t)U^*(t)={\varvec{1}}\). Thus U(t) is unitary.

The adiabatic theorem considers a family of time dependent Hamiltonians, \(H(s),\, 0 \le s \le 1\) and imagines changing them slowly, i.e. looking at \(H(s/T),\, 0 \le s \le T\) for T very large. Thus, we look for \(\tilde{U}_T(s)\) solving

$$\begin{aligned} \frac{d}{ds} \tilde{U}_T(s) = -iH(s/T)\tilde{U}_T(s),\,\, 0 \le s \le T; \qquad \tilde{U}_T(0) = {\varvec{1}}\end{aligned}$$
(17.6)

Letting \(U_T(s) = \tilde{U}_T(sT), \, 0 \le s \le 1\), we see that \(U_T(s),\,0\le s \le 1\) solves

$$\begin{aligned} \frac{d}{ds} U_T(s) = -iT H(s) U_T(s),\,\, 0 \le s \le 1; \qquad U_T(0) = {\varvec{1}}\end{aligned}$$
(17.7)

Here is Kato’s adiabatic theorem

Theorem 17.2

(Kato [313]) Let H(s) be a \(C^2\) family of bounded self-adjoint operators on a (complex, separable) Hilbert space, \({\mathcal H}\). Suppose there is a \(C^2\) function, \(\lambda (s)\), so that for all s, \(\lambda (s)\) is an isolated point in the spectrum of H(s) and so that

$$\begin{aligned} \alpha \equiv \inf _{0 \le s \le 1} {\text {dist}}(\lambda (s), \sigma (H(s)){\setminus }\{\lambda (s)\}) > 0 \end{aligned}$$
(17.8)

Let P(s) be the projection onto the eigenspace for \(\lambda (s)\) as an eigenvalue of H(s). Then

$$\begin{aligned} \lim _{T \rightarrow \infty } (1-P(s))U_T(s)P(0) = 0 \end{aligned}$$
(17.9)

uniformly in s in [0, 1].

Remarks

  1. 1.

    Thus if \(\varphi _0 \in {\text {ran}}\, P(0)\), this says that when T is large, \(U_T(s)\varphi _0\) is close to lying in \({\text {ran}}\, P(s)\). That is as \(T \rightarrow \infty \), the solution gets very close to the “curve” \(\{{\text {ran}}\, P(s)\}_{0 \le s \le 1}\).

  2. 2.

    If there is an eigenvalue of constant multiplicity near \(\lambda (0)\) for s small, it follows from (2.1) that P(s) and \(\lambda (s)\) are \(C^2\).

  3. 3.

    It is easy to see that \(\dim {\text {ran}}\, P(s)\) is constant. It can even be infinite dimensional.

  4. 4.

    This result is even interesting if \(\dim {\text {ran}}\, P(s)\) is 1 and/or \(\dim {\mathcal H}< \infty \).

  5. 5.

    Kato made no explicit assumptions on regularity in s saying “Our proof given below is rather formal and not faultless from the mathematical point of view. Of course it is possible to retain mathematical rigour by detailed argument based on clearly defined assumptions, but it would take us too far into unnecessary complication and obscure the essentials of the problem”. It is hard to imagine the Kato of 1960 using such language! In any event, the proof requires that P(s) be \(C^2\).

  6. 6.

    We’ll discuss history more later but Kato notes that his work has two advantages over the earlier work of Born–Fock [68]: (1) They assume complete sets of eigenvectors and do not allow continuous spectrum. (2) They assume that \(\lambda (s)\) is simple, i.e. \(\dim {\text {ran}}\, P(s) = 1\) while Kato can handle degenerate eigenvalues.

  7. 7.

    As we’ll see, the size estimate for (17.9) is \(\text {O}(1/T)\).

Kato’s wonderful realization is that there is an explicit dynamics, W(s) for which (17.9) is exact, i.e.

$$\begin{aligned} (1-P(s))W(s)P(0) = 0 \end{aligned}$$
(17.10)

He not only constructs it but proves the theorem by showing that (this formula only holds in case \(\lambda (s) \equiv 0\))

$$\begin{aligned} \lim _{T \rightarrow \infty } [U_T(s)-W(s)]P(0) = 0 \end{aligned}$$
(17.11)

The W(s) that Kato constructs, he called the adiabatic dynamics. It is sometimes called Kato’s adiabatic dynamics. We call it the Kato dynamics. Here is the basic result:

Theorem 17.3

(Kato dynamics [313]) Let W(s) solve

$$\begin{aligned} \frac{d}{ds}W(s)= & {} iA(s)W(s), \quad 0 \le s \le 1; \qquad W(0)={\varvec{1}}\end{aligned}$$
(17.12)
$$\begin{aligned} iA(s)\equiv & {} [P'(s),P(s)] \end{aligned}$$
(17.13)

Then W(s) is unitary and obeys

$$\begin{aligned} W(s)P(0)W(s)^{-1} = P(s) \end{aligned}$$
(17.14)

Proof

That W(s) is unitary follows from Proposition 17.1. Note that since \(P(s)^2=P(s)\) we have that

$$\begin{aligned} P'(s) = P'(s)P(s)+P(s)P'(s) \Rightarrow P(s)P'(s)P(s) = 0 \end{aligned}$$
(17.15)

since the first equation and \(P^2=P\) imply that \(PP'P=2PP'P\). Expanding the commutator defining A(s) and using \(PP'P=0\) yields

$$\begin{aligned} iP(s)A(s)= & {} -P(s)P'(s) \end{aligned}$$
(17.16)
$$\begin{aligned} iA(s)P(s)= & {} P'(s)P(s) \end{aligned}$$
(17.17)

so by the first equation in (17.15), we have that

$$\begin{aligned} P'(s) = i[A(s),P(s)] \end{aligned}$$
(17.18)

By (17.12)

$$\begin{aligned} (P(s)W(s))'&= (P'(s)+iP(s)A(s))W(s) \end{aligned}$$
(17.19)
$$\begin{aligned}&= iA(s)P(s)W(s) \end{aligned}$$
(17.20)

by (17.18). Taking adjoints,

$$\begin{aligned} (W(s)^{-1}P(s))' = -iW(s)^{-1}P(s)A(s) \end{aligned}$$
(17.21)

Since \(W(s)^{-1}P(s)W(s) = (W(s)^{-1}P(s))(P(s)W(s))\), we see that

$$\begin{aligned} (W(s)^{-1}P(s)W(s))'&= iW(s)^{-1}P(s)A(s)P(s)W(s) \nonumber \\&\qquad - iW(s)^{-1}P(s)A(s)P(s)W(s) = 0 \end{aligned}$$
(17.22)

At \(s=0\), this is P(0) so

$$\begin{aligned} W(s)^{-1}P(s)W(s) = P(0) \end{aligned}$$
(17.23)

which is equivalent to (17.14). \(\square \)
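As a sanity check on Theorem 17.3, the following Python sketch integrates (17.12) numerically for a rank one family of projections on \({\mathbb {R}}^2\) and compares \(W(1)P(0)W(1)^{-1}\) with \(P(1)\). The family \(P(s)\), its angle function, the finite difference step and the Runge-Kutta integrator are all illustrative assumptions and are not taken from Kato's paper.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(a, X, b, Y):
    """Entrywise linear combination a*X + b*Y."""
    return [[a * x + b * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def transpose(X):
    return [list(col) for col in zip(*X)]

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def P(s):
    """Projection onto the line spanned by (cos phi, sin phi).

    The angle phi(s) = pi*s^2/2 is a made-up smooth function for illustration.
    """
    phi = 0.5 * math.pi * s * s
    v = (math.cos(phi), math.sin(phi))
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]

def generator(s, h=1e-5):
    """iA(s) = [P'(s), P(s)] as in (17.13), with P' from a central difference."""
    dP = lin(1 / (2 * h), P(s + h), -1 / (2 * h), P(s - h))
    return lin(1.0, matmul(dP, P(s)), -1.0, matmul(P(s), dP))

def kato_dynamics(s_end, steps=400):
    """Classical fourth-order Runge-Kutta for W' = [P', P] W with W(0) = 1."""
    W = [[1.0, 0.0], [0.0, 1.0]]
    h = s_end / steps
    for k in range(steps):
        s = k * h
        k1 = matmul(generator(s), W)
        k2 = matmul(generator(s + h / 2), lin(1.0, W, h / 2, k1))
        k3 = matmul(generator(s + h / 2), lin(1.0, W, h / 2, k2))
        k4 = matmul(generator(s + h), lin(1.0, W, h, k3))
        incr = lin(1.0, lin(1.0, k1, 2.0, k2), 1.0, lin(2.0, k3, 1.0, k4))
        W = lin(1.0, W, h / 6, incr)
    return W

W = kato_dynamics(1.0)
identity = [[1.0, 0.0], [0.0, 1.0]]
# Everything here is real, so the inverse of a unitary W is its transpose.
unitarity_error = max_abs_diff(matmul(transpose(W), W), identity)
intertwining_error = max_abs_diff(matmul(matmul(W, P(0.0)), transpose(W)), P(1.0))
print(unitarity_error, intertwining_error)
```

Both printed numbers should be at the level of the discretization error. In this two dimensional real example the generator is a multiple of the rotation generator, so \(W(s)\) is simply the rotation carrying \({\text {ran}}\, P(0)\) to \({\text {ran}}\, P(s)\).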

Proof of Theorem 17.2

By replacing H(s) by \(H(s) - \lambda (s){\varvec{1}}\), we can suppose that \(\lambda (s) \equiv 0\) (doing this changes some formulae, particularly the critical (17.25)—we’ll address this after the proof). We will prove that

$$\begin{aligned} ||U_T(s)^*W(s)P(0)-P(0)|| = \text {O}(1/T) \end{aligned}$$
(17.24)

Since \(U_T\) is unitary, this implies that

$$\begin{aligned} ||W(s)P(0)-U_T(s)P(0)|| = \text {O}(1/T) \end{aligned}$$
(17.25)

Since \((1-P(s))W(s)P(0) = (1-P(s))P(s)W(s) = 0\), this implies (17.9) with an explicit \(\text {O}(1/T)\) error estimate.

Thus we define

$$\begin{aligned} G(s)=U_T^*(s)W(s)P(0) \end{aligned}$$
(17.26)

and compute

$$\begin{aligned} G'(s) = (U_T^*(s))'W(s)P(0) + U_T^*(s)W'(s)P(0) \end{aligned}$$
(17.27)

Applying \(^*\) to (17.7) implies that

$$\begin{aligned} (U_T^*(s))' = iTU_T^*(s)H(s) \end{aligned}$$
(17.28)

so, using (17.14), the first term in (17.27) is

$$\begin{aligned} iTU_T^*(s)H(s)W(s)P(0) = iTU_T^*(s)H(s)P(s)W(s) = 0 \end{aligned}$$
(17.29)

since \(\lambda (s) \equiv 0 \Rightarrow H(s)P(s) =0\). This is useful because it says that a potential \(\text {O}(T)\) term is zero!

Next note that since \(PP'P=0\) we have that \(PAP=0\) and thus

$$\begin{aligned} P(s)W'(s)P(0)&= iP(s)A(s)W(s)P(0) \nonumber \\&= iP(s)A(s)P(s)W(s) \nonumber \\&= 0 \end{aligned}$$
(17.30)

If now S(s) is the reduced resolvent of H(s) (see (2.8)) \({S(s) \equiv (1-P(s))H(s)^{-1}}\), then on account of (17.30), we have that

$$\begin{aligned} W'(s)P(0) = (1-P(s))W'(s)P(0)=H(s)S(s)W'(s)P(0) \end{aligned}$$
(17.31)

so, by (17.27) and (17.29)

$$\begin{aligned} G'(s)&= U_T^*(s)H(s)S(s)W'(s)P(0) \end{aligned}$$
(17.32)
$$\begin{aligned}&= (iT)^{-1}[U_T^*(s)]'S(s)W'(s)P(0) \end{aligned}$$
(17.33)

by (17.28). Thus

$$\begin{aligned} G(s)-P(0) = (iT)^{-1} \int _{0}^{s}[U_T^*(w)]'S(w)W'(w)P(0) \,dw \end{aligned}$$
(17.34)

As we’ve seen, \(U_T'\) is \(\text {O}(T)\) but we can integrate by parts. Since \(U_T(w)\) has norm one and S(w) and \(W'(w)\) are bounded, the boundary terms in the integration by parts are \(\text {O}(1/T)\). Since we assumed that P(s) is \(C^2\), one has that \(S'(s)\) and \(W''(s)\) are bounded so the integrand after integration by parts is bounded and, given the prefactor \((iT)^{-1}\), the integral term is also \(\text {O}(1/T)\). We have thus proven that \({||G(s)-P(0)|| = \text {O}(1/T)}\), i.e. (17.24) holds.
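The \(\text {O}(1/T)\) rate in Theorem 17.2 can be observed in a two level toy model. The sketch below is only an illustration under stated assumptions: it takes \(H(s) = {\varvec{1}}-P(s)\) with \(P(s)\) a rank one real projection whose range rotates at constant speed (so \(\lambda (s) \equiv 0\) and the gap in (17.8) is 1), and it solves (17.7) with an exponential midpoint rule. The names `OMEGA`, `eigvec` and `max_leakage` are hypothetical.

```python
import cmath
import math

OMEGA = math.pi / 2  # total angle swept by P(s) on [0, 1]; an illustrative choice

def eigvec(s):
    """Real unit vector spanning ran P(s); H(s) = 1 - P(s) vanishes on it."""
    half = OMEGA * s / 2
    return (math.cos(half), math.sin(half))

def max_leakage(T, steps_per_unit_T=100):
    """Approximate sup over s of ||(1 - P(s)) U_T(s) P(0)|| for this model.

    Each step applies exp(-i T h H(s_mid)) exactly, using that for a
    projection P one has exp(-i a (1 - P)) = e^{-ia} 1 + (1 - e^{-ia}) P.
    """
    n = int(steps_per_unit_T * T)
    h = 1.0 / n
    psi = (1.0 + 0j, 0j)            # phi_0 spans ran P(0)
    phase = cmath.exp(-1j * T * h)
    worst = 0.0
    for k in range(n):
        v = eigvec((k + 0.5) * h)   # midpoint of the step
        overlap = v[0] * psi[0] + v[1] * psi[1]
        psi = tuple(phase * psi[j] + (1 - phase) * overlap * v[j]
                    for j in range(2))
        w = eigvec((k + 1) * h)
        # (1 - P(s)) projects onto the unit vector (-w[1], w[0]).
        leak = abs(-w[1] * psi[0] + w[0] * psi[1])
        worst = max(worst, leak)
    return worst

for T in (25, 50, 100):
    leak = max_leakage(T)
    print(T, leak, T * leak)
```

If the adiabatic estimate is sharp for this model, the last column (T times the leakage) should stay roughly constant as T grows, while the leakage itself roughly halves each time T doubles.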

This completes our discussion of what was in this influential paper of Kato. Kato left at least two important items “on the table”. One is the possibility of better estimates than \(\text {O}(1/T)\). We discuss this further below.

The other item concerns the fact that (17.25) says a lot more than (17.9). (17.9) says that as \(T \rightarrow \infty \), \(U_T(s)\) maps \({\text {ran}}\, P(0)\) to \({\text {ran}}\, P(s)\). (17.25) actually tells you what the precise limiting map is! One should note that if \(\lambda (s)\) is not identically zero, the proper form of (17.25) is

$$\begin{aligned} ||U_T(s)P(0)-e^{-iT\int _{0}^{s} \lambda (w)\,dw} W(s)P(0)|| = \text {O}(1/T) \end{aligned}$$
(17.35)

One fancy pants way of describing this is as follows. Fix \(k \ge 1\) in \({\mathbb {Z}}\). Let \({\mathcal M}\) be the manifold of all k-dimensional subspaces of some Hilbert space, \({\mathcal H}\). We want \(\dim ({\mathcal H}) \ge k\), but it could be finite. Or \({\mathcal M}\) might be a smooth submanifold of the set of all such subspaces. For each \(\omega \in {\mathcal M}\), we have the projection \(P(\omega )\). There is a natural vector bundle of k-dimensional spaces over \({\mathcal M}\), namely, we associate to \(\omega \in {\mathcal M}\), the space \({\text {ran}}\, P(\omega )\). If \(k=1\), we get a complex line bundle.

The Kato dynamics, W(s), tells you how to “parallel transport” a vector \(v \in {\text {ran}}\, P(\gamma (0))\) along a curve \(\gamma (s);\,0 \le s \le 1\) in \({\mathcal M}\). In the language of differential geometry, it defines a connection and such a connection has a holonomy and a curvature. In less fancy terms, consider the case \(k=1\). Suppose \(\gamma \) is a closed curve. Then W(1) is a unitary map of \({\text {ran}}\, P(0)\) to itself, so it is multiplication by \(e^{i\Gamma _B(\gamma )}\). Returning to \(U_T\), it says that the phase change over a closed curve isn’t what one might naively expect, namely \(\exp (-i\int _{0}^{T} \lambda (s/T)\,ds) = \exp (-iT\int _{0}^{1} \lambda (s)\,ds)\). There is an additional term, \(\exp (i\Gamma _B)\). This is the Berry phase discovered by Berry [53] in 1983 (it was discovered in 1956 by Pancharatnam [474] but then forgotten). Simon [602] realized that this was just the holonomy of a natural bundle connection and that, moreover, this bundle and connection are precisely the ones whose Chern integers are the TKN\(^2\) integers of Thouless et al. [646] (as discussed by Avron et al. [30]). Thouless got a recent physics Nobel prize in part for the discovery of the TKN\(^2\) integers. The holonomy, i.e. Berry’s phase, is an integral of the Kato connection [P, dP]. As usual, this line integral over a closed curve is the integral of its differential [dP, dP] over a bounding surface. This quantity is the curvature of the bundle and has come to be called the Berry curvature (even though Berry did not use the differential geometric language). Naively [dP, dP] would seem to be zero but it is shorthand for the two-form

$$\begin{aligned} \sum _{i \ne j}\left[ \frac{\partial P}{\partial s_i},\frac{\partial P}{\partial s_j}\right] ds_i\wedge ds_j \end{aligned}$$
(17.36)

This formula of Avron et al. [30] for the Berry curvature is a direct descendant of formulae in Kato’s paper, although, of course, he did not consider the questions that lead to Berry’s phase.
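To make the holonomy concrete in the case \(k=1\), here is a small Python sketch that transports a vector with the Kato dynamics around a closed curve of rank one projections on \({\mathbb {C}}^2\) and reads off the resulting phase. The particular loop (a circle of latitude on the sphere of spin directions, with polar angle `THETA`), the finite difference derivative and the Runge-Kutta stepping are illustrative assumptions. For this loop and orientation, the standard parallel transport computation gives minus half the enclosed solid angle, which the code uses as a comparison value.

```python
import cmath
import math

THETA = math.pi / 3  # polar angle of the loop; an illustrative choice

def P(s):
    """Projection onto v(s) = (cos(THETA/2), e^{2 pi i s} sin(THETA/2))."""
    v = (complex(math.cos(THETA / 2)),
         cmath.exp(2j * math.pi * s) * math.sin(THETA / 2))
    return [[v[i] * v[j].conjugate() for j in range(2)] for i in range(2)]

def matvec(M, x):
    return [M[0][0] * x[0] + M[0][1] * x[1], M[1][0] * x[0] + M[1][1] * x[1]]

def apply_generator(s, psi, h=1e-5):
    """Return [P'(s), P(s)] psi, with P' from a central difference."""
    Pp, Pm, P0 = P(s + h), P(s - h), P(s)
    dP = [[(Pp[i][j] - Pm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
    a = matvec(dP, matvec(P0, psi))
    b = matvec(P0, matvec(dP, psi))
    return [a[0] - b[0], a[1] - b[1]]

def transport(steps=2000):
    """Runge-Kutta for psi' = [P', P] psi around the closed loop 0 <= s <= 1."""
    psi = [complex(math.cos(THETA / 2)), complex(math.sin(THETA / 2))]
    h = 1.0 / steps
    for k in range(steps):
        s = k * h
        k1 = apply_generator(s, psi)
        k2 = apply_generator(s + h / 2, [psi[j] + h / 2 * k1[j] for j in range(2)])
        k3 = apply_generator(s + h / 2, [psi[j] + h / 2 * k2[j] for j in range(2)])
        k4 = apply_generator(s + h, [psi[j] + h * k3[j] for j in range(2)])
        psi = [psi[j] + h / 6 * (k1[j] + 2 * k2[j] + 2 * k3[j] + k4[j])
               for j in range(2)]
    return psi

v0 = [complex(math.cos(THETA / 2)), complex(math.sin(THETA / 2))]
psi1 = transport()
# <v0, W(1) v0>: the number by which W(1) multiplies vectors in ran P(0).
holonomy = v0[0].conjugate() * psi1[0] + v0[1].conjugate() * psi1[1]
berry_phase = cmath.phase(holonomy)
half_solid_angle = math.pi * (1 - math.cos(THETA))
print(berry_phase, -half_solid_angle)
```

The two printed numbers should agree up to discretization error. The sign depends on the orientation of the loop and on which eigenvector is followed.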

Now, a short excursion into the history of adiabatic theorems. “Adiabatic” first entered into physics as a term in thermodynamics meaning a process with no heat exchange. In 1916, Ehrenfest [136] discussed the “adiabatic principle” in classical mechanics. The basic example is the realization (earlier than Ehrenfest) that while the energy of a harmonic oscillator is not conserved under time dependent change of the underlying parameters, the action (energy divided by frequency) is fixed in the limit that the parameters are slowly changed (the reader should figure out what Kato’s adiabatic theorem says about a harmonic oscillator with slowly varying frequency). See Henrard [238] for discussion of applications of the classical adiabatic invariant. Interestingly enough, many adiabatic processes in the thermodynamic sense are quite rapid, so the Ehrenfest use has, at best, a very weak connection to the initial meaning of the term!

Ehrenfest used these ideas by asserting that in old quantum theory, the natural quantum numbers were precisely these adiabatic invariants. Once new quantum mechanics was discovered, Born and Fock [68] in 1928 discussed what they called the quantum adiabatic theorem, essentially Theorem 17.2 for simple eigenvalues with a complete set of (normalizable) eigenfunctions. It was 20 years before Kato found his wonderful extension (and then more than 30 years before Berry made the next breakthrough).

Next, we turn to error estimates. The error on the right side of (17.34) is a sum of two terms after an integration by parts: the boundary term and an integral. For the integral, one can reuse (17.29) as we did to get (17.34) and see that the integral is \(\text {O}(1/T^2)\). The boundary term is \(\text {O}(1/T)\) but the coefficients will vanish if \(P(s)-P(0)\) and \(P(t)-P(s)\) vanish sufficiently fast as \(s \downarrow 0\) and \(s \uparrow t\). The natural setup is to take \(s \in (-\infty ,\infty )\) rather than [0, 1] and to require that \(H({\pm } \infty ) = \lim _{s \rightarrow {\pm }\infty } H(s)\) exist with approach \(\text {O}(1/|s|^k)\) for all k. If one does this, one gets an adiabatic theorem with \(\text {O}(1/T^k)\) errors for all k. Under suitable analyticity conditions on H(s), one can even prove exponential approach, see [464] for an early paper on this subject and [137, 278, 279, 296, 465] for additional discussion. In particular, Joye–Pfister [296] uses arguments very close to Kato’s.

The occurrence of the reduced resolvent, S, in Kato’s approach suggests that an eigenvalue gap is an important ingredient. Nevertheless, there are results on adiabatic theorems without gaps, see Avron et al. [29] and Hagedorn [215] for some special situations and Avron–Elgart [22] for a very general result. Teufel [641] has an alternate proof for this Avron–Elgart result and he has a book [642] on the subject. Avron et al. [23] and Joye [295] have Banach space versions.

For other approaches to adiabatic evolution, see Jansen et al. [280] and Hastings–Wen [224]. For some applications, see Avron et al. [32], Klein–Seiler [376] and Bachmann et al. [34].

8 Kato’s ultimate Trotter product formula

We begin this section by describing what is called the Lie product formula. Let A and B be two finite matrices acting on \({\mathbb {C}}^n\). Fix \(T > 0\) and for \(0 \le s \le T\), define

$$\begin{aligned} g(s) = e^{s(A+B)}-e^{sA}e^{sB} \end{aligned}$$
(18.1)

Then \(g(0)=g'(0)=0\) so, by Taylor’s theorem with remainder

$$\begin{aligned} ||g(s)|| \le Cs^2; \qquad 0 \le s \le T \end{aligned}$$
(18.2)

Writing

$$\begin{aligned} e^{s(A+B)}-\left[ e^{sA/n}e^{sB/n}\right] ^n&= [e^{s(A+B)/n}]^n-\left[ e^{sA/n}e^{sB/n}\right] ^n \\&= \sum _{j=1}^{n} [e^{s(A+B)/n}]^{j-1} g(\tfrac{s}{n}) \left[ e^{sA/n}e^{sB/n}\right] ^{n-j} \end{aligned}$$

we see that the difference has norm bounded by \(n \exp (s(||A||+||B||)) ||g(\tfrac{s}{n})|| \rightarrow 0\) as \(n \rightarrow \infty \) by (18.2). Thus, for finite matrices, we have that

$$\begin{aligned} e^{s(A+B)} = \lim _{n \rightarrow \infty } \left[ e^{sA/n}e^{sB/n}\right] ^n \end{aligned}$$
(18.3)

This is called the Lie product formula. Although it seems he never wrote it down explicitly, Lie did consider differential equation results on groups close to (18.3). In 1959, Trotter [656] proved a version of the Lie product formula for certain semigroups on Banach spaces:

Theorem 18.1

(Trotter product formula) Let X be a Banach space and \(S(t) = e^{-tA}, \quad t >0\) and \(T(t) = e^{-tB}, \quad t>0\) two strongly continuous semigroups on X that obey

$$\begin{aligned} \text {s}-\lim _{t \downarrow 0} S(t) = \text {s}-\lim _{t \downarrow 0} T(t) = {\varvec{1}}; \qquad ||S(t)||+||T(t)|| \le Ce^{Dt} \end{aligned}$$
(18.4)

Suppose that the operator closure of A+B on \(D(A) \cap D(B)\) generates a strongly continuous semigroup, \(W(t) {\text {``=''}} e^{-t(A+B)}\) obeying (18.4). Then

$$\begin{aligned} \text {s}-\lim _{n \rightarrow \infty } \left[ S(\tfrac{t}{n})T(\tfrac{t}{n})\right] ^n = W(t) \end{aligned}$$
(18.5)

Remarks

  1. 1.

    If S(t) is a semigroup obeying (18.4), then one defines

    $$\begin{aligned} D(A) = \{\varphi \,|\, \lim _{t \downarrow 0} \left( \frac{{\varvec{1}}-S(t)}{t}\right) \varphi \text { exists}\} \end{aligned}$$

    and sets \(A\varphi \) to be the limit. One then writes \(S(t) = e^{-tA}\).

  2. 2.

    If X is a Hilbert space, S(t) is self-adjoint and a contraction, then \(S(t) = e^{-tA}\) for a positive (possibly unbounded) self-adjoint operator, A. This sets up a \(1-1\) correspondence between such semigroups and positive self-adjoint operators.

  3. 3.

    It is a famous theorem of Stone [616, Sect. 7.3] that when X is a Hilbert space, then S(t) is unitary for all t and strongly continuous at 0 (with \(S(0)={\varvec{1}}\)) if and only if \(S(t) = e^{-itA}\) for a self-adjoint operator A.

For a very simple proof when X is a Hilbert space, A and B are self-adjoint and \(A+B\) is self-adjoint (rather than only esa) on \(D(A) \cap D(B)\), see [494, Theorem VIII.30]. The proof is due to Nelson [460] and looks like the finite matrix proof plus one use of the uniform boundedness principle.
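The finite matrix estimate leading to (18.3) can be observed numerically. The following Python sketch, using only the standard library, compares \(e^{s(A+B)}\) with \(\left[ e^{sA/n}e^{sB/n}\right] ^n\) for a pair of non-commuting \(2 \times 2\) matrices. The matrices, the truncated Taylor series used for the exponential and the max-entry norm are illustrative choices.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    return [[c * x for x in row] for row in X]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(X, terms=40):
    """Matrix exponential by a truncated Taylor series (fine for small norms)."""
    result = identity(len(X))
    term = identity(len(X))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, X))
        result = add(result, term)
    return result

def matpow(X, n):
    """X**n by repeated squaring."""
    result = identity(len(X))
    base = X
    while n > 0:
        if n % 2 == 1:
            result = matmul(result, base)
        base = matmul(base, base)
        n //= 2
    return result

# Two non-commuting matrices (an arbitrary example).
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]

def lie_product_error(n, s=1.0):
    """Max-entry norm of e^{s(A+B)} - (e^{sA/n} e^{sB/n})^n."""
    exact = expm(scale(s, add(A, B)))
    step = matmul(expm(scale(s / n, A)), expm(scale(s / n, B)))
    approx = matpow(step, n)
    return max(abs(x - y) for rx, ry in zip(exact, approx)
               for x, y in zip(rx, ry))

errors = {n: lie_product_error(n) for n in (10, 100, 1000)}
for n, err in errors.items():
    print(n, err, n * err)
```

The bound \(n\,||g(\tfrac{s}{n})|| \le Cs^2/n\) from (18.2) suggests that n times the error should remain bounded, which is what the last printed column is meant to show.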

The limitation that \(A+B\) have a closure that is a semigroup generator is quite strong. For example, there are cases where \(D(A) \cap D(B) = \{0\}\) but formally \(A+B\) makes sense. Remarkably, Kato proved a result that, at least for self-adjoint contraction semigroups, always holds. Let A and B be self-adjoint operators and \(q_A\), \(q_B\) their closed quadratic forms as discussed in Example 10.3. Their form sum \(q_C=q_A+q_B\) is always a closed form but \(V_{q_C}\) may not be dense. We’ll write \(C=A \dot{+} B\). We need to define \(e^{-tC}\) for C’s which are associated to closed quadratic forms where \(V_q\) might not be dense. We follow the philosophy discussed in Sect. 10 of Part 1 in the discussion of monotone convergence. If q is a closed quadratic form and C is the self-adjoint operator on \(\overline{V_q}\) with \(V_q=D(C^{1/2})\) and \(q(\varphi ) = \langle C^{1/2}\varphi ,C^{1/2}\varphi \rangle \) for \(\varphi \in V_q\), then we define \(e^{-t\dot{C}}\) to be the operator

$$\begin{aligned} e^{-t\dot{C}} = e^{-tC}P \end{aligned}$$
(18.6)

where P is the orthogonal projection onto \(\overline{V_q}\). Here is Kato’s result

Theorem 18.2

(Kato’s ultimate Trotter product formula [347]) Let \(q_1, q_2\) be two closed quadratic forms on a Hilbert space, \({\mathcal H}\), with associated semigroups \(e^{-t\dot{A}},\,e^{-t\dot{B}}\). Let \(e^{-t\dot{C}}\) be the semigroup associated to the closed form sum \(q_1+q_2\). Then

$$\begin{aligned} \text {s}-\lim _{n \rightarrow \infty }\left[ e^{-t\dot{A}/n}e^{-t\dot{B}/n}\right] ^n = e^{-t\dot{C}} \end{aligned}$$
(18.7)

Remarks

  1. 1.

    The proof is somewhat technical; we refer the reader to the original paper [347] or to Reed–Simon [494, Theorem S.21]. The proof relies on a general result of Chernoff [88] (see also [494, Theorem S.19]).

  2. 2.

    Earlier results on Trotter product formula for form sums include Chernoff [88, 89, 91], Faris [146] and Kato himself [344].

  3. 3.

    It would be nice to have some kind of result for \(e^{-it\dot{C}}\) but it is unlikely there is one when the approximation is applied to a vector not in \(\overline{V_q}\). That said, (18.7) holds for all \(t \in {\mathbb {C}}\) with \(|\arg (t)| < \pi /2\) and, as explained by Kato in a Note to his paper [347], by an argument that he got from me, one can extend the result from positive self-adjoint A, B to generators of holomorphic contraction semigroups.

  4. 4.

    It could be argued with some justice that this paper belongs not so much to Kato’s work on NRQM as to his work on linear semigroups. But, as found by Nelson [460] (see also Simon [590, 591]), the Trotter product formula is central to the proof of the Feynman–Kac formula and also to interpreting Feynman integrals for \(e^{-itH}\). Moreover, we saw its appearance in Sect. 9 in Part 1; see Theorem 9.3.

  5. 5.

    Kato–Masuda [364] found an extension to nonlinear semigroups. Their paper also has a new result in the linear case, namely instead of \(A\dot{+}B\), one can consider k positive, self-adjoint operators, \(A_1,\ldots ,A_k\) and their form sum \(A_1\dot{+}\cdots \dot{+}A_k\).
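In finite dimensions, where every form is bounded and the form sum is just the matrix sum, (18.7) reduces to the classical Lie product formula, and one can watch the convergence numerically. The following is a minimal sketch, not taken from [347]: the matrices A and B, the helper names and the Taylor-series exponential are all illustrative assumptions.

```python
# Illustrative check of the Trotter product formula (18.7) for 2x2 matrices.
# A, B and all helper names are assumptions made for this sketch.

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scaled(M, c):
    return [[c * x for x in row] for row in M]

def added(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(M, terms=60):
    """Matrix exponential via a truncated Taylor series (fine for small norms)."""
    result = identity(len(M))
    term = identity(len(M))
    for k in range(1, terms):
        term = scaled(matmul(term, M), 1.0 / k)
        result = added(result, term)
    return result

def matpow(M, n):
    """n-th power by repeated squaring."""
    result = identity(len(M))
    base = M
    while n > 0:
        if n % 2 == 1:
            result = matmul(result, base)
        base = matmul(base, base)
        n //= 2
    return result

def trotter_error(A, B, t, n):
    """Largest |entry| of [exp(-tA/n) exp(-tB/n)]^n - exp(-t(A+B))."""
    step = matmul(expm(scaled(A, -t / n)), expm(scaled(B, -t / n)))
    approx = matpow(step, n)
    exact = expm(scaled(added(A, B), -t))
    return max(abs(a - e) for ra, rx in zip(approx, exact) for a, e in zip(ra, rx))

# Two positive semi-definite matrices that do not commute.
A = [[2.0, 0.0], [0.0, 0.0]]
B = [[1.0, 1.0], [1.0, 1.0]]

errors = {n: trotter_error(A, B, 1.0, n) for n in (10, 100, 1000)}
for n, err in errors.items():
    print(n, err)
```

The printed errors should shrink roughly in proportion to 1/n, which is the rate one expects from the commutator term in the Baker–Campbell–Hausdorff expansion. This toy computation says nothing about the hard part of Theorem 18.2, namely unbounded operators and form domains that are not dense.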

Example 18.3

Let P and Q be two orthogonal projections on a Hilbert space. Define

$$\begin{aligned} q_1(\varphi ) = \left\{ \begin{array}{ll} 0, &{} \hbox { if } \varphi \in {\text {ran}}\, P \\ \infty , &{} \hbox { if } \varphi \notin {\text {ran}}\, P \end{array} \right. \end{aligned}$$
(18.8)

and similarly for \(q_2\) and Q. Then \(e^{-t\dot{A}}=P,\,e^{-t\dot{B}}=Q\) for all t. It is easy to see that the form sum \(q_1+q_2\) has the same structure as (18.8) but with \({\text {ran}}\, P\) replaced by \({\text {ran}}\, P\cap {\text {ran}}\, Q\). If R is the projection onto this intersection, then Kato’s result says that

$$\begin{aligned} \text {s}-\lim _{n \rightarrow \infty } (PQ)^n = R \end{aligned}$$
(18.9)

It is interesting that this geometrically well known fact is a special case of Kato’s result (18.7).
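The convergence in (18.9) is easy to watch in \({\mathbb {R}}^3\). Here is a minimal sketch, with an illustrative choice of two planes that is not from the text: P projects onto the xy-plane and Q onto the plane spanned by \(e_1\) and \((0,\cos \theta ,\sin \theta )\), so that \({\text {ran}}\, P\cap {\text {ran}}\, Q\) is the x-axis. For this choice the entries of \((PQ)^n - R\) decay geometrically, at rate \(\cos ^2\theta \) per step.

```python
import math

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def projection_onto_span(vectors):
    """Orthogonal projection onto the span of orthonormal vectors in R^3."""
    return [[sum(v[i] * v[j] for v in vectors) for j in range(3)] for i in range(3)]

theta = math.pi / 4  # angle between the two planes (illustrative choice)
e1 = (1.0, 0.0, 0.0)
e2 = (0.0, 1.0, 0.0)
u = (0.0, math.cos(theta), math.sin(theta))

P = projection_onto_span([e1, e2])  # ran P = xy-plane
Q = projection_onto_span([e1, u])   # ran Q = plane spanned by e1 and u
R = projection_onto_span([e1])      # intersection of the two ranges = x-axis

def distance_to_R(n):
    """Largest |entry| of (PQ)^n - R."""
    PQ = matmul(P, Q)
    power = PQ
    for _ in range(n - 1):
        power = matmul(power, PQ)
    return max(abs(power[i][j] - R[i][j]) for i in range(3) for j in range(3))

for n in (1, 5, 20, 50):
    print(n, distance_to_R(n))
```

In finite dimensions the convergence is in norm; in infinite dimensions one only gets the strong limit stated in (18.9).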

9 Regularity of eigenfunctions and the Kato cusp condition

If one wants to understand the wider impact of Kato’s work, a good place to get insight is to look at citations at Google Scholar (https://scholar.google.co.il/scholar?hl=en&q=tosio+kato). Of course, the publication with the most references by far is Kato’s book [345] with over 20,000 citations. In second place (with over 1700 citations) is the 1957 paper [325] discussed in this section. This may be surprising to some, but it reflects its importance to quantum chemists and atomic physicists.

In this paper, Kato begins by saying that he regards this paper as a continuation of [314]. In that earlier paper, he stated “If V is the Coulomb potential as in the case of real atoms, it follows that the eigenfunctions satisfy the wave equation everywhere except at singular points of the potential (they are even analytic since the Coulomb potential is an analytic function). Regarding their behavior at these singular points, we can derive no conclusion from the above theorem. A detailed study shows, however, that they are bounded even at such points”. He is interested in the properties of \(L^2\)-eigenfunctions and what he calls generalized eigenfunctions or wave packets by which he means \(\psi \in {\mathcal H}\) with \(\psi \in {\text {ran}}E_\Omega (H)\) where H is a quantum Hamiltonian, \(\Omega = [a,b]\), a bounded interval, and \(E_\Omega (H)\) is a spectral projection [616, Sect. 5.1]. In fact, we’ll soon see that \(\psi \in {\text {ran}}(e^{-sH})\) for some \(s>0\) suffices for some of the results that Kato proved. Kato focused on local regularity of \(\psi \) with some global estimates (like on \(||\varvec{\nabla }\psi ||_\infty \)). In particular, he delivered on the boundedness result he claimed in 1951.

There is a huge literature on other aspects of eigenfunctions which we’ll not discuss except for a few words now. First, there is the issue of exponential decay which we mentioned briefly at the end of Sect. 12; below all we’ll discuss, in the context of proving pointwise bounds, is how to go from \(L^2\) exponential decay to pointwise exponential decay. Secondly, there is literature on the structure of nodes (i.e. the zero set); see, for example, Zelditch [710]. Finally there are the issues of continuum eigenfunction expansions and the related theorem that \(\sigma (H)\) is the closure of the set of E for which \(H\psi =E\psi \) has a polynomially bounded solution; see [600, Corollary C.5.5].

Kato considers two classes of Hamiltonians. The first, which we’ll call general H, acts on \(L^2({\mathbb {R}}^{\nu N})\) with \(\varvec{x} = (x_1,\ldots ,x_N);\, x_j \in {\mathbb {R}}^\nu \) (Kato only considers the case \(\nu =3\), but we’ll discuss the more general case below). H then has the form

$$\begin{aligned} H=-\Delta +\sum _{j=1}^{N} V_j(x_j)+\sum _{1 \le j < k \le N} V_{jk}(x_j-x_k) \end{aligned}$$
(19.1)

with each \(V_j,V_{jk} \in L^p({\mathbb {R}}^\nu )+L^\infty ({\mathbb {R}}^\nu )\), where p is \(\nu \)-canonical (see just prior to Theorem 7.9) so that H is esa-\(\nu \) (see Sect. 7 of Part 1). \(-\Delta \) assumes equal masses of the light particle and an infinite mass heavy particle but one easily accommodates general masses using the formalism in Sect. 11.

Kato also considered what we will call atomic Hamiltonians

$$\begin{aligned} H = -\Delta -\sum _{j=1}^{N}\frac{Z}{|x_j|} + \sum _{1 \le j < k \le N} \frac{1}{|x_j-x_k|} \end{aligned}$$
(19.2)

on \(L^2({\mathbb {R}}^{3N})\). Kato allows Hughes–Eckart terms, allows Z to be j dependent and allows \(\tfrac{z_{jk}}{|x_j-x_k|}\) rather than \(\tfrac{1}{|x_j-x_k|}\). All these are easy to accommodate as is the molecular case where \(\tfrac{Z}{|x_j|}\) is replaced by

$$\begin{aligned} \sum _{\ell =1}^{L} \frac{Z_\ell }{|x_j-R_\ell |} \end{aligned}$$
(19.3)

Most of the time, for simplicity of exposition, we’ll discuss the atomic case.

In the atomic case, we’ll be especially interested in the set of singularities where some \(|x_j|\) or \(|x_j-x_k|\) vanish, i.e.

$$\begin{aligned} \Sigma = \{\varvec{x}=(x_1,\ldots ,x_N) \,|\, \prod _{j=1}^{N} |x_j| \,\prod _{1 \le j < k \le N} |x_j-x_k| =0\} \end{aligned}$$
(19.4)

In [325], Kato proved three main theorems. For the first two, we need a definition. Let \(0 < \alpha \le 1\) and \(j =0\) or 1. Then

$$\begin{aligned} C^{j,\alpha }= & {} \{\psi \,| \, \psi \text { is } C^j \text { and obeys (19.5)} \}\nonumber \\&\exists C\, \forall _{x,y \, |\, |x-y| \le 1} \, |D^{(j)}\psi (x) - D^{(j)}\psi (y)| \le C|x-y|^\alpha \end{aligned}$$
(19.5)

(\(\alpha =1\) is called Lipschitz; otherwise, we are saying the derivative is Hölder continuous). If the constant C in (19.5) is allowed to depend on a compact K requiring \(x,y\in K \subset {\mathbb {R}}^\nu \), we say that \(\psi \in C^{j,\alpha }_{loc}\).

Theorem 19.1

(Kato [325]) Let \(\nu =3\) and let \(V_j, V_{jk} \in L^\sigma ({\mathbb {R}}^3)+L^\infty ({\mathbb {R}}^3)\) for some \(\sigma \ge 2\). Let \(\psi \) be an eigenfunction or wave packet. Then:

  1. (a)

    For all \(\alpha \) with \(\alpha \le 1\) and \(\alpha < 2-\tfrac{3}{\sigma }\), we have that

    $$\begin{aligned} \psi \in C^{0,\alpha } \end{aligned}$$
    (19.6)
  2. (b)

    If \(\sigma > 3\), we have that for all \(\alpha < 1-\tfrac{3}{\sigma }\) that

    $$\begin{aligned} \psi \in C^{1,\alpha } \end{aligned}$$
    (19.7)

The Coulomb case allows any \(\sigma \) with \(\sigma < 3\) but not \(\sigma =3\) so it is borderline for \(\psi \) being Lipschitz. Nevertheless, Kato proved that

Theorem 19.2

(Kato [325]) Let \(\nu =3\) and let H be an atomic Hamiltonian. Let \(\psi \) be an eigenfunction or wave packet. Then \(\psi \in C^{0,1}\) (i.e. is Lipschitz). Indeed \(\psi \) is \(C^1\) on \({\mathbb {R}}^{3N}{\setminus }\Sigma \) with \(\varvec{\nabla }\psi \in L^\infty \).

Remarks

  1. 1.

    It is easy to see, using the fact that \(\Sigma \) is a closed set of measure zero, that the \(C^1\) result with bounded derivative implies the \(C^{0,1}\) result.

  2. 2.

    As Kato remarks, in the atomic case, there were no previous positive results on regularity of eigenfunctions if \(N \ge 2\) although it was known that certain series expansions did not work.

  3. 3.

    Since the potentials are real analytic on \({\mathbb {R}}^{3N}{\setminus }\Sigma \), it is known by elliptic regularity [171, 190, 483] that genuine eigenfunctions are real analytic on \({\mathbb {R}}^{3N}{\setminus }\Sigma \). So the point of the theorem is control on \(\Sigma \) and the uniformity of the bounds.

Kato’s third result concerns the exact behavior at the two particle coincidences. To understand why he states the theorem as he does, consider Hydrogen-like Hamiltonians where the eigenfunctions are exactly known.

Example 19.3

Let \(h = -\Delta -\tfrac{2}{|x|}\) on \(L^2({\mathbb {R}}^3)\). It is known [207] that the unnormalized ground (1s) state is given by

$$\begin{aligned} \varphi _0(\varvec{r}) = e^{-r}; \qquad r = |\varvec{r}| \end{aligned}$$
(19.8)

obeying \(h\varphi _0 = -\varphi _0\). Notice that \(\varphi _0\) is not \(C^1\) at \(\varvec{r}=0\) but has a cusp there, i.e.

$$\begin{aligned} \varvec{\nabla }\varphi _0(\varvec{r}) = -\frac{\varvec{r}}{r} e^{-r} \end{aligned}$$
(19.9)

so that the limit of the derivative at \(\varvec{r}=0\) is directionally dependent.

The 2p state (with \(m=0\)) is given by

$$\begin{aligned} \varphi _1(\varvec{r}) = z e^{-r/2}; \qquad \varvec{r} = (x,y,z) \in {\mathbb {R}}^3 \end{aligned}$$
(19.10)

obeying \(h\varphi _1 = -\tfrac{1}{4} \varphi _1\). Thus

$$\begin{aligned} \varvec{\nabla }\varphi _1(\varvec{r}) = -\frac{1}{2} z \frac{\varvec{r}}{r} e^{-r/2} + (0,0,1) e^{-r/2} \end{aligned}$$
(19.11)

This derivative is continuous at \(\varvec{r}=0\) and non-zero at \(\varvec{r}=0\). Kato had the realization that by taking a spherical average of \(\psi \), one captures (at least in the one electron case) exactly the s states which have cusps. That explains why he took the average in the next Theorem.

Theorem 19.4

(Kato cusp condition [325]) Let H be an atomic Hamiltonian and let \(\psi \) be an \(L^2\) eigenfunction for H. Let \(\varvec{x}=(x_1,\ldots ,x_N)\). Define on \((0,\infty ) \times {\mathbb {R}}^{3(N-1)}\)

$$\begin{aligned} \widetilde{\psi }(r,x_2,\ldots ,x_N) = \frac{1}{4\pi } \int _{S^2} \psi (r\omega ,x_2,\ldots ,x_N) \, d\omega \end{aligned}$$
(19.12)

where \(d\omega \) is the surface measure on the two dimensional sphere, so \(\widetilde{\psi }\) is a spherical average. Then except for \((x_2,\ldots ,x_N)\) in a set of lower dimension (i.e. less than \(3N-3\)), one has that

$$\begin{aligned} \left. \frac{\partial \widetilde{\psi }}{\partial r}\right| _{r=0} = -\frac{Z}{2} \psi (0,x_2,\ldots ,x_N) \end{aligned}$$
(19.13)

Remarks

  1. 1.

    (19.13) is the celebrated Kato cusp condition.

  2. 2.

    There is a similar result at \(x_j-x_k=0\); \(-\tfrac{Z}{2}\) is replaced by \(+\tfrac{1}{2}\).

  3. 3.

    In (19.13), the left side means to compute the derivative for \(r > 0\) (using that \(\psi \) is \(C^1\) there \(\Rightarrow \widetilde{\psi }\) is \(C^1\)) and then take \(r \downarrow 0\). (19.13) says that \(\widetilde{\psi }(r,x_2,\ldots ,x_N) = \left( 1-\tfrac{Z}{2}r\right) \psi (0,x_2,\ldots ,x_N)+\text {o}(r)\) so that, if \(\psi (0,x_2,\ldots ,x_N) \ne 0\), \(\widetilde{\psi }\) has a cusp as it does for Hydrogen.

  4. 4.

    Most modern variational calculations for atoms and molecules use basis elements that satisfy the cusp condition, so this theorem is very influential.
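As a sanity check, (19.13) can be tested numerically on the two hydrogen-like states of Example 19.3, where \(N=1\) and \(Z=2\). The sketch below is only illustrative: the quadrature grid, the step sizes and the function names are assumptions, and the derivative as \(r \downarrow 0\) is approximated by a central difference at a small positive radius.

```python
import math

Z = 2.0  # nuclear charge in h = -Laplacian - 2/|x| (Example 19.3)

def psi_1s(x, y, z):
    return math.exp(-math.sqrt(x * x + y * y + z * z))

def psi_2p(x, y, z):
    return z * math.exp(-0.5 * math.sqrt(x * x + y * y + z * z))

def spherical_average(psi, r, n_u=40, n_phi=40):
    """Midpoint-rule approximation of (1/(4 pi)) * integral of psi(r * omega) over S^2.

    With u = cos(theta) the surface measure is du dphi, so the normalized
    average is the mean of psi over a uniform grid in (u, phi).
    """
    total = 0.0
    for i in range(n_u):
        u = -1.0 + (2.0 * i + 1.0) / n_u
        s = math.sqrt(1.0 - u * u)
        for j in range(n_phi):
            phi = 2.0 * math.pi * (j + 0.5) / n_phi
            total += psi(r * s * math.cos(phi), r * s * math.sin(phi), r * u)
    return total / (n_u * n_phi)

def radial_slope_near_zero(psi, r=1e-4, h=1e-5):
    """Central-difference estimate of d(psi_tilde)/dr at a small radius r > 0."""
    return (spherical_average(psi, r + h) - spherical_average(psi, r - h)) / (2.0 * h)

def cusp_defect(psi):
    """Left side minus right side of (19.13), up to discretization error."""
    return radial_slope_near_zero(psi) + 0.5 * Z * psi(0.0, 0.0, 0.0)

print(cusp_defect(psi_1s))
print(cusp_defect(psi_2p))
```

Both defects should be small: for the 1s state the residual is of the order of the radius at which the slope is sampled, while the 2p state averages to zero over every sphere, consistent with its vanishing at the origin.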

Kato’s proofs depend on rewriting the time-independent Schrödinger equation as an integral equation and analyzing that equation. This completes what we want to say about Kato’s paper itself. We turn to later work, first concerning general Hamiltonians and Theorem 19.1. The most powerful results use path integral methods (pioneered by Herbst–Sloan [244], Carmona [84] and Aizenman–Simon [7]; two comprehensive references are [590, 591, 600]) and are expressed in terms of a class of spaces \(K_\nu ^{(\alpha )}; \nu = 1,2,\ldots ;\, \alpha \in [0,2)\) defined by (we suppose \(\nu \ge 2\) and when \(\alpha = 0\) that \(\nu \ge 3\); we refer the reader to [600] for the other cases):

Definition

\(K_\nu ^{(\alpha )}\) is defined by

  1. (a)

    for \(\alpha \in (0,1)\cup (1,2)\) and \(\nu \ge 2\) as those V with

    $$\begin{aligned} \sup _{x} \int _{|x-y| \le 1} |x-y|^{-(\nu -2+\alpha )} |V(y)| \, dy < \infty \end{aligned}$$
    (19.14)
  2. (b)

    if \(\alpha = 0\) or \(\alpha =1\) and \(\nu \ge 3\) by

    $$\begin{aligned} \lim _{r \downarrow 0} \sup _{x} \int _{|x-y| \le r} |x-y|^{-(\nu -2+\alpha )} |V(y)| \, dy = 0 \end{aligned}$$
    (19.15)

Remarks

  1. 1.

    If \(\alpha = 0\), \(K_\nu ^{(0)} = K_\nu \) as defined in (9.32).

  2. 2.

    If \(\alpha _1 > \alpha \), then \(K_\nu ^{(\alpha _1)} \subset K_\nu ^{(\alpha )}\).

  3. 3.

    If \(p > \nu /(2-\alpha )\), then \(L^p_{unif} \subset K^{(\alpha )}_\nu \) by Hölder’s inequality. In particular \(v \in L^\sigma ({\mathbb {R}}^3)+L^\infty ({\mathbb {R}}^3) \Rightarrow v \in K_3^{(\alpha )}\) so long as \(\alpha < 2-3/\sigma \).

  4. 4.

    As with \(K_\nu \), \(v(x) \in K_\nu ^{(\alpha )}\) for \(x \in {\mathbb {R}}^\nu \) implies that \(V(x,y) \equiv v(x),\, x\in {\mathbb {R}}^\nu , y\in {\mathbb {R}}^{\mu -\nu } \Rightarrow V \in K_\mu ^{(\alpha )}\). Thus in the context of Theorem 19.1 \(V_j(x_j)\) and \(V_{jk}(x_j-x_k)\) on \({\mathbb {R}}^{3N}\) will lie in \(K_{3N}^{(\alpha )}\) if the V’s, \(\alpha \) and \(\sigma \) are as in Remark 3. This means that Theorem 19.1 follows from Theorem 19.6 below.

  5. 5.

    As with \(K_\nu \), these spaces are special cases of a class of spaces of Schechter [535]. In this context, they were introduced by Simon [600].

  6. 6.

    \(K_{\nu ,loc}^{(\alpha )}\) is those V whose restriction to each ball in \({\mathbb {R}}^\nu \) lies in \(K_\nu ^{(\alpha )}\).

One of Kato’s realizations is that eigenfunctions are bounded and continuous. In this regard, the following is useful.

Theorem 19.5

(Subsolution estimate) Let V be a function on \({\mathbb {R}}^\nu \) with \(V \in K_\nu \). Let \(\psi \in L^2_{loc}\) solve \((-\Delta +V)\psi = 0\) in the distributional sense. Then \(\psi \) is a continuous function and for any \(r > 0\), there is C depending only on the \(K_\nu \)-norm of \(V_- \equiv \max (-V(x),0)\) (and, in particular, not on \(\psi \)) so that

$$\begin{aligned} |\psi (x)| \le C \int _{|x-y| \le r} |\psi (y)| \, dy \end{aligned}$$
(19.16)

Remarks

  1. 1.

    Such estimates go back to Stampacchia [623] and Trudinger [657] who had stronger hypotheses on V. For \(V \in K_\nu \), Agmon [5, Chapter 5] has an analytic proof and Aizenman–Simon [7] a path integral proof; see also [600].

  2. 2.

    It is enough to have \(V_- \in K_\nu \) and \(V_+ \equiv V+V_- \in K_{\nu , loc}\).

  3. 3.

    The name comes from the fact that it is a result proven for positive functions, u, with \((-\Delta +V)u \le 0\) (so subsolutions rather than solutions as in subharmonic rather than harmonic). Kato’s inequality shows that if \((-\Delta +V)\psi = 0\), then \(u = |\psi |\) is a subsolution. In this form, the inequality is intimately connected to Harnack’s inequality [7, 600].

Subsolution estimates are important because they say that \(\psi \in L^2 \Rightarrow \psi \in L^\infty \) (with, in fact, the function going pointwise to zero at \(\infty \)) and so they give the bounded continuous part of Kato’s Theorem 19.1 (for eigenfunctions; for wave packets, see below). They also show that \(e^{ar}\psi \in L^2 \Rightarrow e^{ar}\psi \in L^\infty \) and so the \(L^2\) exponential decay estimates discussed in Theorem 12.7 imply pointwise exponential decay.

The following has Theorem 19.1 as a special case:

Theorem 19.6

Let \(0< \alpha < 2\). Let \(V=V_+-V_-\) with \(V_-\in K_{\nu }^{(\alpha )}, \,V_+\in K_{\nu , loc}^{(\alpha )}\) and let \(H=-\Delta +V\). Let \(f \in L^2({\mathbb {R}}^\nu )\). Then, for each \(t > 0\), \(e^{-tH}f\) lies in

  1. (a)

    \(C^{0,\alpha }\) if \(\alpha \in (0,1)\)

  2. (b)

    \(C^1 \cap C^{0,1}\) if \(\alpha =1\)

  3. (c)

    \(C^{1,\alpha -1}\) if \(\alpha \in (1,2)\)

and the norms only depend on t, the \(L^2\)-norm of f and the \(K_\nu \) norm of \(V_-\).

Remarks

  1. 1.

    The proof using functional integration can be found in Simon [600, Theorem B.3.5].

  2. 2.

    For eigenfunctions, there are subsolution type estimates for the constants in Hölder estimates; see [600, Theorem C.2.5].

  3. 3.

    To control \(\varvec{\nabla }\psi \), one needs \(\alpha =1\). The Coulomb potentials in atomic and molecular Hamiltonians are in \(K_{3N}^{(\alpha )}\) for \(\alpha \in [0,1)\) but not for \(\alpha =1\). Nevertheless, Hoffmann-Ostenhof et al. [250] have proven that, for such potentials and \(L^2\) eigenfunctions, one has

    $$\begin{aligned} \sup _{|y-x| \le R} |\varvec{\nabla }\psi (y)| \le C \sup _{|y-x| \le 2R} |\psi (y)| \end{aligned}$$
    (19.17)

    for any x, where C is a universal constant depending only on R and H. This includes and improves Kato’s Theorem 19.2; one improvement is that exponential decay of \(\psi \) implies exponential decay of its first derivatives.
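The borderline nature of the Coulomb potential can be seen from a one-line computation. Taking \(\nu =3\), \(V(y)=|y|^{-1}\) and \(x=0\) in (19.14), polar coordinates give \(4\pi \int _0^1 r^{-\alpha }\,dr\), which is finite exactly when \(\alpha <1\); for \(\alpha =1\) the corresponding integral in (19.15) diverges logarithmically for every \(r>0\). The sketch below, whose function name and cutoff parameter are illustrative assumptions, evaluates the integral over \(\varepsilon \le |y| \le 1\) by the midpoint rule in the variable \(s=\log (1/|y|)\).

```python
import math

def truncated_coulomb_integral(alpha, eps, steps=20000):
    """Midpoint-rule value of the integral of |y|^{-(1+alpha)} |y|^{-1}
    over eps <= |y| <= 1 in R^3.

    In polar coordinates this is 4*pi * int_eps^1 r^{-alpha} dr; substituting
    r = exp(-s) turns it into 4*pi * int_0^{log(1/eps)} exp(-(1 - alpha) s) ds.
    """
    upper = math.log(1.0 / eps)
    ds = upper / steps
    total = sum(math.exp(-(1.0 - alpha) * (k + 0.5) * ds) for k in range(steps))
    return 4.0 * math.pi * total * ds

for alpha in (0.0, 0.5, 0.9, 1.0):
    values = [truncated_coulomb_integral(alpha, eps) for eps in (1e-2, 1e-4, 1e-8)]
    print(alpha, [round(v, 4) for v in values])
```

For \(\alpha <1\) the values should stabilize near \(4\pi /(1-\alpha )\) as the cutoff shrinks, while for \(\alpha =1\) they grow like \(4\pi \log (1/\varepsilon )\).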

There has been considerable literature dealing with the questions discussed in Kato’s Theorems 19.2 and 19.4; a substantial fraction of this literature is by Maria and Thomas Hoffmann-Ostenhof and their collaborators. We want to discuss some of the highlights.

The first result sheds additional light on the behavior near pair singularities. We define

$$\begin{aligned} \Sigma _j = \{x \,|\, |x_j|=0\}; \qquad \Sigma _{jk} = \{x\,|\, |x_j-x_k|=0\},\, j< k \end{aligned}$$
(19.18)

so \(\Sigma =\left( \bigcup _{j=1}^N \Sigma _j\right) \cup \left( \bigcup _{j < k} \Sigma _{jk}\right) \).

Theorem 19.7

(Fournais et al. [157]) Let \(x^{(0)} \in \Sigma _1, \, x^{(0)} \notin \left( \bigcup _{j=2}^N \Sigma _j\right) \cup \left( \bigcup _{j < k} \Sigma _{jk}\right) \). Let \(\psi \) be an \(L^2\) eigenfunction of H. Then there are two functions, \(\varphi _1\) and \(\varphi _2\), defined and analytic in a neighborhood, Q, of \(x^{(0)} \in {\mathbb {R}}^{3N}\), so that in Q

$$\begin{aligned} \psi (x) = \varphi _1(x)+ |x_1| \varphi _2(x) \end{aligned}$$
(19.19)

Remarks

  1. 1.

    Near \(x^{(0)}\),

    $$\begin{aligned} \psi (x) = \varphi _1(x^{(0)})+|x_1| \varphi _2(x^{(0)}) + \varvec{\nabla }\varphi _1(x^{(0)})\cdot (x-x^{(0)}) + \text {O}((x-x^{(0)})^2) \end{aligned}$$

    clearly showing the cusp.

  2. 2.

    Similar results hold for each \(\Sigma _j\) and each \(\Sigma _{jk}\).

  3. 3.

    For a proof, see [157]. They were motivated by earlier work of Hill [248].

  4. 4.

    This shows a cusp, but supplements rather than proves the Kato cusp equality (19.13). Indeed, that equality implies that \(\varphi _2(x^{(0)}) = -\tfrac{Z}{2}\varphi _1(x^{(0)})\).
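For the ground state of Example 19.3 the decomposition (19.19) can be written down by hand: \(e^{-r} = \cosh r - r\,\tfrac{\sinh r}{r}\), and both \(\cosh r\) and \(\tfrac{\sinh r}{r}\) are power series in \(r^2=|x|^2\), hence real analytic in x. The following sketch (function names are illustrative assumptions; here \(N=1\) and \(Z=2\)) checks this numerically, together with the relation \(\varphi _2(0) = -\tfrac{Z}{2}\varphi _1(0)\) from Remark 4.

```python
import math

Z = 2.0  # charge for h = -Laplacian - 2/|x|, whose ground state is exp(-|x|)

def phi1(x, terms=40):
    """cosh(|x|) as a power series in |x|^2, hence analytic in x."""
    r2 = sum(c * c for c in x)
    return sum(r2 ** k / math.factorial(2 * k) for k in range(terms))

def phi2(x, terms=40):
    """-sinh(|x|)/|x| as a power series in |x|^2, hence analytic in x."""
    r2 = sum(c * c for c in x)
    return -sum(r2 ** k / math.factorial(2 * k + 1) for k in range(terms))

def psi(x):
    return math.exp(-math.sqrt(sum(c * c for c in x)))

def decomposition_error(x):
    """|psi - (phi1 + |x| phi2)| at the point x, cf. (19.19)."""
    r = math.sqrt(sum(c * c for c in x))
    return abs(psi(x) - (phi1(x) + r * phi2(x)))

origin = (0.0, 0.0, 0.0)
for point in (origin, (0.3, -0.2, 0.1), (1.0, 2.0, -2.0)):
    print(point, decomposition_error(point))
print(phi2(origin), -0.5 * Z * phi1(origin))
```

The truncated series are adequate only for moderate \(|x|\); the point of the sketch is the structure of (19.19), not numerical efficiency.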

The cusp condition only holds at simple singular points where only a single pair among \(\{0,x_1,\ldots ,x_N\}\) coincides (in the atomic case). In 1954, Fock [152] (the same Fock of Born–Fock 26 years earlier and the Hartree–Fock approximation 24 years earlier and of Fock space 22 years earlier!) gave arguments that there are \(\langle x_j,x_k \rangle \log (|x_j|^2+|x_k|^2)\) terms at points where both \(|x_j|\) and \(|x_k|\) go to zero. These are called Fock terms.

The following includes and improves the Kato cusp condition, Theorem 19.4:

Theorem 19.8

(Fournais et al. [156]) On \({\mathbb {R}}^{3N}\), let

$$\begin{aligned} F_2(x)&= - \frac{Z}{2}\sum _{j=1}^{N}|x_j|+\frac{1}{4}\sum _{1\le j<k\le N}|x_j-x_k| \end{aligned}$$
(19.20)
$$\begin{aligned} F_3(x)&= \frac{2-\pi }{12\pi } \sum _{1\le j<k\le N} \langle x_j,x_k \rangle \log (|x_j|^2+|x_k|^2) \end{aligned}$$
(19.21)

For any \(\psi \), write

$$\begin{aligned} \psi = e^{F_2+F_3}\varphi \end{aligned}$$
(19.22)

Then, if \(\psi \) solves \(H\psi =E\psi \) on a bounded set, \(\Omega \), we have that

$$\begin{aligned} \varphi \in C^{1,1} \end{aligned}$$
(19.23)

Remarks

  1. 1.

    Writing \(\psi \) in the form \(e^F\varphi \) is often called a Jastrow trial function after Jastrow [281] who had the idea of modifying Slater determinants, \(\varphi \), by multiplying by \(e^F\) with F a simple rational function of the \(|x_j|\) and \(|x_j-x_k|\).

  2. 2.

    The weaker result where \(F_3\) isn’t included and \(\varphi \in C^{1,\alpha }\) was proven earlier by Hoffmann-Ostenhof et al. [250]. The above theorem is from Fournais et al. [156] where the reader can find a proof that depends on looking at the PDE that \(\varphi \) obeys and standard elliptic estimates. All depend on noting that

    $$\begin{aligned} \Delta F_2 = V \end{aligned}$$
    (19.24)
  3. 3.

    The reader may be puzzled by \(-\tfrac{Z}{2}\) but \(\tfrac{1}{4}\) rather than \(\tfrac{1}{2}\) (since the effective Z for the jk pair is \(+1\)). But \(\varvec{\nabla }F_2\) has only one \(\varvec{\nabla }_j\) acting non-trivially on \(|x_j|\) but both \(\varvec{\nabla }_j\) and \(\varvec{\nabla }_k\) act non-trivially on \(|x_j-x_k|\) turning the \(\tfrac{1}{4}\) into a \(\tfrac{1}{2}\) which is also why we get (19.24).

  4. 4.

    Hoffmann-Ostenhof et al. [250] noted that their result implies that

    $$\begin{aligned} \varvec{\nabla }\psi - \psi \varvec{\nabla }F_2 \in C^{0,\alpha }, \, \alpha \in (0,1) \end{aligned}$$
    (19.25)

    while Fournais et al. [156] note that their results imply that

    $$\begin{aligned} \varvec{\nabla }\psi - \psi \varvec{\nabla }(F_2+F_3) \in C^{0,1} \end{aligned}$$
    (19.26)

    The first is a strong form of the Kato cusp condition (which follows from the continuity of \(\varvec{\nabla }\psi - \psi \varvec{\nabla }F_2\)) and unlike Kato, they prove results at multiple coincidences. The second result implies that second derivatives of \(\psi \) are bounded at simple coincidences and have a logarithmic blow up at points where \(|x_j|\) and \(|x_k|\) go to zero.

  5. 5.

    The obvious extensions hold for molecular Hamiltonians.

  6. 6.

    An interesting alternate approach to understanding the Kato cusp conditions in terms of singularities at corners is found in Ammann et al. [11].
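The identity (19.24) is a short calculation: on \({\mathbb {R}}^3\), \(\Delta |y| = 2/|y|\), and the Laplacian on \({\mathbb {R}}^{3N}\) hits \(|x_j-x_k|\) through both \(x_j\) and \(x_k\). As a numerical cross-check, the sketch below compares a finite-difference Laplacian of \(F_2\) with the Coulomb potential V of (19.2) at a point away from \(\Sigma \), for \(N=2\). The value of Z, the sample point and the step size are illustrative assumptions.

```python
import math

Z = 2.0  # illustrative nuclear charge
N = 2    # number of electrons, so points live in R^6

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def electron(x, j):
    """Coordinates of the j-th electron (0-based) inside x in R^{3N}."""
    return x[3 * j:3 * j + 3]

def pair_distance(x, j, k):
    return norm([a - b for a, b in zip(electron(x, j), electron(x, k))])

def F2(x):
    """F_2 of (19.20)."""
    one_body = sum(norm(electron(x, j)) for j in range(N))
    two_body = sum(pair_distance(x, j, k)
                   for j in range(N) for k in range(j + 1, N))
    return -0.5 * Z * one_body + 0.25 * two_body

def V(x):
    """Coulomb potential of the atomic Hamiltonian (19.2)."""
    attraction = sum(Z / norm(electron(x, j)) for j in range(N))
    repulsion = sum(1.0 / pair_distance(x, j, k)
                    for j in range(N) for k in range(j + 1, N))
    return -attraction + repulsion

def laplacian(f, x, h=1e-3):
    """Second-order central-difference Laplacian of f at x."""
    total = 0.0
    for i in range(len(x)):
        forward = list(x)
        forward[i] += h
        backward = list(x)
        backward[i] -= h
        total += (f(forward) - 2.0 * f(x) + f(backward)) / (h * h)
    return total

x0 = [1.0, 0.5, -0.3, -0.4, 0.8, 0.6]  # a point off the singular set
print(laplacian(F2, x0), V(x0))
```

The two printed numbers should agree up to the discretization error of the difference scheme, which illustrates how the coefficient \(\tfrac{1}{4}\) in (19.20) produces the coefficient 1 of the electron repulsion.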

That completes what we want to say about regularity of eigenfunctions; we end this section with a few remarks on the closely related subject of regularity of the one electron density defined by

$$\begin{aligned} \rho _\psi (x) = N \int |\psi (x,x_2,\ldots ,x_N)|^2 \, d^3x_2\ldots d^3x_N \end{aligned}$$
(19.27)

(this is the formula if \(\psi \) is symmetric or antisymmetric; otherwise the “N” in front is replaced by summing against putting x in each of the N slots). It measures the electron density.

Theorem 19.9

(Fournais et al. [155]) For any atomic or molecular eigenfunction, the density, \(\rho _\psi \), is real analytic away from the nuclei (\(x=0\) in the atomic case and \(x=R_\ell ,\, \ell =1,\ldots ,L\) in the molecular case).

This was proven in [155]. Earlier the same authors had proven that \(\rho _\psi \) is \(C^\infty \) [154]. Jecko [284] has an alternate proof of Theorem 19.9.

10 Two conjectures

I thought it would be appropriate to end this paper with two open questions in the areas that interested Kato. One dates from 1971 when Kato was still active and the other from 2000, the year after he died.

Conjecture 20.1

(Jörgens’ Conjecture) Let \(\Omega \subset {\mathbb {R}}^\nu \) be open. Let \(V \in L^2_{loc}(\Omega )\), so that \(-\Delta +V\) is bounded from below and esa on \(C_0^\infty (\Omega )\). Suppose that \(V_1 \ge V\) is also in \(L^2_{loc}(\Omega )\). Then \(-\Delta +V_1\) is also esa on \(C_0^\infty (\Omega )\).

This result would be interesting even for \(\Omega ={\mathbb {R}}^\nu \), where, of course, when \(V \equiv 0\), this is just the famous result of Kato in Sect. 9 of Part 1. The case where \(\Omega = {\mathbb {R}}^\nu ; \, \nu \ge 5\) and \(V(x) = -\tfrac{1}{4}\nu (\nu -4)|x|^{-2}\) (results of Kalf–Walter and Simon) is mentioned in Sect. 9.

In the early 1970s, there was a set of almost annual meetings at Oberwolfach on spectral and scattering theory and frequent PDE meetings. They were quite important. For example, Agmon announced his result Theorem 15.2 in 1970 [3] but only published the full paper [4] in 1975. In between, the standard source for his work was the personal notes some people took of a series of lectures that he gave at one of these Oberwolfach meetings. In connection with the 1971 PDE conference (organized by Haack, Heinz and Hellwig), Konrad Jörgens (1926–1974), who died tragically of a brain tumor less than 3 years later, made the above conjecture. At the conference, Kalf discussed his work with Walter [302] mentioned in Sect. 9.

Here is the story that Kalf told me: Before the talks of the conference started Hellwig introduced me to Jörgens and Weidmann (then Jörgens’s assistant) and proudly mentioned the result Walter and I had proved. Jörgens’s immediate reaction was, “This is false, because the Laplacian is not e.s.a. on \({\mathbb {R}}^n {\setminus } \{0\}\)”. Weidmann interfered with the remark that this depended on the dimension. Jörgens thought for a moment, and then he said “The result is trivial because an e.s.a. operator remains e.s.a. when the potential is increased”. Fortunately, I had the presence of mind to say that there are examples where LPC and LCC alternate when a parameter is increased. Jörgens was astonished to hear that. After dinner I showed him the corresponding paper by Sears. After a while he said “This is a case where the operator is not bounded from below; it cannot happen for semibounded operators”.

Note that Simon’s and Kato’s papers discussed in the historical part of Sect. 9 were both preprinted in early 1972 after this conjecture, so the original conjecture was made for a local Stummel space rather than \(L^2_{loc}\) but eventually, it was updated to \(L^2_{loc}\).

In one dimension, this is related to a result of Kurss [401] who proved the result for continuous V although his argument doesn’t need continuity (essentially it follows from a simple comparison argument for positive solutions and limit point-limit circle methods). In 1966, in [625], Stetkaer-Hansen extended Theorem 8.6 to the case where V is locally Stummel. Since, if \(\nu \le 3\), \(L^2_{loc}\) is the same as locally Stummel, this implies Jörgens’ conjecture for \(\Omega = {\mathbb {R}}^\nu \) and these \(\nu \) (indeed without the need of a comparison potential!).

Many people, especially in the various German groups studying Schrödinger operators, worked hard on this problem. In 1980, Cycon [99] proved a result when there was an additional technical condition on \(-\Delta +V\). He remarked that, given the failure to find a proof, some researchers began to suspect that it might be false.

Conjecture 20.2

(Simon’s Conjecture) Let V be a measurable function on \({\mathbb {R}}^\nu ,\,\nu \ge 2\) obeying

$$\begin{aligned} \int |x|^{-\nu +1} |V(x)|^2\, d^\nu x < \infty \end{aligned}$$
(20.1)

Then \(-\Delta +V\) has a.c. spectrum of infinite multiplicity on \([0,\infty )\).

This conjecture was made by Simon [609]. While not explicit, there is a presumption that \(-\Delta +V\) is esa-\(\nu \). If V obeys (16.5), one needs \(\beta > 1/2\). It would be interesting to prove the conjecture for all V’s obeying (16.5) for any fixed \(\beta \in (\tfrac{1}{2},1)\).

Here is some background on the conjecture. Kato–Kuroda–Agmon and others studied V’s obeying (16.5) for any \(\beta >1\) and found (much more than) \(\sigma _{ac}(-\Delta +V)=[0,\infty )\). As noted in Sect. 14, when \(\nu =1\), it is known that for any \(\beta < 1/2\), there are V’s with no a.c. spectrum; in fact, in a sense, this is generic. In the mid 1990s, I realized that determining what happened when \(1> \beta > 1/2\) was a natural problem and alerted my graduate student advisees to this fact. Kiselev [373] proved that when \(\nu =1\) and \(\beta > 3/4\), one could prove that \(\sigma _{ac}(-\Delta +V)=[0,\infty )\). (It was eventually realized that this regime differed from \(\beta > 1\) in that one could also have singular continuous spectrum mixed in). This was then pushed, again when \(\nu =1\), to \(\beta > 1/2\) by Christ–Kiselev [93] and Remling [512]. It seemed natural that the precise borderline was \(V \in L^2\) and in 1999, Deift and Killip (then my Ph.D. student) [108] proved

Theorem 20.3

(Deift–Killip [108]) Let \(V \in L^2({\mathbb {R}},dx)\). Then \(H = -\frac{d^2}{dx^2}+V\) on \(L^2\) has a.c. spectrum \([0,\infty )\) with multiplicity 2.

If \(V(x)=V(|x|)\) is spherically symmetric on \({\mathbb {R}}^\nu \), then (20.1)\(\Rightarrow \int _{0}^{\infty } |V(r)|^2\,dr < \infty \), so the Deift–Killip result implies and is essentially equivalent to Conjecture 20.2 for spherically symmetric V.

Several people have worked quite hard on this conjecture without success (although sometimes they found weaker results that they published). The reader trying to understand the Deift–Killip result should also consult Killip–Simon [368, 369].