Abstract
The present work provides a definitive answer to the problem of quantifying the relaxation to equilibrium of the solution to the spatially homogeneous Boltzmann equation for Maxwellian molecules. Under mild conditions on the initial datum and a weak, physically consistent, angular cutoff hypothesis, our main result (Theorem 1) contains the first precise statement that the total variation distance between the solution and the limiting Maxwellian distribution admits an upper bound of the form \(C e^{\varLambda _b t}\), \(\varLambda _b\) being the least negative eigenvalue of the linearized collision operator and \(C\) a constant depending only on the initial datum. The validity of this quantification was conjectured, about fifty years ago, by Henry P. McKean. As to the proof, we take as point of reference an analogy, highlighted by McKean, between the problem of convergence to equilibrium and the central limit theorem of probability theory.
1 Introduction and new results
On the basis of an analogy pointed out by McKean in [50, 51], a few years ago we started a program which aims at studying the long-time behavior of solutions of some kinetic equations by means of representations which connect these solutions to probability laws of certain weighted sums of independent and identically distributed (i.i.d.) random variables. The discovery of the right representation is comparatively simple for the solution of the spatially homogeneous one-dimensional Kac equation, and it has produced both new results and improvements of existing ones concerning that equation. See [3–5, 17, 32, 33, 38, 39]. Our goal in the present paper is to go back to the original kinetic model, the spatially homogeneous Boltzmann equation for Maxwellian molecules (SHBEMM), which inspired the aforesaid one-dimensional model. The reason for having deferred its treatment is connected, on the one hand, with the mathematical complexity of the subject and, on the other hand, with the hope that useful insights could be derived from the study of simpler allied cases. More specifically, we discuss here the problem of quantifying the “best” rate of relaxation to equilibrium. The starting point of the argument is the new probabilistic representation exhibited in Sect. 1.5 of the present paper.
The last part of the program, to be developed in forthcoming papers, is concerned with the inhomogeneous Boltzmann equation for Maxwellian molecules. Although the assumption of spatial homogeneity adopted here may seem a strong restriction, it is nonetheless proving an interesting and inspiring basis for studying qualitative properties of the complete model.
1.1 The equation
In classical kinetic theory, a gas is thought of as a system of a very large number \(N\) of like particles, described by means of a time-dependent statistical distribution \(\mu (\cdot , t)\) on the phase space \(X \times {\mathbb {R}}^3\), where \(X\) stands for the spatial domain. Then, for any subset \(A\) of \(X \times {\mathbb {R}}^3\), \(\mu (A, t)\) provides an approximation, independent of \(N\), of the statistical frequency of particles in \(A\) at time \(t\). It is worth noting that \(\mu (\cdot , t)\) can also be interpreted, consistently with its statistical meaning, as a genuine probability distribution (p.d.), by viewing \(\mu (A, t)\) as the probability that the position-velocity pair of a randomly selected particle, at time \(t\), belongs to \(A\). See the discussion in Subsection 2.1 in Chapter 2A of [65]. The basic assumptions for the derivation of the classical equation which governs the evolution of \(\mu (\cdot , t)\) are that the gas is dilute (Boltzmann-Grad limit) and that the particles interact via binary, elastic and microscopically reversible collisions. Particles which are just about to collide are viewed as stochastically independent (Boltzmann’s Stosszahlansatz). See [22, 23, 63] for a comprehensive treatment. In this work, we also assume spatial homogeneity, so that the phase space reduces to \({\mathbb {R}}^3\) and the SHBEMM can be written as
where \((\mathbf{v}, t)\) varies in \({\mathbb {R}}^3\times (0, +\infty )\), \(f(\cdot , t)\) stands for a density function of \(\mu (\cdot , t)\) and \(u_{S^2}\) for the uniform p.d. (normalized Riemannian measure) on the unit sphere \(S^2\), embedded in \({\mathbb {R}}^3\). The symbols \(\mathbf{v}_{*}\) and \(\mathbf{w}_{*}\) denote post-collisional velocities which, according to the conservation laws of momentum and kinetic energy, must satisfy
Throughout the paper, \(\mathbf{v}_{*}\) and \(\mathbf{w}_{*}\) are parametrized according to the \(\varvec{\omega }\)-representation, i.e.
where \(\cdot \) denotes the standard scalar product. The angular collision kernel \(b\) is a non-negative measurable function on \([-1, 1]\). Henceforth, for the sake of mathematical convenience, it will be assumed that \(b\) meets the symmetry condition
for all \(x\) in \((-1, 1)\), an assumption which does not reduce the generality of (1), as explained in Subsection 4.1 of Chapter 2A of [65]. In the presence of a general interaction potential governing the mechanism of binary collisions, \(b\) is replaced by a more complex function called collision kernel. See Section 3 of Chapter 2A of [65]. Maxwell [49] was the first to study particles which repel each other with a force inversely proportional to the fifth power of their distance, named Maxwellian molecules after him. In this particular circumstance, the resulting collision kernel turns out to be a specific function only of \(\frac{\mathbf{w}- \mathbf{v}}{|\mathbf{w}- \mathbf{v}|} \cdot \varvec{\omega }\), as in (1), with a non-summable singularity near 0. It is customary, as we do here, to call Maxwellian any collision kernel which is a function only of \(\frac{\mathbf{w}- \mathbf{v}}{|\mathbf{w}- \mathbf{v}|} \cdot \varvec{\omega }\), and to distinguish Maxwellian kernels according to whether they are summable or not. The former case corresponds to a SHBEMM with Grad (angular) cutoff. Without any loss of generality, this condition can be formalized by assuming that
since any SHBEMM with cutoff can be reduced, via a time-scaling, to a SHBEMM with a kernel satisfying (3). The case when \(b\) is not summable corresponds to a SHBEMM of the non-cutoff type. We shall confine ourselves to considering the weak (angular) cutoff, i.e.
This condition is actually satisfied by the explicit form of \(b\) given by Maxwell, namely the only form of \(b\) that has been justified from a physical standpoint.
The first rigorous results on existence and uniqueness, given a probability density function \(f_0\) on \({\mathbb {R}}^3\) as initial datum, were obtained in [53, 67] under the validity of (3). To discuss this question for the SHBEMM with or without cutoff within a unified framework, one needs a reformulation of the problem. Accordingly, the weak version of (1) used throughout this paper reads
where \(\psi \) varies in \(\text {BL}({\mathbb {R}}^3)\), the space of all bounded and Lipschitz-continuous functions defined on \({\mathbb {R}}^3\). This formulation enables us to consider any p.d. \(\mu _0\) on \(({\mathbb {R}}^3, {\fancyscript{B}}({\mathbb {R}}^3))\) as initial datum, \({\fancyscript{B}}({{\mathbb {R}}}^3)\) standing for the Borel class on \({\mathbb {R}}^3\). The term weak solution designates any family \(\{\mu (\cdot , t)\}_{t \ge 0}\) of p.d.’s on \(({\mathbb {R}}^3, {\fancyscript{B}}({\mathbb {R}}^3))\) such that
(i) \(\mu (\cdot , 0) = \mu _0(\cdot )\);
(ii) \(t \mapsto \int _{{\mathbb {R}}^3}\psi (\mathbf{v}) \mu (\mathrm{d}\mathbf{v}, t)\) belongs to \(\text {C}([0, +\infty )) \cap \text {C}^1((0, +\infty ))\) for all \(\psi \) in \(\text {BL}({\mathbb {R}}^3)\);
(iii) \(\int _{{\mathbb {R}}^3}|\mathbf{v}| \mu (\mathrm{d}\mathbf{v}, t) < +\infty \) for all \(t \ge 0\), if \(b\) is not summable but obeys (4);
(iv) \(\mu (\cdot , t)\) satisfies (5) for all \(t > 0\) and for all \(\psi \) in \(\text {BL}({\mathbb {R}}^3)\).
From now on, the term solution of (1) is to be understood as a weak solution, according to the above definition, of the Cauchy problem with initial datum \(\mu _0\). Tanaka [61] gave a rigorous result of existence and uniqueness for weak solutions by probabilistic arguments. The sole assumption required on \(\mu _0\), and only if \(b\) is not summable but obeys (4), is the finiteness of the first absolute moment. Apropos of uniqueness, see also [62].
It should be recalled that, in the non-cutoff case, existence can be recovered from the existence of the solution to the SHBEMM with cutoff, via a truncation procedure originally introduced in [1]. More precisely, given a non-summable \(b\) satisfying (4) and a p.d. \(\mu _0\) on \(({\mathbb {R}}^3, {\fancyscript{B}}({\mathbb {R}}^3))\) with finite first absolute moment, consider the sequence of collision kernels \(\{[b(x) \wedge n]/B_n\}_{n \ge 1}\), with \(B_n := \int _{0}^{1} [b(x) \wedge n] \text {d}x\). Since \([b(x) \wedge n]/B_n\) satisfies (3), one can find the solution \(\mu _n(\cdot , t)\) to (1), with \(b\) replaced by \([b(x) \wedge n]/B_n\) and initial datum \(\mu _0\). Following [1, 31], it can be shown that \(\mu _n(\cdot , B_n t)\) converges weakly as \(n\) goes to infinity to some limit \(\mu (\cdot , t)\), for every \(t \ge 0\), and that \(\mu (\cdot , t)\) turns out to be the solution to the original Cauchy problem. Recall that a sequence \(\{P_n\}_{n \ge 1}\) of p.d.’s on some topological space \(\text {T}\), endowed with its Borel \(\sigma \)-algebra, converges weakly to a p.d. \(P\) on the same space if and only if \(\lim _{n \rightarrow \infty } \int _{\text {T}} h \text {d}P_n = \int _{\text {T}} h \text {d}P\), for every bounded and continuous function \(h\) on \(\text {T}\). Henceforth, this kind of convergence will be denoted with \(P_n \Rightarrow P\).
Apropos of the long-time behavior of \(\mu (\cdot , t)\), a well-known fact is the macroscopic conservation of momentum and kinetic energy, i.e.
for every \(t \ge 0\), which hold true when the hypothesis
is in force. Section 8 of [61] is a reference also for the non-cutoff case. Another fundamental fact is that the equilibrium corresponds to the so-called Maxwellian distribution
which is characterized by the first two moments \(\mathbf{v}_0 = \int _{{\mathbb {R}}^3}\mathbf{v}\mu _0(\text {d}\mathbf{v})\) and \(\sigma ^2 = \frac{1}{3} \int _{{\mathbb {R}}^3}|\mathbf{v}- \mathbf{v}_0|^2 \mu _0(\text {d}\mathbf{v})\). Note that \(\gamma _{\mathbf{v}_0, 0}\) stands for the unit mass \(\delta _{\mathbf{v}_0}\) at \(\mathbf{v}_0\). The already quoted paper [61] proves that, under (4) and (7), \(\mu (\cdot , t)\Rightarrow \gamma _{\mathbf{v}_0, \sigma ^2}\) as \(t\) goes to infinity.
1.2 The conjecture and its motivations
Relaxation to equilibrium of solutions to the Boltzmann equation has been at the core of kinetic theory ever since the work of Boltzmann himself. The importance of accurate estimates of the rate of convergence is tightly connected with the question of the physical value of any convergence statement for solutions of the Boltzmann equation, relative to the time scale on which the Boltzmann description is relevant. See, for example, Section 2 of Chapter 2C of [65]. Within this framework, a first preliminary question arises apropos of the choice of the topology in which this convergence ought to take place, keeping in mind that one is dealing with convergence of probability measures (p.m.’s). In fact, the literature has dealt with a variety of probability metrics, but the total variation distance (t.v.d.) undoubtedly remains a fundamental reference for the study of relaxation to equilibrium in kinetic models. Recall that, for any pair \((\alpha , \beta )\) of p.d.’s on some measurable space \((S, {\fancyscript{S}})\), such a distance is defined by
and that it can be written as
when \(\lambda \) is any \(\sigma \)-finite measure dominating both \(\alpha \) and \(\beta \), and \(p\), \(q\) are probability density functions w.r.t. \(\lambda \) of \(\alpha \) and \(\beta \), respectively. See Chapter III of [60] for more information. Once the right metric has been singled out, the study of convergence to equilibrium is greatly enhanced by knowledge of the rate of approach to the limiting distribution, and even more so by a precise bound on the approximation error at each fixed instant. To introduce the reader to the essential part of the problem, we recall that, for an entire class \({\mathcal {I}}\) of initial data \(\mu _0\), one can prove that
is met with a suitable constant \(C_{*}\) and
This result can be reached from a well-known statement by Ikenberry and Truesdell [46], according to which
holds true with suitable constants \(C_{\varvec{\alpha }}\), for any multi-index \({\varvec{\alpha }}\) such that \(\int _{{\mathbb {R}}^3}|\mathbf{v}|^{|{\varvec{\alpha }}|} \mu _0(\text {d}\mathbf{v}) < +\infty \). Recently, it has been proved that \({\mathcal {I}}\) contains all the p.d.’s \(\mu _0\) satisfying \(\int _{{\mathbb {R}}^3}e^{i \varvec{\xi }\cdot \mathbf{v}} \mu _0(\text {d}\mathbf{v}) = \int _{{\mathbb {R}}} e^{i |\varvec{\xi }| x} \zeta _0(\text {d}x)\) for every \(\varvec{\xi }\) in \({\mathbb {R}}^3\), where \(\zeta _0\) is a symmetric p.d. on \(({\mathbb {R}}, {\fancyscript{B}}({\mathbb {R}}))\) with non-zero kurtosis coefficient. See [33]. Such being the case, inequality (9) is conducive to checking whether it is possible to establish also the reverse relation
Actually, when (9) and (12) are in force simultaneously, \(\varLambda _b\) can be viewed as the best rate of exponential convergence of \(\mu (\cdot , t)\) to equilibrium. The characterization of the largest class of initial data for which (12) is valid is commonly referred to as McKean’s conjecture. The reference to McKean is due to the fact that, relative to the solution \(\mu (\cdot , t)\) of the well-known Kac’s simplification of the SHBEMM, he was the first to prove rigorously, in [50], that \(\text {d}_{\text {TV}}(\mu (\cdot , t), \gamma _{\mathbf{v}_0, \sigma ^2}) \le C^{'} e^{\lambda t}\) holds true with \(\lambda \approx -0.016\) and for a suitable constant \(C^{'}\). However, this value of \(\lambda \) is strictly greater than \(\varLambda _b\), equal to \(-1/4\) in the case of Kac’s equation. See [32].
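As a purely illustrative aside (ours, not part of the original argument), the representation \(\text {d}_{\text {TV}}(\alpha , \beta ) = \frac{1}{2}\int |p - q|\, \text {d}\lambda \) recalled above is easy to check numerically. For two Gaussian densities with common variance \(\sigma ^2\) and means \(m_1, m_2\), the total variation distance has the closed form \(\text {erf}(|m_1 - m_2|/(2\sqrt{2}\,\sigma ))\); the sketch below, in which all function names are our own, compares that value with a direct discretization of the integral.

```python
import numpy as np
from math import erf, sqrt, pi

def tv_distance(p, q, grid):
    """0.5 * integral of |p - q| over a uniform grid (Riemann sum)."""
    dx = grid[1] - grid[0]
    return 0.5 * float(np.sum(np.abs(p(grid) - q(grid))) * dx)

def gaussian(m, s):
    """One-dimensional Gaussian density with mean m and standard deviation s."""
    return lambda x: np.exp(-(x - m) ** 2 / (2 * s ** 2)) / (s * sqrt(2 * pi))

grid = np.linspace(-12.0, 13.0, 2_000_001)
numeric = tv_distance(gaussian(0.0, 1.0), gaussian(1.0, 1.0), grid)
exact = erf(1.0 / (2 * sqrt(2)))  # closed form for unit-variance Gaussians with means 0 and 1
```

For means at distance 1 and unit variance both values are close to 0.3829, which illustrates how far from maximal (namely 1) the t.v.d. of two overlapping densities can be.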
As a completion of the argument, it is interesting to point out the meaning of \(\varLambda _b\) w.r.t. the asymptotic behavior of \(\mu (\cdot , t)\). Besides the important role played in (11), \(\varLambda _b\) represents also the least negative eigenvalue of the linearized collision operator
defined on \({\mathcal {H}} := \text {L}^2({\mathbb {R}}^3, \gamma _{\mathbf{v}_0, \sigma ^2}(\text {d}\mathbf{x}))\). Hilbert [44] was the first to derive this operator from a linearization of (1) and to point out the advantage of choosing the domain \({\mathcal {H}}\) with a view to carrying out the spectral analysis. In this Hilbert setting, \(L_b\) turns out to be self-adjoint and negative with discrete spectrum, and \(|\varLambda _b|\) represents the spectral gap. See [28]. Finally, it is worth recalling that \(\varLambda _b\) arises also in Kac-like derivations of the SHBEMM [47], based on a stochastic evolution of an \(N\)-particle system. See [15, 19].
1.3 A glance at the literature on McKean’s conjecture
The formulation of the Boltzmann H-theorem gave rise to a significant body of mathematical research, aimed at studying convergence to equilibrium in total variation, whose first rigorous outcomes are in [10, 54]. In any case, in spite of the huge literature on this subject, the number of works which expressly pursued the validation of the conjecture is small. Essentially, four lines of research have been followed to achieve the goal, based on: (1) use of contractive functionals or probability metrics; (2) entropy methods; (3) linearization; (4) the central limit theorem. (1) As for the first line of research, the papers [18, 40, 56, 61, 64] are worth mentioning. In particular, Theorem 1.1 in [18] constitutes the closest result to the McKean conjecture obtained so far. It is valid only under (3) and states that, for every \(\varepsilon > 0\), there is \(C_{\varepsilon }(\mu _0, b)\) such that
holds for every \(t \ge 0\), but this \(C_{\varepsilon }\) goes to infinity as \(\varepsilon \) goes to zero. Therefore, the presence of \(\varepsilon \), together with such a behavior of \(C_{\varepsilon }(\mu _0, b)\), defeats the hope of extending (13) to the solution of the SHBEMM of non-cutoff type through the truncation argument explained in Sect. 1.1. This is a strong motivation for pursuing a bound with \(\varepsilon = 0\) and a constant \(C(\mu _0)\) depending only on \(\mu _0\), in place of \(C_{\varepsilon }(\mu _0, b)\). Moreover, (13) has been deduced under rather strong conditions on \(\mu _0(\text {d}\mathbf{x}) = f_0(\mathbf{x}) \text {d}\mathbf{x}\), such as finiteness of all absolute moments, Sobolev regularity and finiteness of the Linnik functional. (2) Entropy methods aim at proving quantitative H-theorems, on the basis of the seminal ideas introduced in [11, 12]. An attempt to improve this strategy, towards the achievement of the McKean conjecture, was the Cercignani conjecture, which, however, proved to be false in the case of Maxwellian molecules. See [9, 66]. Nevertheless, quantitative H-theorems are still considered the most powerful strategy to study relaxation to equilibrium in non-homogeneous frameworks. See [26]. (3) The linearization strategy is outlined in [29, 42]. It gives general positive answers to the problem of quantifying the relaxation to equilibrium only once the solution enters a small neighborhood of the equilibrium itself, so that the spectral analysis of \(L_b\), as an operator on \({\mathcal {H}}\), becomes relevant to the study of the nonlinear problem. It is only recently that, in the case of the homogeneous Boltzmann equation with hard potentials, linearization has been used successfully to prove the conjecture. See [55].
However, the radical difference between the situation of hard potentials and that of Maxwellian molecules hampers a direct extension of the positive conclusion from the former to the latter. (4) Finally, the link with the central limit theorem discovered by McKean in [50, 51] has been taken into serious consideration only recently in [13, 14], two works which have strongly inspired and motivated our program.
1.4 The main result
A precise and complete formulation is encapsulated in the following theorem where \(\hat{\mu }\) stands for the Fourier transform of the p.d. \(\mu \) on \(({\mathbb {R}}^3, {\fancyscript{B}}({\mathbb {R}}^3))\), namely \(\hat{\mu }(\varvec{\xi }) := \int _{{\mathbb {R}}^3}e^{i \varvec{\xi }\cdot \mathbf{v}} \mu (\text {d}\mathbf{v})\) for \(\varvec{\xi }\) in \({\mathbb {R}}^3\).
Theorem 1
Assume that (2) and (4) are in force and that the initial datum \(\mu _0\) satisfies
and
for some strictly positive \(p\). Then, the solution \(\mu (\cdot , t)\) meets
for every \(t \ge 0\), where \(\varLambda _b\) is given by (10) and \(C(\mu _0)\) is a positive constant which depends only on \(\mu _0\).
Indications for the numerical evaluation of \(C(\mu _0)\) can be derived from specific passages of the proof, in Sect. 2.2. With reference to the SHBEMM with cutoff, this theorem represents the first direct validation of the McKean conjecture, without unnecessary extra conditions. Moreover, as far as the non-cutoff case is concerned, the same theorem is, to the best of our knowledge, the only existing sharp quantification of the speed of convergence to equilibrium. A detailed explanation of these points is given in the following
Remarks
1. The proof of Theorem 1 will be developed, in Sect. 2.2, under the cutoff condition (3). Indeed, once (16) has been established under (3), one can resort to the truncation procedure described in Sect. 1.1 to write, for every \(n\) in \({\mathbb {N}}\),
$$\begin{aligned} \text {d}_{\text {TV}}(\mu _n(\cdot , B_n t), \gamma _{\mathbf{v}_0, \sigma ^2}) \le C(\mu _0) \exp \left\{ -2 t \int \limits _{0}^{1} x^2(1 - x^2) [b(x) \wedge n] \text {d}x\right\} . \end{aligned}$$
Now, the combination of this inequality with
$$\begin{aligned} \text {d}_{\text {TV}}(\mu (\cdot , t), \gamma _{\mathbf{v}_0, \sigma ^2}) \le \liminf _{n \rightarrow \infty } \text {d}_{\text {TV}}(\mu _n(\cdot , B_n t), \gamma _{\mathbf{v}_0, \sigma ^2}) \end{aligned}$$
leads to the desired conclusion.
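To make the exponent in the first display of this remark concrete, the integral \(\int _0^1 x^2(1-x^2)[b(x) \wedge n] \text {d}x\) can be evaluated numerically for simple kernels. The sketch below is our own illustration, with invented kernels not taken from the paper: for the summable choice \(b \equiv 1\) the integral equals \(1/3 - 1/5 = 2/15\) (independently of the truncation level), while for a non-summable kernel with a singularity near 0, such as \(b(x) = 1/x\), the truncated integrals increase with \(n\).

```python
import numpy as np

def rate_integral(b, n, num=1_000_000):
    """Midpoint-rule approximation of int_0^1 x^2 (1 - x^2) * min(b(x), n) dx."""
    dx = 1.0 / num
    x = (np.arange(num) + 0.5) * dx          # midpoints of [0, 1]
    return float(np.sum(x**2 * (1.0 - x**2) * np.minimum(b(x), n)) * dx)

# Summable kernel b == 1: the integral is 1/3 - 1/5 = 2/15, whatever n >= 1.
flat = rate_integral(lambda x: np.ones_like(x), n=10)

# A hypothetical non-summable kernel, singular at x = 0, as an illustration:
sing = lambda x: 1.0 / x
values = [rate_integral(sing, n) for n in (1, 10, 100)]  # increasing in n
```

Here `values` increases towards \(\int _0^1 x(1-x^2) \text {d}x = 1/4\), mirroring how the truncated exponents in the display grow with \(n\).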
2. Let us now discuss assumption (14). It is interesting to recall that, under the cutoff condition, convergence in the total variation metric to the Maxwellian holds under (7). See [20]. The necessity of this condition, in a cutoff setting, is stated both in [16] and in Theorem 3 of the present paper. In [20] it is also shown that convergence to equilibrium, under the sole assumption of finiteness of the second moment of \(\mu _0\), can be arbitrarily slow, whereas the finiteness of the \((2+\delta )\)-th absolute moment, for some \(\delta > 0\), is enough to obtain exponentially decreasing bounds. Nevertheless, if \(\delta < 2\), these bounds can be worse than the one conjectured by McKean. Here is an example showing that, even if the tail condition (15) is fulfilled, the desired bound is not achieved because of “infinitesimal” deviations from hypothesis (14). Consider the class of initial data \(\mu ^{(q)}_{0}(\text {d}\mathbf{v}) = f_{0, q}(\mathbf{v}) \text {d}\mathbf{v}\) with
$$\begin{aligned} f_{0, q}(\mathbf{v}) = \frac{q}{4\pi \ |\mathbf{v}|^{3 + q}} 1\!\!1_{\{|\mathbf{v}| \ge 1\}} \end{aligned}$$
for \(q\) in \((3, 4)\). The Fourier transform of this density at \(\varvec{\xi }\) is
$$\begin{aligned} 1 - \frac{q}{6 (q - 2)} |\varvec{\xi }|^2 - \frac{\varGamma (1 - q) \cos (q \pi /2)}{1+q} |\varvec{\xi }|^q - q \sum _{m \ge 2} \frac{(-1)^m |\varvec{\xi }|^{2m}}{(2m + 1)! (2m - q)} \end{aligned}$$
which meets \(\hat{\mu }^{(q)}_{0}(\varvec{\xi }) = O(|\varvec{\xi }|^{-1})\) when \(|\varvec{\xi }|\) goes to infinity. Then, \(\mu ^{(q)}_{0}\) satisfies (15) and has finite absolute moment of order \((3 + \delta )\) for every \(\delta \) in \((0, q - 3)\), but has infinite absolute fourth moment. Denoting by \(\mu ^{(q)}(\cdot , t)\) the solution of (1) relative to \(\mu ^{(q)}_{0}\), one can mimic the argument explained in [33] to prove that
$$\begin{aligned} \text {d}_{\text {TV}}(\mu ^{(q)}(\cdot , t), \gamma _{\mathbf{0}, \sigma ^2}) \ge C_q \exp \{-(1 - 2l_q(b)) t\} \end{aligned}$$
holds for every \(t \ge 0\), where \(3\sigma ^2 = q/(q - 2)\), \(C_q\) is a strictly positive constant independent of \(b\), \(l_q(b) := \int _{0}^{1} (1 - x^2)^{q/2} b(x) \text {d}x\) and \(\varLambda _b < -(1 - 2l_q(b)) < 0\).
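The displayed Fourier transform can be double-checked numerically: for a radial density \(f(\mathbf{v}) = g(|\mathbf{v}|)\) one has \(\hat{\mu }(\varvec{\xi }) = (4\pi /\rho )\int _0^\infty r\, g(r) \sin (\rho r)\, \text {d}r\) with \(\rho = |\varvec{\xi }|\), which for \(f_{0, q}\) reduces to \((q/\rho )\int _1^\infty \sin (\rho r)\, r^{-q-2}\, \text {d}r\). The following sketch is ours, with arbitrary numerical choices of \(q\) and \(\rho \); it compares a quadrature of this integral with the series quoted in the text.

```python
import numpy as np
from math import gamma, cos, pi, factorial

q, rho = 3.5, 0.2   # arbitrary illustrative values: q in (3, 4), small rho

# Radial Fourier transform (q / rho) * int_1^inf sin(rho r) r^{-q-2} dr,
# by the midpoint rule on [1, 200]; the tail beyond 200 is negligible.
dr = 1e-4
r = np.arange(1.0, 200.0, dr) + dr / 2
numeric = (q / rho) * float(np.sum(np.sin(rho * r) * r ** (-q - 2)) * dr)

# Series expansion quoted in the text, with the sum truncated at m = 8.
series = (1.0
          - q / (6 * (q - 2)) * rho ** 2
          - gamma(1 - q) * cos(q * pi / 2) / (1 + q) * rho ** q
          - q * sum((-1) ** m * rho ** (2 * m) / (factorial(2 * m + 1) * (2 * m - q))
                    for m in range(2, 9)))
```

The two values agree to high accuracy, which also confirms that the coefficient of \(|\varvec{\xi }|^2\) is \(q/(6(q-2))\), i.e. one sixth of the second moment \(q/(q-2) = 3\sigma ^2\) appearing in the display above.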
3. As far as the tail assumption (15) is concerned, it is worth noting that it is implied by the finiteness of the Linnik functional, according to Lemma 2.3 in [18]. The relationship between (15) and certain regularity conditions adopted to guarantee the validity of classical local limit theorems of probability theory is also worth noting. See, for example, Theorem 19.1 in [7].
1.5 A probabilistic representation of the solution
The proof of Theorem 1 relies on a representation of the solution \(\mu (\cdot , t)\)—already proposed and studied in [27]—which is valid under the cutoff condition (3). The motivation for this representation is twofold. On the one hand, it leads us to study the problem of convergence to equilibrium from the standpoint of the central limit problem of probability theory. On the other hand, it lends itself to the computation of certain derivatives of the Fourier transform of \(\mu (\cdot , t)\) involved in the first steps of the proof of Theorem 1. See, for example, (58) below. It should be mentioned that the existing representations, essentially based on the Bobylev identity (see Section 3 of [8]), turn out to be unfit for the aforesaid computations.
In a nutshell, the probabilistic representation at issue states that
for every \(t \ge 0\) and every \(B\) in \({\fancyscript{B}}({{\mathbb {R}}}^3)\), where \({\mathsf{E}}_t\) is an expectation and \({\mathcal {M}}\) is a random p.m. connected with a distinguished weighted sum of random vectors, to be defined below. Here and in the rest of the paper we use the term random p.m. to designate any measurable function from some measurable space into the space \(\mathcal {P}({\mathbb {R}}^3)\) of all p.m.’s on \(({\mathbb {R}}^3, {\fancyscript{B}}({\mathbb {R}}^3))\), endowed with the Borel \(\sigma \)-algebra of weak convergence of p.m.’s. See, e.g., Chapters 11–12 of [48] for further details. Then, to carry out our program, it remains to provide the reader with those definitions and preliminary results which are necessary to understand (17). In this way, we shall also present the core of the notation used in the rest of the paper.
The starting point is the introduction of the sample space
where, for any nonempty set \(X\), \(X^{\infty }\) stands for the set of all sequences \((x_1, x_2, \ldots )\) whose elements belong to \(X\), \({\mathbb {T}} := \times _{n \ge 1} {\mathbb {T}}(n)\), and \({\mathbb {T}}(n)\) is the (finite) set of all McKean binary trees with \(n\) leaves. We write \({\mathfrak {t}}_n\) to denote an element of \({\mathbb {T}}(n)\). Then, \({\mathfrak {t}}_{n, k}\) indicates the germination of \({\mathfrak {t}}_n\) at its \(k\)-th leaf, obtained by appending a two-leaved tree to the \(k\)-th leaf of \({\mathfrak {t}}_n\). Finally, \({\mathfrak {t}}_n^l\) and \({\mathfrak {t}}_n^r\) symbolize the two trees, of \(n_l\) and \(n_r\) leaves respectively, obtained by a split-up of \({\mathfrak {t}}_n\). See [13, 14, 37, 50, 51] for a more detailed explanation of these concepts, and [34] for a recent and comprehensive treatment of random trees.
Then, associate with \(\Omega \) the \(\sigma \)-algebra
where \(2^X\) stands for the power set of \(X\) and \({\fancyscript{B}}(X)\) for the Borel class on \(X\). Define
to be the coordinate random variables of \(\Omega \) and, by them, generate the \(\sigma \)-algebras
Now, for every \(t \ge 0\), consider the unique p.d. \({\mathsf{P}}_t\) on \((\Omega , {\fancyscript{F}})\) which makes the random coordinates stochastically independent, consistently with the following marginal p.d.’s:
(a)
$$\begin{aligned} {\mathsf{P}}_t[ \nu = n ] = e^{-t}(1 - e^{-t})^{n-1} \quad \ (n = 1, 2, \ldots ) \end{aligned}$$(18)
with the proviso that \(0^0 := 1\).
(b) \(\{\tau _n\}_{n \ge 1}\) is a Markov sequence driven by
$$\begin{aligned} \begin{array}{l} {\mathsf{P}}_t[\tau _1 = {\mathfrak {t}}_1] = 1 {} \\ {\mathsf{P}}_t[ \tau _{n+1} = {\mathfrak {t}}_{n, k}\ | \ \tau _n = {\mathfrak {t}}_n] = \frac{1}{n} \quad \ \text {for}\ k = 1, \ldots , n \\ {\mathsf{P}}_t[ \tau _{n+1} = {\mathfrak {t}}_{n+1} \ | \ \tau _n = {\mathfrak {t}}_n] = 0 \quad \text {if}\ {\mathfrak {t}}_{n+1} \not \in {\mathbb {G}}({\mathfrak {t}}_n) \end{array} \end{aligned}$$(19)
for every \(n\) in \({\mathbb {N}}\) and \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\), where, for a given \({\mathfrak {t}}_n\), \({\mathbb {G}}({\mathfrak {t}}_n)\) is the subset of \({\mathbb {T}}(n+1)\) containing all the germinations of \({\mathfrak {t}}_n\).
(c) The elements of \(\{\phi _n\}_{n \ge 1}\) are i.i.d. random numbers with p.d.
$$\begin{aligned} \beta (\text {d}\varphi ) := \frac{1}{2} b(\cos \varphi ) \sin \varphi \text {d}\varphi , \quad (\varphi \in [0, \pi ]). \end{aligned}$$(20)
(d) The elements of \(\{\vartheta _n\}_{n \ge 1}\) are i.i.d. with uniform p.d. \(u_{(0, 2\pi )}\) on \((0, 2\pi )\).
(e) The elements of \(\{\mathbf{V}_n\}_{n \ge 1}\) are i.i.d. with p.d. \(\mu _0\), the initial datum of the Cauchy problem relative to (1).
According to the above notation, \({\mathsf{E}}_t\) denotes expectation w.r.t. \({\mathsf{P}}_t\).
A constituent of the representation under study is \({\varvec{\pi }} := \{\pi _{j, n}\ | \ j = 1, \ldots , n; n \in {\mathbb {N}}\}\), an array of \([-1,1]\)-valued random numbers. They are obtained by setting
for \(j = 1, \dots , n\) and \(n\) in \({\mathbb {N}}\). The \(\pi _{j, n}^{*}\)’s are functions on \({\mathbb {T}}(n) \times [0, \pi ]^{n-1}\) defined by putting \(\pi _{1, 1}^{*} \equiv 1\) and, for \(n \ge 2\),
for every \({\varvec{\varphi }} = ({\varvec{\varphi }}^l, {\varvec{\varphi }}^r, \varphi _{n-1})\) in \([0, \pi ]^{n-1}\), with
An induction argument shows that
for every \(n\) in \({\mathbb {N}}\). It is also worth recalling the identity
valid for every \(t, s > 0\), with \(l_s(b) := \int _{0}^{1} (1 - x^2)^{s/2} b(x) \text {d}x\). The original derivation is in [37] but, for the sake of completeness, we have included its proof in Appendix A.1. Throughout the paper, A.\(n\) designates the \(n\)-th subsection of the Appendix. With a view to the proof of Theorem 1, it is interesting to point out that
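As a toy illustration of the array \({\varvec{\pi }}\) (ours; the recursive definition of the \(\pi _{j, n}^{*}\)'s is given in the displays above and is not reproduced here, so the code assumes the standard Kac-type splitting in which the weights of the left subtree are multiplied by \(\cos \varphi \) and those of the right subtree by \(\sin \varphi \)), one can check that tree-indexed weights of this assumed form are energy preserving, in the sense that \(\sum _{j=1}^{n} (\pi _{j, n}^{*})^2 = 1\) for every choice of the angles:

```python
import math

def weights(tree, phis):
    """Kac-type weights (assumed form): multiply by cos(phi) on the left branch
    and sin(phi) on the right branch, consuming one angle per internal node.
    Trees are coded as 'leaf' or [left, right]; returns (pi_1, ..., pi_n)."""
    if tree == "leaf":
        return [1.0]
    phi = phis.pop()                     # angle attached to this internal node
    left = weights(tree[0], phis)
    right = weights(tree[1], phis)
    return ([w * math.cos(phi) for w in left]
            + [w * math.sin(phi) for w in right])

tree = ["leaf", [["leaf", "leaf"], "leaf"]]        # a McKean tree with n = 4 leaves
pi = weights(tree, [0.7, 2.1, 0.4])                # n - 1 = 3 arbitrary angles
total = sum(w * w for w in pi)                     # equals 1 identically
```

The identity \(\cos ^2\varphi + \sin ^2\varphi = 1\) propagates through the recursion, so `total` equals 1 for every tree shape and every angle configuration under this assumed splitting.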
Another constituent of the desired representation is the array \({\mathbf{O }} := \{\text {O}_{j, n} | j = 1, \ldots , n; n \in {\mathbb {N}}\}\) of random matrices \(\text {O}_{j, n}\), taking values in the Lie group \({\mathbb {SO}}(3)\) of \(3 \times 3\) orthogonal matrices with determinant \(+1\), defined by
for \(j = 1, \dots , n\) and \(n\) in \({\mathbb {N}}\). The \(\text {O}_{j, n}^{*}\)’s are \({\mathbb {SO}}(3)\)-valued functions obtained by putting \(\text {O}_{1, 1}^{*} \equiv \text {Id}_{3 \times 3}\) and, for \(n \ge 2\),
for every \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\), \({\varvec{\varphi }}\) in \([0, \pi ]^{n-1}\) and \({\varvec{\theta }}\) in \((0, 2\pi )^{n-1}\). Here
and, finally,
Working out the recursion formula (26) gives
where \(\prod _{h=1}^{n} A_h := A_1 A_2 \cdots A_n\) denotes the ordered matrix product and \(\delta _j({\mathfrak {t}}_n)\) indicates the depth of the \(j\)-th leaf of \({\mathfrak {t}}_n\), that is, the number of generations separating this leaf from the root (the top node of the tree). The \(\epsilon _h({\mathfrak {t}}_n, j)\)’s take values in \(\{l, r\}\) and, in particular, \(\epsilon _1({\mathfrak {t}}_n, j)\) equals \(l\) (\(r\), respectively) if \(j \le n_l\) (\(j > n_l\), respectively). Then,
when \(h \ge 2\). Each \(m_h\) belongs to \(\{1, \dots , n-1\}\) and \(m_1 \ne \dots \ne m_{\delta _j({\mathfrak {t}}_n)}\). In fact, \(m_1({\mathfrak {t}}_n, j) := n-1\) for every \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\), \(j = 1, \dots , n\), and
when \(h \ge 2\).
Now, choose a non-random measurable function \(\text {B}\) from \(S^2\) into \({\mathbb {SO}}(3)\) such that \(\text {B}(\mathbf{u}) \mathbf{e}_3 = \mathbf{u}\) for every \(\mathbf{u}\) in \(S^2\), and define the random functions \(\varvec{\psi }_{j, n} : S^2 \rightarrow S^2\) through the relation
for \(j = 1, \dots , n\) and \(n\) in \({\mathbb {N}}\), with \(\mathbf{e}_3 := (0, 0, 1)^t\). It should be noted that such a \(\text {B}\) actually exists, but that it cannot be continuous. See, e.g., Chapter 5 of [45].
The central object of our construction is the random sum
whose characteristic function (c.f.) provides the new representation, according to
Theorem 2
Assume that (2)–(3) are in force. Then, the function
with \(\varvec{\xi }\) in \({\mathbb {R}}^3{\setminus }\{\mathbf{0}\}\), \(\rho := |\varvec{\xi }|\), \(\mathbf{u}:= \varvec{\xi }/|\varvec{\xi }|\) and \({\varvec{\phi }} := (\phi _1, \dots , \phi _{\nu - 1})\), is the Fourier transform of a random p.d. on \(({\mathbb {R}}^3, {\fancyscript{B}}({\mathbb {R}}^3))\), denoted by \({\mathcal {M}}\). This \({\mathcal {M}}\) turns out to be independent of the choice of \({\mathrm {B}}\), and satisfies (17) for every \(t \ge 0\) and \(B\) in \({\fancyscript{B}}({{\mathbb {R}}}^3)\).
The proof of the theorem is contained in Sect. 2.1. Many relevant properties of \({\mathcal {M}}\) rely on the analysis of the random function
which, as a function of \(\varvec{\xi }\), is not a c.f. and depends on the choice of \(\text {B}\).
One of the merits of representation (17) is that it allows the formulation of a central limit-like theorem for the asymptotic behavior of the solution of the SHBEMM with cutoff, condensed in the following
Theorem 3
When (2)–(3) are in force, \(\mu (\cdot , t)\) converges weakly as \(t\) goes to infinity if and only if (7) holds true. Moreover, in case this condition is satisfied, the limiting distribution is given by (8).
As already mentioned at the beginning of Remark 2, this theorem is well-known. In fact, the “if” part was proved in [20, 61], while the “only if” part was proved, in a quite different way, in [16]. What is new is the proof we develop in Sect. 2.3 on the basis of (17).
2 Proofs
In this section, we present the skeleton of the proofs of Theorems 1, 2 and 3. Some technical issues are deferred to the Appendix and to [30, 31]. We start from the basic representation formulated in Theorem 2.
2.1 Proof of Theorem 2
When (2)–(3) are in force, recall that \(\mu (\cdot , t)\) can be expressed by means of the so-called Wild-McKean sum [51, 67], namely
for every \(t \ge 0\) and \(B\) in \({\fancyscript{B}}({{\mathbb {R}}}^3)\). According to McKean, the weights \(p_n({\mathfrak {t}}_n)\) are defined inductively starting from \(p_1({\mathfrak {t}}_1) := 1\) and then putting
for every \(n \ge 2\) and \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\). These \(p_n\)’s are connected with the p.d. of \(\{\tau _n\}_{n \ge 1}\) through the identity
valid for every \(n\) in \({\mathbb {N}}\) and \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\). See Appendix A.2 for the proof. As far as the \({\mathcal {Q}}_{{\mathfrak {t}}_n}\)’s are concerned,
where \({\mathcal {Q}}\) is an operator which sends a pair \((\zeta , \eta )\) belonging to \(\mathcal {P}({\mathbb {R}}^3)\times \mathcal {P}({\mathbb {R}}^3)\) into a new element \({\mathcal {Q}}[\zeta , \eta ]\) of \(\mathcal {P}({\mathbb {R}}^3)\) according to the following rule. First, take two sequences \(\{\zeta _n\}_{n\ge 1}\) and \(\{\eta _n\}_{n\ge 1}\) of absolutely continuous p.m.’s such that \(\zeta _n \Rightarrow \zeta \) and \(\eta _n \Rightarrow \eta \), and denote by \(p_n\) (\(q_n\), respectively) the density of \(\zeta _n\) (\(\eta _n\), respectively). Then, denoting the limit w.r.t. weak convergence by \(\text {w-lim}\), put
where
Note that \(Q[p, q] = Q[q, p]\), as a consequence of (2). As shown in [31], the limit in (35) exists and is independent of the choice of the approximating sequences \(\{\zeta _n\}_{n\ge 1}\) and \(\{\eta _n\}_{n\ge 1}\).
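The inductive definition of the weights \(p_n({\mathfrak {t}}_n)\) and the fact that they form a p.d. on \({\mathbb {T}}(n)\) can be checked by brute force on small trees. The following sketch is ours, under our reading of the (not reproduced) recursion, namely \(p_n({\mathfrak {t}}_n) = p_{n_l}({\mathfrak {t}}_n^l)\, p_{n - n_l}({\mathfrak {t}}_n^r)/(n-1)\), with \(n_l\) the number of leaves of the left subtree:

```python
from fractions import Fraction

# A binary tree with n leaves is a leaf (None) or a pair (left, right).
def trees(n):
    """All binary trees with n leaves (|T(n)| = Catalan(n-1))."""
    if n == 1:
        return [None]
    out = []
    for nl in range(1, n):
        for tl in trees(nl):
            for tr in trees(n - nl):
                out.append((tl, tr))
    return out

def leaves(t):
    return 1 if t is None else leaves(t[0]) + leaves(t[1])

def p(t):
    """Assumed McKean weight: p(leaf) = 1, p(t) = p(t^l) p(t^r) / (n - 1)."""
    if t is None:
        return Fraction(1)
    n = leaves(t)
    return p(t[0]) * p(t[1]) / (n - 1)

for n in range(1, 7):
    assert sum(p(t) for t in trees(n)) == 1   # the weights form a p.d. on T(n)
```

For instance, with \(n = 3\) the two trees each receive weight \(1/2\), and with \(n = 4\) the weights are \(1/6, 1/6, 1/3, 1/6, 1/6\).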
To carry on with the proof, consider the Fourier transform and apply the well-known Bobylev formula, as in [31], to get
for every \(\varvec{\xi }\) in \({\mathbb {R}}^3{\setminus }\{\mathbf{0}\}\). This, by the change of variable \(\varvec{\omega }= \varvec{\omega }(\varphi , \theta , \varvec{\xi }) = \sin \varphi \cos \theta \mathbf{a}(\mathbf{u}) + \sin \varphi \sin \theta \mathbf{b}(\mathbf{u}) + \cos \varphi \mathbf{u}\), becomes
where \(\rho = |\varvec{\xi }|\), \(\mathbf{u}= \varvec{\xi }/|\varvec{\xi }|\) and \({\varvec{\psi }}^l\), \({\varvec{\psi }}^r\) are abbreviations for the quantities
which depend on the choice of the orthonormal basis \(\{\mathbf{a}(\mathbf{u}), \mathbf{b}(\mathbf{u}), \mathbf{u}\}\) of \({\mathbb {R}}^3\). The components of this basis are exactly the columns of the matrix \(\text {B}\) introduced in (29). The inner integral in (36), that is
has interesting properties, which are at the basis of the new representation (17). In particular, \(I\) is a measurable function of \((\varvec{\xi }, \varphi )\) independent of the choice of \(\{\mathbf{a}(\mathbf{u}), \mathbf{b}(\mathbf{u}), \mathbf{u}\}\). Moreover, for every fixed \(\varphi \) in \([0, \pi ]\), \(I(\cdot , \varphi )\) is the Fourier transform of a p.m. on \(({\mathbb {R}}^3, {\fancyscript{B}}({\mathbb {R}}^3))\), say \({\mathcal {C}}[\zeta , \eta ; \varphi ]\), that is \(I(\varvec{\xi }, \varphi ) = \hat{\mathcal {C}}[\zeta , \eta ; \varphi ](\varvec{\xi })\) for every \(\varvec{\xi }\) in \({\mathbb {R}}^3\). The link with \({\mathcal {Q}}\) is given by
for every \(B\) in \({\fancyscript{B}}({{\mathbb {R}}}^3)\). The proof of these facts is contained in Appendix A.3. At this stage, mimicking the iteration procedure developed for \({\mathcal {Q}}\) leads to the following definition
for every \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\) and \({\varvec{\varphi }}\) in \([0, \pi ]^{n-1}\), with the proviso that \({\varvec{\varphi }}^l\) (\({\varvec{\varphi }}^r\), respectively) is void when \(n_l\) (\(n - n_l\), respectively) is equal to one. For every \(n \ge 2\) and \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\), the mapping \({\varvec{\varphi }} \mapsto {\mathcal {C}}_{{\mathfrak {t}}_n}[\mu _0; {\varvec{\varphi }}]\) is a random p.m. and
holds true for every \(B\) in \({\fancyscript{B}}({{\mathbb {R}}}^3)\), as proved in Appendix A.4. In view of this link, the Wild-McKean sum can be re-written as
which coincides with \({\mathsf{E}}_t\left[ {\mathcal {C}}_{\tau _{\nu }}[\mu _0; (\phi _1, \dots , \phi _{\nu - 1})](B)\right] \). Therefore, to show the validity of (17), it is enough to verify that \({\mathcal {M}}(B) = {\mathcal {C}}_{\tau _{\nu }}[\mu _0; (\phi _1, \dots , \phi _{\nu - 1})](B)\) for every \(B\) in \({\fancyscript{B}}({{\mathbb {R}}}^3)\) or, equivalently, that
hold true for every \(n \ge 2\), \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\), \({\varvec{\varphi }}\) in \([0, \pi ]^{n-1}\) and \(\varvec{\xi }\ne \mathbf{0}\). The \(\mathbf{q}_{j, n}\)’s are defined inductively starting from \(\mathbf{q}_{1, 1}({\mathfrak {t}}_1, \emptyset , \emptyset , \mathbf{u}) := \mathbf{u}\) and then putting
for every \(n \ge 2\), \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\), \({\varvec{\varphi }}\) in \([0, \pi ]^{n-1}\) and \({\varvec{\theta }}\) in \((0, 2\pi )^{n-1}\).
To prove (41), first consider the case when \(n = 2\) and observe that \(\pi _{1, 2}^{*} = \cos \varphi _1\), \(\pi _{2, 2}^{*} = \sin \varphi _1\), \(\mathbf{q}_{1, 2} = \varvec{\psi }^l\), \(\mathbf{q}_{2, 2} = \varvec{\psi }^r\). Then, (41) reduces to (38) with \(\zeta = \eta = \mu _0\). Next, by mathematical induction, assume \(n \ge 3\) and combine (38) with the definition of \({\mathcal {C}}_{{\mathfrak {t}}_n}\) to write
Thus, assuming that (41) holds true for every \(m\) in \(\{1, \dots , n-1\}\) and every tree \({\mathfrak {t}}_{m}\) in \({\mathbb {T}}(m)\), deduce
where \((s, x)\) is \((l, \rho \cos \varphi _{n-1})\) or \((r, \rho \sin \varphi _{n-1})\). To complete the argument, combine the last two equalities with (22) and (43).
As far as the proof of (42) is concerned, start by noting that \(\mathbf{q}_{j, 2}({\mathfrak {t}}_2, \varphi , \theta , \mathbf{u})\) equals \(\text {B}(\mathbf{u})\text {O}_{j, 2}^{*}({\mathfrak {t}}_2, \varphi , \theta ) \mathbf{e}_3\) for \(j = 1, 2\), for every \(\varphi \) in \([0, \pi ]\), \(\theta \) in \((0, 2\pi )\) and \(\mathbf{u}\) in \(S^2\), provided that the basis \(\{\mathbf{a}(\mathbf{u}), \mathbf{b}(\mathbf{u}), \mathbf{u}\}\) in (37) is formed by the three columns of \(\text {B}(\mathbf{u})\). Then, assume \(n \ge 3\) and argue by induction starting from (41), definitions (26) and (43). Whence,
where
and
For the sake of clarity, the integral \(\int _{(0, 2\pi )^{n_l-1}}\) (\(\int _{(0, 2\pi )^{n_r-1}}\), respectively) in (45) must be omitted if \(n_l = 1\) (\(n_r = 1\), respectively), since \(\varvec{\theta }^l\) (\(\varvec{\theta }^r\), respectively) is then void. At this stage, it will be proved that
holds for every \(\rho \) in \({\mathbb {R}}\), \(\mathbf{u}\) in \(S^2\), \(\varvec{\varphi }\) in \([0, \pi ]^{n-1}\) and \(\theta _{n-1}\) in \((0, 2\pi )\). If \(n_l = 1\), the proof of (46) reduces to verifying that
To proceed, note that the third column of \(\text {B}(\varvec{\psi }^l(\varphi _{n-1}, \theta _{n-1}, \mathbf{u}))\) is the same as that of \(\text {B}(\mathbf{u}) \text {M}^l(\varphi _{n-1}, \theta _{n-1})\); hence, there exists an orthogonal matrix
for which \(\text {B}(\varvec{\psi }^l(\varphi _{n-1}, \theta _{n-1}, \mathbf{u})) = \text {B}(\mathbf{u}) \text {M}^l(\varphi _{n-1}, \theta _{n-1}) \text {R}(\alpha )\), where \(\alpha \) depends only on \(\varphi _{n-1}\), \(\theta _{n-1}\) and \(\mathbf{u}\). Now, note that \(\text {R}(\alpha ) \text {M}^s(\varphi , \theta ) = \text {M}^s(\varphi , \theta + \alpha )\) is valid for \(s = l, r\) and for every \(\varphi \) and \(\theta \). Then, when \(n_l \ge 2\), consider the definition of \(P_{j,n}^{l}\), recall (28) and take into account that the product \(\text {R}(\alpha )\text {M}^{\epsilon _{1}({\mathfrak {t}}_n^l, j)}(\varphi _{n_l-1}, \theta _{n_l-1})\) equals \(\text {M}^{\epsilon _{1}({\mathfrak {t}}_n^l, j)}(\varphi _{n_l-1}, \theta _{n_l-1} + \alpha )\). The change of variable \(\theta _{n_l-1}^{'} = \theta _{n_l-1} + \alpha \) transforms the LHS of (46) into
which, in view of (26), turns out to be the same as the RHS of (46). The proof of (42) is completed using (45), after noting that an equality similar to (46) can be stated by changing subscripts and superscripts from \(l\) to \(r\), and replacing \(\text {O}_{j, n}^{*}\) with \(\text {O}_{j + n_l, n}^{*}\).
Finally, the invariance of \({\mathcal {M}}\) w.r.t. \(\text {B}\) is equivalent to the invariance of representation (42) when \(\text {B}(\mathbf{u})\) is replaced by any matrix \(\text {B}^{'}(\mathbf{u})\) having the same characteristics as \(\text {B}(\mathbf{u})\). In any case, such an equivalence follows from the above reasoning.
2.2 Proof of Theorem 1
In the first place, we recall that the entire proof will be developed under hypotheses (2)–(3) on \(b\), in view of Remark 1 in Sect. 1.4. Then, we set a few conditions on \(\mu _0\) to simplify a number of arguments without loss of generality. In this sense, we make use of (6) to assume, from now on,
implying that the limiting Maxwellian is \(\gamma := \gamma _{\mathbf{0}, 1}\). We also assume that the covariance matrix \(V = V[\mu _0]\) of \(\mu _0\) is diagonal. In fact, since for any covariance matrix \(V\) there is an orthogonal matrix \(Q\) such that \(Q V Q^t\) is diagonal, \(\mu _0 \circ f_{Q}^{-1}\) has a diagonal covariance matrix, \(f_Q\) standing for the function \(\mathbf{x}\mapsto Q\mathbf{x}\). At this stage, since \(\text {d}_{\text {TV}}( \mu (\cdot , t) \circ f_{Q}^{-1}, \gamma )\) is equal to \(\text {d}_{\text {TV}}( \mu (\cdot , t), \gamma )\) for every \(t\), we can prove (16) by taking \(\mu _0 \circ f_{Q}^{-1}\) as initial distribution. Compare [31] for a more detailed explanation. Hence, we suppose that
are in force. In fact, the extra conditions (47)–(48) yield the following
Proposition 4
Let \(\mu _0\) satisfy (47)–(48) in addition to the hypotheses of Theorem 1. Then, there exists a constant \(\lambda \) such that
is valid for every \(\varvec{\xi }\) in \({\mathbb {R}}^3\), with \(q = 1/(2 \lceil 2/p \rceil )\).
Here, \(\lceil x \rceil \) indicates the least integer not less than \(x\), while \(p\) is the same as in (15). As to the numerical evaluation of \(\lambda \), the reader is referred to the proof of the proposition in Appendix A.5.
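The reduction to a diagonal covariance matrix performed in this subsection, via an orthogonal \(Q\) making \(Q V Q^t\) diagonal, can be illustrated numerically; a minimal sketch, ours and not part of the proof:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3))
X = rng.normal(size=(20000, 3)) @ A.T      # sample with non-diagonal covariance
V = np.cov(X, rowvar=False)

w, U = np.linalg.eigh(V)                   # spectral decomposition V = U diag(w) U^T
Q = U.T                                    # f_Q : x -> Q x diagonalizes V

Y = X @ Q.T                                # rows of Y are f_Q applied to rows of X
Vd = np.cov(Y, rowvar=False)               # covariance of the push-forward
off = Vd - np.diag(np.diag(Vd))
assert np.max(np.abs(off)) < 1e-8          # off-diagonal entries vanish
```

The design point is that \(Q V Q^t = U^t (U\,\mathrm{diag}(w)\,U^t) U = \mathrm{diag}(w)\), so the push-forward \(\mu _0 \circ f_Q^{-1}\) indeed has diagonal covariance.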
As a first step of the proof proper, an application of (17) yields
After introducing the random number
we put
to define the partition \(\{U, U^c\}\) of \(\Omega \) by
This can be used to write
where \({\mathsf{E}}_t[X; S]\) denotes \(\int _{S} X \text {d}{\mathsf{P}}_t\). The former summand on the right of (52) will be bounded by utilizing the fact that \(U\) has “asymptotically small” probability. As to the latter, it will be shown that \({\mathcal {M}}(\cdot ; \omega )\) has nice analytical properties for each \(\omega \) in \(U^c\), so that a proper bound will be derived from these very same properties. In fact, as \(\text {d}_{\text {TV}}({\mathcal {M}}, \gamma ) \le 1\) entails \({\mathsf{E}}_t[\text {d}_{\text {TV}}({\mathcal {M}}, \gamma ); U] \le {\mathsf{P}}_t(U)\), we get
The inequality \({\mathsf{P}}_t\{\nu \le r\} \le r e^{-t}\) follows from (18), while \({\mathsf{P}}_t\{\prod _{j=1}^{\nu } \pi _{j, \nu } = 0\}\) equals zero since \({\mathsf{P}}_t\{\prod _{j=1}^{\nu } \pi _{j, \nu } = 0 \ | \ \nu , \tau _{\nu }\} = 0\). This claim is obvious on \(\{\nu = 1\}\) while, on \(\{\nu \ge 2\}\),
and the RHS is equal to zero since each \(\phi _j\) has an absolutely continuous law. To complete the evaluation of \({\mathsf{P}}_t(U)\), it is enough to combine the Markov inequality with (24)–(25) to get \({\mathsf{P}}_t\{\text {W}\ge a_{*}\} \le (1/a_{*}) \ {\mathsf{E}}_t[\text {W}] = (2^r r!) \ e^{\varLambda _b t}\). Whence,
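The bound \({\mathsf{P}}_t\{\nu \le r\} \le r e^{-t}\) used above can be checked directly under the geometric law \({\mathsf{P}}_t\{\nu = n\} = e^{-t}(1 - e^{-t})^{n-1}\), which is our reading of (18): one then has \({\mathsf{P}}_t\{\nu \le r\} = 1 - (1 - e^{-t})^r \le r e^{-t}\) by Bernoulli’s inequality. A quick numerical sanity check:

```python
import math

def p_nu_le(r, t):
    """P_t{nu <= r} under the assumed geometric law e^{-t}(1-e^{-t})^{n-1}."""
    return 1.0 - (1.0 - math.exp(-t)) ** r

# Bernoulli's inequality: 1 - (1-x)^r <= r x for x in [0, 1]
for t in (0.5, 1.0, 5.0, 20.0):
    for r in (1, 3, 10):
        assert p_nu_le(r, t) <= r * math.exp(-t) + 1e-15
```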
The argument to deduce a bound for the expectation over \(U^c\) occupies the rest of this subsection. It is based on the following multidimensional extension of a result by Beurling [6].
Proposition 5
Let \(\chi \) be a finite signed measure on \(({\mathbb {R}}^3, {\fancyscript{B}}({\mathbb {R}}^3))\) such that \(\int _{{\mathbb {R}}^3}|\mathbf{x}|^2 |\chi |({\mathrm {d}}\mathbf{x}) < +\infty \), \(|\chi |\) standing for the total variation of \(\chi \). Then,
where \(\varDelta _{\varvec{\xi }}\) denotes the Laplacian operator.
The proof is deferred to Appendix A.6. The applicability of this proposition to \(\chi = {\mathcal {M}}- \gamma \) is made possible by
Proposition 6
If (47) holds and
for \(h = 1, \dots , 2k\) and some integer \(k \ge 2\), then there are positive constants \(g_h\) depending on \(\mu _0\) only through \({\mathfrak {m}}_h\), such that
for \(h = 1, \dots , 2k\) and any choice of \({\mathrm {B}}\), \({\mathsf{P}}_t\)-almost surely. Moreover, \(\rho \mapsto \frac{\partial ^h}{\partial \rho ^h} \hat{{\mathcal {M}}}(\rho \mathbf{u})\) exists for every \(\mathbf{u}\) in \(S^2\) and
\({\mathsf{P}}_t\)-almost surely with \(h = 1, \dots , 2k\), which entails
\({\mathsf{P}}_t\)-almost surely and \(\varvec{\xi }\mapsto \hat{{\mathcal {M}}}(\varvec{\xi }) \in \text {C}^{2k}({\mathbb {R}}^3)\).
See Appendix A.7 for the proof and the numerical evaluation of the constants \(g_h\). At this stage, Proposition 5 yields
To evaluate the integrals on the RHS, we change the variables according to the isometry \(i : {\mathbb {R}}^3{\setminus }\{\mathbf{0}\} \rightarrow (0, +\infty ) \times S^2\) defined by \(i : \varvec{\xi }\mapsto (|\varvec{\xi }|, \varvec{\xi }/|\varvec{\xi }|)\). In view of Theorem 3.11, Example 3.23 and Lemma 3.27 in [41], denoting the \(d\)-dimensional Lebesgue measure by \({\fancyscript{L}}^d\), integrals w.r.t. \({\fancyscript{L}}^3(\text {d}\varvec{\xi })\) become integrals w.r.t. \(4\pi \rho ^2 {\fancyscript{L}}^1 \otimes u_{S^2}(\text {d}\rho \text {d}\mathbf{u})\) and the standard Laplacian \(\varDelta _{\varvec{\xi }}\) changes into \(\varDelta _{(\rho , \mathbf{u})} := \frac{\partial ^2}{\partial \rho ^2} + \frac{2}{\rho }\frac{\partial }{\partial \rho } + \frac{1}{\rho ^2}\varDelta _{S^2}\), where \(\varDelta _{S^2}\) stands for the Laplace–Beltrami operator on \(S^2\). Now, from \(|z_1 + z_2 + z_3|^2 \le 3(|z_1|^2 + |z_2|^2 + |z_3|^2)\), we write
and then we define the random functions
Hence, for the sum of the two integrals on the RHS of (58) we obtain
where
In the following sub-subsections we analyze the integrals appearing in (59), calling inner (outer, respectively) any integral on \((0, \text {R}] \times S^2\) (\((\text {R}, +\infty ) \times S^2\), respectively).
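The passage from \(\varDelta _{\varvec{\xi }}\) to \(\varDelta _{(\rho , \mathbf{u})}\) above can be sanity-checked numerically. The following sketch, ours, takes a radial test function (for which the Laplace–Beltrami term vanishes) and compares the Cartesian Laplacian with \(\frac{\partial ^2}{\partial \rho ^2} + \frac{2}{\rho }\frac{\partial }{\partial \rho }\) by finite differences:

```python
import math

def f(x, y, z):
    # radial test function: depends on xi only through rho = |xi|
    return math.exp(-(x*x + y*y + z*z) / 2)

def lap_cartesian(x, y, z, h=1e-4):
    # standard 7-point finite-difference Laplacian in R^3
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h) - 6 * f(x, y, z)) / h**2

def lap_radial(rho, h=1e-4):
    # radial part of Delta_(rho,u): g'' + (2/rho) g', no angular term here
    g = lambda r: math.exp(-r * r / 2)
    g1 = (g(rho + h) - g(rho - h)) / (2 * h)
    g2 = (g(rho + h) - 2 * g(rho) + g(rho - h)) / h**2
    return g2 + 2.0 / rho * g1

x, y, z = 0.3, -0.7, 0.5
rho = math.sqrt(x*x + y*y + z*z)
assert abs(lap_cartesian(x, y, z) - lap_radial(rho)) < 1e-5
```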
2.2.1 Outer integral of \(\mathrm {I}_1(\rho , \mathbf{u})\)
An application of the inequality \(|z_1 + z_2|^2 \le 2|z_1|^2 + 2|z_2|^2\) yields
and a first proposition is given to analyze those summands which contain the Gaussian c.f.
Proposition 7
Let \(m\) and \(s\) be real numbers with \(m \ge 0\) and \(s \ge 1\), and let \(k\) be in \({\mathbb {N}}_0\). Then, there exists a positive constant \(c(m, s, k)\) such that
holds for every \(x > 0\).
See Appendix A.8 for the proof and an evaluation of \(c(m, s, k)\). At this stage, applying successively the above statement with \((x, m, s, k) = (\text {R}, 2, 8, 0), (\text {R}, 0, 8, 1)\) and \((\text {R}, 2, 8, 2)\) gives
with \(\overline{C}_1 := [2c(2, 8, 0) + 6c(2, 8, 2) + 24c(0, 8, 1)]\).
Then we study those terms on the RHS of (61) which depend on \({\mathcal {M}}\), making use of the next proposition, whose statement involves the random function
with \(\lambda \) and \(q\) as in Proposition 4.
Proposition 8
If (14)–(15) and (47)–(48) are in force, then
and
hold for every \(\rho \) in \([0, {\mathrm {R}}]\), with the exception of a set of \({\mathsf{P}}_t\)-probability zero. Moreover, there are two non-random polynomials \(\wp _1\) and \(\wp _2\) of degree 2 and 4 respectively, with positive coefficients depending only on \(\mu _0\), such that
and
hold for \({k}=1,2\) and every \(\rho \) in \([0, {\mathrm {R}}]\), with the exception of a set of \({\mathsf{P}}_t\)-probability zero.
A complete characterization of \(\wp _1\) and \(\wp _2\) is given in the course of the proof of this proposition, in Appendix A.9. For the sake of completeness, we observe that (64) and (66) hold true for any choice of \(\text {B}\) in (29).
One of the advantages of the splitting (52) consists in the fact that all the realizations of \(\varPsi \) on \(U^c\) share a property of uniform integrability, as shown in the following
Proposition 9
Over \(U^c\), the inequality
is valid for every \(x > 0\), with \(\epsilon := (2 r!)^{-1}\) and \(r\) given by (51). Therefore,
holds true for every \(x > 0\), \(s > 0\) and \(m < (2rqs - 1)\).
See Appendix A.10 for the proof. We are now in a position to complete the study of the outer integral of \(\text {I}_1(\rho , \mathbf{u})\). First, taking into account that \(2rq = 11\), combination of (65) with (69) yields
The applicability of (69) is guaranteed by the fact that, when \(s = 2\) and \(m = 2\), one has \(m < 4rq - 1 = 21\). Second, since (56), (65) and (68) entail
on \(U^c\), after integrating by parts we get
Thus, (68)–(69) with \(s=1\) and \(m=0\) lead to
To study the last integral, we recall that \(\prod _{j=1}^{\nu } \pi _{j, \nu } \ne 0\) on \(U^c\) and then combine (65) with (67)–(68) to prove that
At this stage, after two integrations by parts, we have
and, in view of Proposition 9, the above RHS is bounded by
The final bound can be obtained, via the Tonelli theorem, by starting from (61) and collecting the upper bounds in (62) and (70)–(72). Indeed, these last upper bounds are independent of \(\mathbf{u}\) and are expressed as sums of powers of \(\text {R}\), of order less than or equal to \(-8\). Therefore, recalling (60) and the inequality \(\text {W}\le 1\), we obtain
with
2.2.2 Outer integral of \({\mathrm {I}}_2(\rho , \mathbf{u})\)
As first step, we use the Tonelli theorem to write the outer integral of \(\text {I}_2(\rho , \mathbf{u})\) as
Then, we apply Theorem 3.16 in [41] to obtain
which, by virtue of (65), yields
At this stage, to handle the computations involving the Laplace–Beltrami operator, we define the following plane domains
along with the parametrizations
to form the atlas \({\mathcal {A}}\) on \(S^2\) composed of the charts \(\Omega _k := \mathbf{h}_k(D_k) \subset S^2\) for \(k = 1, \dots , 4\). Then, \(\varDelta _{S^2}^2\) can be expressed in local coordinates as
by virtue of (3.84) in [41], and hence
where \(\varvec{\alpha }\) indicates the multi-index \((\alpha _1, \alpha _2)\), \(\partial _{\varvec{\alpha }}\) stands for the partial derivative \(\frac{\partial ^{\alpha _1 + \alpha _2}}{\partial u^{\alpha _1} \partial v^{\alpha _2}}\), and \(\overline{\varDelta } = 4 (2 + \sqrt{3})^2 (6 + \sqrt{3})\) is the maximum absolute value of the coefficients of \(\varDelta _{(u, v)}^{2}\). To study \(\partial _{\varvec{\alpha }} \ \hat{{\mathcal {M}}}(\rho \mathbf{h}_k(u, v))\) we resort to the multi-dimensional Faà di Bruno formula stated and proved in [25]. Therefore, taking into account that \(| \partial _{\varvec{\alpha }} \ \mathbf{h}_k(u, v) | \le 1\) for every multi-index \(\varvec{\alpha }\), we have
where the \(a_{h,l}\)’s are constants specified in [25], and \({\mathfrak {M}}_l := \int _{{\mathbb {R}}^3}|\mathbf{v}|^l {\mathcal {M}}(\text {d}\mathbf{v})\). At this stage, (74)–(75) and (77)–(78) yield
Moreover, the Lyapunov inequality gives \({\mathfrak {M}}_l \le {\mathfrak {M}}_{4}^{l/4}\) for \(l\) in \([0, 4]\) and then, from (56), we get \({\mathfrak {M}}_4 \le 3\sum _{i = 1}^{3} \left( \lim _{\rho \rightarrow 0}\frac{\partial ^4}{\partial \rho ^4} \hat{{\mathcal {M}}}(\rho \mathbf{e}_i)\right) \le \ 9g_4\). Now, an application of (69) with \(s = 1\) and \(m = h - 2\), combined with (60), leads to
with
2.2.3 Inner integral of \({\mathrm {I}}_1(\rho , \mathbf{u})\)
The analysis is essentially based on certain new Berry–Esseen-type inequalities presented in [30], after observing the analogy between \(\rho \mapsto \hat{{\mathcal {M}}}(\rho \mathbf{u})\) and the c.f. \(\varphi _n(t)\) therein. Indeed, for any \(\mathbf{u}\) in \(S^2\) and for every choice of \(\text {B}\) in (29), each realization of \(\hat{{\mathcal {N}}}(\rho ; \mathbf{u})\), as a function of \(\rho \), coincides with the c.f. of a weighted sum of independent random numbers, according to (32). Moreover, the definition of \(\text {R}\) in (60) corresponds to the upper bound \(\tau \) appearing in the Berry–Esseen-type inequalities proved in [30]. To implement the aforesaid inequalities within the present framework, it is worth introducing the following entities
where \(T(\mathbf{u})\) belongs to \({\fancyscript{H}}\). With this new notation at hand, the Berry–Esseen-type inequality can be re-written as
for \(l = 0, 1, 2\), \(\rho \) in \([0, \text {R}]\) and \(\mathbf{u}\) in \(S^2\), \(u_{2,l}\), \(u_{3,l}\), \(u_{4,l}\), \(v_l\) being non-random rapidly decreasing continuous functions depending only on \(\mu _0\). See [30] for their definition. The above inequality yields
for \(l = 0, 1, 2\), \(m \ge 0\) and \(\mathbf{u}\) in \(S^2\). The integrals \(\overline{u}_{h, l, m} := \int _{0}^{+\infty } u_{h,l}^{2}(\rho ) \rho ^m \text {d}\rho \) and \(\overline{v}_{l, m} := \int _{0}^{+\infty } v_{l}^{2}(\rho ) \rho ^m \text {d}\rho \) are finite and depend only on \(\mu _0\) for \(h = 2, 3, 4\), \(l = 0, 1, 2\) and \(m \ge 0\). As to the above conditional expectation, we have
the latter inequality following from the conditional Markov inequality. Now, we apply (64) and (66) and, after observing that the upper bounds provided therein are \({\fancyscript{G}}\)-measurable, we obtain
for \(l = 1, 2\) and any \(m\) in \([0, 13)\). In addition, by virtue of Proposition 9, the integrals \(\overline{z}_m := \int _{0}^{+\infty } \varPsi ^2(\rho ) \rho ^m \text {d}\rho \) and \(\overline{w}_{l, m} := \int _{0}^{+\infty } \wp _{l}^{2}(\rho ) \varPsi ^2(\rho ) \rho ^m \text {d}\rho \) are finite and depend only on \(\mu _0\) when \(\omega \) varies in \(U^c\). Coming back to the integral of interest, the Tonelli theorem can be applied to write
Since the inner integral on the RHS has already been studied, it remains to explain how it depends on \(\mathbf{u}\). For this, a fundamental role is played by \(\text {B}\), which appears in the RHS of (85) through the random variables \(\text {X}\), \(\text {Y}\) and \(\text {Z}\). Apropos of this, it should be recalled that the so-called hairy ball theorem—see, e.g., Chapter 5 of [45]—asserts that a function \(\text {B}\), meeting the properties specified to write (29), cannot be continuous everywhere. Nevertheless, we know that the definition of \({\mathcal {M}}\) is independent of the choice of \(\text {B}\). We take advantage of this fact to overcome the aforesaid drawback by splitting \(S^2\) into the charts \(\Omega _k\) introduced in the previous subsection and by choosing for each \(\Omega _k\) a specific \(\text {B}\), say \(\text {B}_k\), smooth on \(\overline{\Omega }_k\). This possibility is guaranteed by the fact that \(S^2{\setminus }\overline{\Omega }_k\) contains at least two antipodal points. We now have, by (85) and (87)–(88),
where \(\text {X}_k\), \(\text {Y}_k\), \(\text {Z}_k\) are the same as in (82)–(84) respectively, with \(\text {B} = \text {B}_k\) and
2.2.4 Inner integral of \({\mathrm {I}}_2(\rho , \mathbf{u})\)
With reference to (59), the integral at issue is analyzed by splitting \(S^2\) into the charts \(\Omega _k\) defined in Sect. 2.2.2. On the basis of considerations made apropos of \(\text {B}\) at the end of the previous sub-subsection, here we choose the \(\text {B}_k\)’s as follows:
for \(k = 1, 2\) and
for \(k = 3, 4\). Then, equality \(\hat{{\mathcal {M}}}(\rho \mathbf{u}) = {\mathsf{E}}_t[\hat{{\mathcal {N}}}(\rho ; \mathbf{u})\ | \ {\fancyscript{G}}]\), in combination with the definition of \(T(\mathbf{u})\) in (80), produces this upper bound for \(| \varDelta _{S^2} \hat{{\mathcal {M}}}(\rho \mathbf{u}) |^2\):
for every \(\mathbf{u}\) in \(\Omega _k\), where \({\mathcal {N}}_k\) and \(T_k(\mathbf{u})\) are the same as \({\mathcal {N}}\) and \(T(\mathbf{u})\), respectively, with \(\text {B} = \text {B}_k\). To bound the former summand we make use of the following
Proposition 10
Assume that the tail condition (15) is in force together with the moment assumptions (14) and (47)–(48). Then, there exists a non-random polynomial \(\wp _L\) of degree 6, with positive coefficients which depend only on \(\mu _0\), such that
holds for every \(\rho \) in \([0, {\mathrm {R}}]\), with the exception of a set of \({\mathsf{P}}_t\)-probability zero.
The proof is deferred to Appendix A.9, where \(\wp _L\) is given explicitly. At this stage, we note that the upper bound in (93) is \({\fancyscript{G}}\)-measurable and, afterwards, we apply (86) to obtain
If we consider the random variable \(\int _{0}^{+\infty } \rho ^2\wp _{L}^{2}(\rho )\varPsi ^2(\rho ) \text {d}\rho \) on \(U^c\), then Proposition 9 can be used to conclude that this random variable is bounded by the constant \(J_L := \int _{0}^{1} \rho ^2\wp _{L}^{2}(\rho ) \text {d}\rho + \left( \frac{\lambda ^{2r}}{\epsilon }\right) ^{2q} \int _{1}^{+\infty } \rho ^{-20} \wp _{L}^{2}(\rho )\text {d}\rho \).
In the final part of this sub-subsection we provide an upper bound for the latter summand in the RHS of (92), by means of the following statement which involves new random quantities such as
where the \(\text {M}_{j, n}^{(m)}(\mathbf{u})\)’s are the same as in (81), \(\nabla _{S^2}\) is the Riemannian gradient on \(S^2\) and \(\mid \mid \!\cdot \! \mid \mid _{S^2}\) the Riemannian length.
Proposition 11
Let the moment assumptions (14) and (47)–(48) be in force. Then, for every \(k = 1, \dots , 4\), there exist (non-random) rapidly decreasing continuous functions \(z_1, \dots , z_6\), depending only on \(\mu _0\), such that
holds for every \(\mathbf{u}\) in \(\Omega _k\) and \(\rho \) in \([0, {\mathrm {R}}]\), with the exception of a set of \({\mathsf{P}}_t\)-probability zero. \(\text {X}_{L, k}\), \(\text {Y}_{L, k}\), \(\text {Z}_k\), \(\text {Z}_{G, k}\) and \(\text {Z}_{L, k}\) are defined as in (95)–(98) and (84) with \({\mathrm {B}}_k\) in place of \({\mathrm {B}}\).
For the proof and the definition of the \(z_i\)’s see Appendix A.11. Now, a straightforward application of the above proposition yields
where \(\overline{B}_{i, L} := 6 \int _{0}^{+\infty } z_{i}^{2}(\rho ) \rho ^2 \text {d}\rho \) for \(i = 1, \dots , 6\).
The final bound is achieved by collecting inequalities (92), (94) and (100), according to
2.2.5 The final step
With a view to bounding the RHS of (58), we use the ultimate results of Sects. 2.2.1–2.2.4, encapsulated in (73), (79), (89) and (101) respectively, to write
Then, we proceed by taking expectation of both sides of (102). Apropos of this computation it is worth noting that, if \(\mu _0\) meets the additional conditions
then \(\text {M}_{j, n}^{(2)} \equiv 1\) and \(\text {M}_{j,n}^{(3)} \equiv 0\), implying that all random variables in the RHS of (102) vanish, except for \(\text {W}\). Since \({\mathsf{E}}_t[\text {W}] = e^{\varLambda _b t}\) in view of (24)–(25), the proof of Theorem 1 would be complete. Let us carry on with the computation of the aforesaid expectations to show they all admit an upper bound like \(C e^{\varLambda _b t}\), even under the original more general conditions.
As for the random variables \(\text {X}_k\) and \(\text {X}_{L, k}\), a key role is played by the identity
valid for \(j = 1, \dots , n\), \(n\) in \({\mathbb {N}}\) and \(\mathbf{u}\) in \(S^2\), independently of the choice of \(\text {B}\) in (29). The \(\zeta _{j, n}\)’s are given by
and the \(\zeta _{j, n}^{*}\)’s are defined on \({\mathbb {T}}(n) \times [0, \pi ]^{n-1}\) as follows. Put \(\zeta _{1, 1}^{*} \equiv 1\) and, for \(n \ge 2\),
for every \(\varvec{\varphi }\) in \([0, \pi ]^{n-1}\). The reader is referred to Appendix A.12 for the proof of (103). Combination of (103) with (82) and (95) yields \(\text {X}_{k}(\mathbf{u}) = \big |\sum _{s = 1}^{3} \sigma _{s}^{2} (u_{s}^{2} - 1/3)\big | \cdot \sum _{j = 1}^{\nu } \pi _{j, \nu }^{2} |\zeta _{j, \nu }|\) and \(\text {X}_{L, k}(\mathbf{u}) = \big |\sum _{s = 1}^{3} \sigma _{s}^{2} \varDelta _{S^2}(u_{s}^{2})\big | \cdot \sum _{j = 1}^{\nu } \pi _{j, \nu }^{2} |\zeta _{j, \nu }|\) for \(k = 1, \dots , 4\). Whence,
where
are constants. At this stage, it is worth noticing that
holds for every \(t \ge 0\), with \(f(b) := \int _{0}^{\pi } \sin ^2\varphi \ \big | \frac{3}{2}\sin ^2\varphi - \frac{1}{2} \big | \ \beta (\text {d}\varphi )\). See Appendix A.1. Then, we combine the inequality \(\sin ^2\varphi \ \big | \frac{3}{2}\sin ^2\varphi - \frac{1}{2} \big | + \cos ^2\varphi \ \big | \frac{3}{2}\cos ^2\varphi - \frac{1}{2} \big | \ \le \sin ^4\varphi + \cos ^4\varphi \) with (2) to show that \(\varLambda _b \ge -(1 - f(b))\), i.e. the RHS in (106) approaches zero faster than \(e^{\varLambda _b t}\) as \(t\) goes to infinity. Therefore, we can conclude that
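The trigonometric inequality invoked in this step can be verified pointwise; a quick numerical check of ours, on a grid of \([0, \pi ]\):

```python
import math

def lhs_f(phi):
    # sin^2(phi) |3/2 sin^2(phi) - 1/2| + cos^2(phi) |3/2 cos^2(phi) - 1/2|
    s2, c2 = math.sin(phi)**2, math.cos(phi)**2
    return s2 * abs(1.5 * s2 - 0.5) + c2 * abs(1.5 * c2 - 0.5)

def rhs_f(phi):
    # sin^4(phi) + cos^4(phi)
    return math.sin(phi)**4 + math.cos(phi)**4

for i in range(1, 10000):
    phi = math.pi * i / 10000
    assert lhs_f(phi) <= rhs_f(phi) + 1e-12   # equality only at phi = 0, pi/2, pi
```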
As far as the random variables \(\text {Y}_k\) and \(\text {Y}_{L, k}\) are concerned, we write
with \({\mathfrak {L}}_{3}:= \int _{{\mathbb {R}}^3}|\mathbf{v}|^2 \mathbf{v}\mu _0(\text {d}\mathbf{v})\). Now, the analog of (103) is given by the couple of identities
valid for \(j = 1, \dots , n\), \(n\) in \({\mathbb {N}}\) and \(\mathbf{u}\) in \(S^2\), independently of the choice of \(\text {B}\) in (29), and for \(l_3(\mathbf{u}) := \int _{{\mathbb {R}}^3}[(\mathbf{u}\cdot \mathbf{v})^3 - \frac{3}{5}|\mathbf{v}|^2 (\mathbf{u}\cdot \mathbf{v})]\mu _0(\text {d}\mathbf{v})\). The \(\eta _{j, n}\)’s are given by
while the \(\eta _{j, n}^{*}\)’s are defined on \({\mathbb {T}}(n) \times [0, \pi ]^{n-1}\) as follows. Put \(\eta _{1, 1}^{*} \equiv 1\) and, for \(n \ge 2\),
for every \(\varvec{\varphi }\) in \([0, \pi ]^{n-1}\). The reader is referred to Appendix A.12 for the proof of (110)–(111). Combination of (109)–(111) with (83) and (96) entails \(\text {Y}_{k}(\mathbf{u}) \le |l_3(\mathbf{u})| \cdot \sum _{j = 1}^{\nu } |\pi _{j, \nu }^{3} \eta _{j, \nu }| + \frac{3}{5}|{\mathfrak {L}}_{3}\cdot \mathbf{u}| \text {W}\) and \(\text {Y}_{L, k}(\mathbf{u}) \le |\varDelta _{S^2}l_3(\mathbf{u})| \cdot \sum _{j = 1}^{\nu }|\pi _{j, \nu }^{3} \eta _{j, \nu }| + \frac{3}{5}|\varDelta _{S^2}({\mathfrak {L}}_{3}\cdot \mathbf{u})| \text {W}\) for \(k = 1, \dots , 4\). By elementary inequalities we obtain
where
are constants. At this stage, to compute the expectation in the above inequalities, it is worth highlighting that the identity
holds for every \(t \ge 0\), with \(g(b) := \int _{0}^{\pi } \sin ^4\varphi \ \big | \frac{5}{2}\sin ^2\varphi - \frac{3}{2} \big | \ \beta (\text {d}\varphi )\). See Appendix A.1. Now, we combine the inequality \(\sin ^4\varphi \ \big | \frac{5}{2}\sin ^2\varphi - \frac{3}{2} \big | + \cos ^4\varphi \ \big | \frac{5}{2}\cos ^2\varphi - \frac{3}{2} \big | \ \le \sin ^4\varphi + \cos ^4\varphi \) with (2) to show that \(\varLambda _b \ge -(1 - g(b))\), which says that the RHS in (114) approaches zero faster than \(e^{\varLambda _b t}\) as \(t\) goes to infinity. Relations (109)–(114) lead to
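Likewise, the inequality \(\sin ^4\varphi \, \big | \frac{5}{2}\sin ^2\varphi - \frac{3}{2} \big | + \cos ^4\varphi \, \big | \frac{5}{2}\cos ^2\varphi - \frac{3}{2} \big | \le \sin ^4\varphi + \cos ^4\varphi \) used in this step admits a direct grid check (ours):

```python
import math

def lhs_g(phi):
    # sin^4(phi) |5/2 sin^2(phi) - 3/2| + cos^4(phi) |5/2 cos^2(phi) - 3/2|
    s2, c2 = math.sin(phi)**2, math.cos(phi)**2
    return s2**2 * abs(2.5 * s2 - 1.5) + c2**2 * abs(2.5 * c2 - 1.5)

def rhs_g(phi):
    # sin^4(phi) + cos^4(phi)
    return math.sin(phi)**4 + math.cos(phi)**4

for i in range(1, 10000):
    phi = math.pi * i / 10000
    assert lhs_g(phi) <= rhs_g(phi) + 1e-12
```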
It remains only to deal with the expectations involving \(\text {Z}\), \(\text {Z}_G\) and \(\text {Z}_L\). Unfortunately, unlike the \(\text {X}\)’s and the \(\text {Y}\)’s, it is not possible to write the random variables \(\text {Z}\), \(\text {Z}_G\) and \(\text {Z}_L\) as the product of a given function of \(\mathbf{u}\) and some other random variable independent of \(\mathbf{u}\) and “contracting” in some sense. Nevertheless, such a contraction property can be found for the integrals of the \(\text {Z}\)’s over \(\Omega _k\). Accordingly, we show that the expectations of the last three random variables in (102) admit bounds like \(C e^{\varLambda _b t}\) with \(C\) depending only on \(\mu _0\). To prove this, we apply the Jensen inequality and exploit (48) to get
where \(\varvec{\psi }_{j, n; k}\) is the analog of (29) when \(\text {B}\) is replaced by \(\text {B}_k\), \(\psi _{j, n; k, s}\) denotes its \(s\)-th component and \(\text {S}_{k, s} := \sum _{j = 1}^{\nu } \pi _{j, \nu }^{2}\big ( 3 \psi _{j, \nu ; k, s}^{2} - 1 \big )\). Whence, by a further application of Jensen’s inequality and of an obvious inequality concerning the square root of a sum,
Both the square roots and the squares after the brackets constitute an obstacle for the interchange of the integral with the expectation \({\mathsf{E}}_t\) and for the consequent application of useful properties of conditional expectation. To overcome this difficulty, we resort to the imbedding of the Sobolev space \(\text {W}^{1, 1}(\Omega _k)\) into \(\text {L}^2(\Omega _k)\). See, e.g., Chapter 2 of [2]. Taking the same constants \(A_1(0)\) and \(K(2, 1)\) as in Theorem 2.28 therein, we write
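As a guide for the reader, with \(n = 2\) and \(q = 1\) the imbedding inequality should take a form like the following (this is only our paraphrase; the precise statement and the constants \(A_1(0)\) and \(K(2, 1)\) are those of Theorem 2.28 in [2]):

```latex
\Vert f \Vert_{\mathrm{L}^2(\Omega_k)}
  \;\le\; K(2,1)\,\Vert \nabla f \Vert_{\mathrm{L}^1(\Omega_k)}
  \;+\; A_1(0)\,\Vert f \Vert_{\mathrm{L}^1(\Omega_k)}
  \qquad \text{for every } f \in \mathrm{W}^{1,1}(\Omega_k).
```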
where \({\fancyscript{D}}\) can be \(\text {Id}\), \(\nabla _{S^2}\), \(\varDelta _{S^2}\), and \(\big |{\fancyscript{D}}\text {S}_{k, s}\big |\) is to be interpreted in accordance with the meaning of \({\fancyscript{D}}\). To work out the last term in the above inequality, we use (5.1.25) in [60] to say that
holds true \({\mathsf{P}}_t\)-almost surely. Moreover, when \({\fancyscript{D}}\) is \(\text {Id}\) or \(\varDelta _{S^2}\), the Leibniz rule for the gradient entails
When \({\fancyscript{D}}\) is \(\nabla _{S^2}\), the definition of the Hessian as symmetric bilinear form leads to
for every vector field \(V\), \(D\) standing for the Levi-Civita connection. See Exercise 11 in Chapter 6 of [21]. Whence,
where \(\Vert \cdot \Vert _{*}\) denotes the \(\text {L}^2\)-norm of the Hessian given by \(\Vert \text {Hess}_{S^2}[\text {S}_{k, s}] \Vert _{*}^2 := \sum _{ij} [\text {Hess}_{S^2}[\text {S}_{k, s}](V_i, V_j)]^2\) for some orthonormal basis \(\{V_1, V_2\}\) of vector fields. At this stage, it is useful to emphasize that, in view of (121)–(123), the latter summand in (120) can be bounded by a sum of terms sharing the same structure as the former summand. Then, to provide an effective bound for the RHS of (120), it is enough to prove that
holds for some suitable constant \(c({\fancyscript{D}}^{'})\), \({\fancyscript{D}}^{'}\) being one of the following operators: \(\text {Id}\), \(\nabla _{S^2}\), \(\varDelta _{S^2}\), \(\nabla _{S^2}\varDelta _{S^2}\), \(\text {Hess}_{S^2}\). For the proof of (124), cf. Appendix A.13. Now, we are in a position to write explicit bounds for the last three terms in (102), which read
with
To conclude, we gather (24)–(25), (107)–(108), (115)–(116), (125)–(127) and we resort to (58) and (102) to obtain
with
Finally, we recall (52) and combine the last inequality with (53).
2.3 Proof of Theorem 3
Without any loss of generality, we prove the sufficiency of (7) for the weak convergence to the Maxwellian distribution under the extra conditions (47)–(48) and
with \(\delta := -\varLambda _b/16\). This last assumption is not restrictive since the Cauchy problem associated with (1) is autonomous and \(\max _{i = 1, 2, 3} \big |\int _{{\mathbb {R}}^3}v_{i}^{2}\mu (\text {d}\mathbf{v}, t) - 1\big |\) approaches zero as \(t\) goes to infinity. See [29, 31, 46]. The argument proceeds, as in Section 9.1 of [24], on the basis of the Lévy continuity theorem. Therefore, fix \(\varvec{\xi }\ne \mathbf{0}\) and write
where \(\rho = |\varvec{\xi }|\), \(\mathbf{u}= \varvec{\xi }/|\varvec{\xi }|\) and \(T^2 := \sum _{j = 1}^{\nu } \pi _{j, \nu }^2 \varvec{\psi }_{j, \nu }^{t}(\mathbf{u}) V[\mu _0] \varvec{\psi }_{j, \nu }(\mathbf{u})\). As to the first summand in (129), use (6)–(7) in Section 9.1 of [24] to obtain
with \(\varepsilon > 0\) and \(A_j(\varepsilon ) := \{\mathbf{v}\in {\mathbb {R}}^3\ \big | \ |\pi _{j, \nu } (\varvec{\psi }_{j, \nu }(\mathbf{u}) \cdot \mathbf{v})| \ge \varepsilon |T|\}\) for \(j = 1, \dots , \nu \). Then, one has \(\sigma _{*}^2 := \min \{\sigma _{1}^2, \sigma _{2}^2, \sigma _{3}^2\} \le T^2 \le 3\) and
with \(\pi _o := \max _{1 \le j \le \nu } |\pi _{j, \nu }|\). Put \(M(y) := \int _{\{|\mathbf{v}| \ge 1/y\}} |\mathbf{v}|^2 \mu _0(\text {d}\mathbf{v})\) for \(y > 0\) and note that \(M\) is a monotonically increasing bounded function satisfying \(\lim _{y \downarrow 0} M(y) = 0\). Moreover, from
one can conclude that
holds true for every strictly positive \(\varepsilon \). At this stage, take \(\varepsilon = \sqrt{\pi _o}\) and combine (130)–(132) to get
To complete the analysis of the first summand in the RHS of (129), one shows that the expectation of the RHS of (133) approaches zero as \(t\) goes to infinity, for every \(\rho \) in \([0, +\infty )\). Indeed, for any monotonically increasing bounded function \(g : (0, \infty ) \rightarrow (0, \infty )\) satisfying \(\lim _{x \downarrow 0} g(x) = 0\), one has
for every \(z\) in \((0, \infty )\). By virtue of (24)–(25), \({\mathsf{E}}_t[\pi _{o}^4] \le e^{\varLambda _b t}\) and, after choosing \(z = -\varLambda _b/8\), one obtains \(\lim _{t \rightarrow +\infty } {\mathsf{E}}_t[g(\pi _{o})] = 0\). This argument, applied with \(g(x) = M\left( \frac{\sqrt{x}}{\sigma _{*}}\right) \rho ^2 + \sqrt{27 x} \rho ^3 + \frac{9}{8} x^2 \rho ^4\), leads to the desired result. As far as the latter summand in (129) is concerned, a plain application of (17) implies that \({\mathsf{E}}_t[e^{- T^2 \rho ^2/2}]\) can be thought of as the Fourier transform of the solution of (1) when the initial datum coincides with \(\prod _{i = 1}^{3} \frac{1}{\sigma _i \sqrt{2\pi }} \exp \{-\frac{v_{i}^{2}}{2 \sigma _{i}^{2}}\} \text {d}v_i\), where the \(\sigma _i\)’s have been fixed initially. Now, in view of (128), this initial datum belongs to a convenient neighborhood of the equilibrium \(\gamma \)—according to Theorem 1.1 in [29]—so that
holds true for every \(t \ge 0\) with the same \(C_{*}\) as in the above-quoted theorem.
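Backing up a step, the claim that \({\mathsf{E}}_t[g(\pi _o)]\) vanishes as \(t \rightarrow +\infty \) rests on a splitting of the expectation combined with Markov’s inequality; our reconstruction of the estimate reads

```latex
{\mathsf{E}}_t[g(\pi_o)]
  \;\le\; g\big(e^{-z t}\big) + \big(\sup g\big)\, {\mathsf{P}}_t\big[\pi_o > e^{-z t}\big]
  \;\le\; g\big(e^{-z t}\big) + \big(\sup g\big)\, e^{4 z t}\, {\mathsf{E}}_t\big[\pi_o^4\big],
```

so that, in view of \({\mathsf{E}}_t[\pi _o^4] \le e^{\varLambda _b t}\) and the choice \(z = -\varLambda _b/8\), the last term is bounded by \((\sup g)\, e^{\varLambda _b t/2} \rightarrow 0\), while \(g(e^{-zt}) \rightarrow 0\) by the assumptions on \(g\).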
As to the necessity of (7), suppose that \(\mu (\cdot , t)\) converges weakly to some limit as \(t\) goes to infinity. Following a technique developed in [35], the argument starts with the introduction of the random vector
defined on \((\Omega , {\fancyscript{F}})\). To explain the three right-most symbols above, one fixes an arbitrary point \(\mathbf{u}_0\) in \(S^2\) and defines:
-
(i)
\(\varvec{\lambda } := \{\lambda _1(\cdot ), \dots , \lambda _{\nu }(\cdot ), \delta _0(\cdot ), \delta _0(\cdot ), \dots \}\) to be the sequence of random p.d.’s on \(({\mathbb {R}}, {\fancyscript{B}}({\mathbb {R}}))\) such that \(\hat{\lambda }_j(\xi ) := \hat{\mu }_0(\xi \pi _{j, \nu } \varvec{\psi }_{j, \nu }(\mathbf{u}_0))\), for \(j = 1, \dots , \nu \) and \(\xi \) in \({\mathbb {R}}\).
-
(ii)
\(\varLambda \) to be the random p.d. on \(({\mathbb {R}}, {\fancyscript{B}}({\mathbb {R}}))\) obtained as convolution of all elements of \(\varvec{\lambda }\), i.e. \(\varLambda = \lambda _1 *\dots *\lambda _{\nu }\).
-
(iii)
\(\mathbf{U} := \{U_1, U_2, \dots \}\) to be the sequence of random numbers defined by \(U_k := \max _{1 \le j \le \nu } \lambda _j\left( \left[ -\frac{1}{k}, \frac{1}{k}\right] ^c\right) \) for every \(k\) in \({\mathbb {N}}\).
To grasp the usefulness of \(W\), one can note that its components are the essential ingredients of the central limit problem for independent uniformly asymptotically negligible summands. See Sections 16.6-9 of [36]. Apropos of the negligibility condition, it is easy to prove that \(\lim _{t \rightarrow +\infty } {\mathsf{P}}_t[U_k > \alpha ] = 0\) holds for every \(k\) in \({\mathbb {N}}\) and for every \(\alpha \) in \((0, +\infty )\). In fact, the inclusion \(\{\mathbf{v}\in {\mathbb {R}}^3\ \big | \ |\pi _{j, \nu } \varvec{\psi }_{j, \nu } \cdot \mathbf{v}| \ge 1/k\} \subset \{\mathbf{v}\in {\mathbb {R}}^3\ \big | \ |\pi _{j, \nu } \mathbf{v}| \ge 1/k\}\) entails
To conclude, apply the argument used to prove Lemma 2 in [38]. Now, think of the range of \(W\) as a subset of
where: \(\overline{{\mathbb {N}}} := \{1, 2, \dots , +\infty \}\) and \(\overline{{\mathbb {T}}}\) are the one-point compactifications of \({\mathbb {N}}\) and \({\mathbb {T}}\), respectively; \(\overline{\mathbb {R}} := [-\infty , +\infty ]\); \({\mathcal {P}}(X)\) is the space of all p.d.’s on \(X\). Here, \({\mathcal {P}}(\overline{\mathbb {R}})\) is metrized, consistently with the topology of weak convergence, in such a way that it turns out to be a separable, compact and complete metric space. Cf. Section 6.II of [57]. Then, \({\mathbb {S}}\) is a separable, compact and complete metric space w.r.t. the product topology and so the family of probability distributions \(\{{\mathsf{P}}_t\circ W^{-1}\}_{t \ge 0}\) is tight. This implies that any sequence \(\{\mathsf{P}_{t_m} \circ W^{-1}\}_{m \ge 1}\), when \(t_m\) strictly increases to infinity, contains a subsequence \(\{\mathsf{Q}_l\}_{l \ge 1}\), with \(\mathsf{Q}_l := \mathsf{P}_{t_{m_l}} \circ W^{-1}\), which converges weakly to a p.d. \(\mathsf{Q}\). It is worth noting that, thanks to the weak convergence of \(\mu (\cdot , t)\), \(\mathsf{Q}\) is supported by
This claim can be verified by recalling Lemma 3 in [38]. Since \({\mathbb {S}}\) is Polish, one can now invoke the Skorokhod representation theorem (see Theorem 4.30 in [48]). Therefore, there are a probability space \((\tilde{\Omega }, \tilde{\fancyscript{F}}, \tilde{{\mathsf{P}}})\) and \({\mathbb {S}}\)-valued random elements on it, say \(\tilde{W}_l = \big (\tilde{\nu }_l, \{\tilde{\tau }_{n, l}\}_{n \ge 1}, \{\tilde{\phi }_{n, l}\}_{n \ge 1}, \{\tilde{\vartheta }_{n, l}\}_{n \ge 1}, \tilde{\varvec{\lambda }}_l, \tilde{\varLambda }_l, \tilde{\mathbf{U}}_l\big )\) and \(\tilde{W}_{\infty }\), which have respective p.d.’s \(\mathsf{Q}_l\) and \(\mathsf Q \), for every \(l\) in \({\mathbb {N}}\). Moreover, for every \(\tilde{\omega }\) in \(\tilde{\Omega }\), one has \(\tilde{W}_l(\tilde{\omega }) \rightarrow \tilde{W}_{\infty }(\tilde{\omega })\) (in the metric of \({\mathbb {S}}\)) as \(l\) goes to infinity, which entails
\(\tilde{\varLambda }_{\infty }\) being an element of \({\mathcal {P}}({\mathbb {R}})\). The distributional properties of \(\tilde{W}_l\) imply that \(\tilde{\varLambda }_l\) is the convolution of the elements of \(\tilde{\varvec{\lambda }}_l\), and that \(\tilde{U}_{k, l}\) coincides with \(\max _{1 \le j \le \tilde{\nu }_l} \tilde{\lambda }_{j, l}\left( \left[ -\frac{1}{k}, \frac{1}{k}\right] ^c\right) \) for every \(k\) in \({\mathbb {N}}\), \(\tilde{{\mathsf{P}}}\)-almost surely. For convenience, denote by \(q^{(s)}\) the symmetrized form of the p.d. \(q\), i.e. \(\widehat{q^{(s)}}(\cdot ) := |\hat{q}(\cdot )|^2\). Now, (134) entails \(\tilde{\varLambda }_{l}^{(s)} \Rightarrow \tilde{\varLambda }_{\infty }^{(s)}\) for every \(\tilde{\omega }\) in \(\tilde{\Omega }\) and the combination of this fact with Theorem 24 in Chapter 16 of [36] yields
with the exception of a set of points \(\tilde{\omega }\) of \(\tilde{{\mathsf{P}}}\)-probability 0. The final argument is split into the following steps. First,
where the \(\tilde{\pi }\)’s and \(\tilde{\varvec{\psi }}\)’s denote the counterparts, in the Skorokhod representation, of the \(\pi \)’s and \(\varvec{\psi }(\mathbf{u}_0)\)’s, \(\tilde{\pi }_{l, o} := \max _{1 \le j \le \tilde{\nu }_l} |\tilde{\pi }_{j, \tilde{\nu }_l}|\) and the inequality is a consequence of the inclusion \(\{\mathbf{v}\in {\mathbb {R}}^3\ \big | \ |\tilde{\pi }_{j, \tilde{\nu }_l} \tilde{\varvec{\psi }}_{j, \tilde{\nu }_l} \cdot \mathbf{v}| \le \varepsilon \} \supset \{\mathbf{v}\in {\mathbb {R}}^3\ \big | \ \tilde{\pi }_{l, o} |\mathbf{v}| \le \varepsilon \} \). Second, define \(d = d(\tilde{\omega }; j, l)\) to be an element of \(\{1, 2, 3\}\) for which \(\tilde{\psi }_{j, \tilde{\nu }_l; d}^2 = \max _{1 \le i \le 3} \tilde{\psi }_{j, \tilde{\nu }_l; i}^2\). Note that \(\tilde{\psi }_{j, \tilde{\nu }_l; d}^2\) must be no smaller than \(1/3\) since \(\tilde{\varvec{\psi }}_{j, \tilde{\nu }_l}\) belongs to \(S^2\), for every \(\tilde{\omega }\) in \(\tilde{\Omega }\), \(l\) in \({\mathbb {N}}\) and \(j = 1, \dots , \tilde{\nu }_l\). Then,
where \(\tilde{s}_{h, l}\) denotes the sum of those \(\tilde{\pi }_{j, \tilde{\nu }_l}^{2}\) for which \(d(\tilde{\omega }; j, l) = h\). At this stage, observe that \(\tilde{\pi }_{l, o}\) goes to zero with probability one as \(l\) goes to infinity, in view of Lemma 1 in [38]. Since \(\sum _{h = 1}^{3} \tilde{s}_{h, l} = 1\) with probability one, there are some \(\tilde{\omega }\) and \(h\), say \(\tilde{\omega }_{*}\) and \(h_{*}\), such that \(\tilde{\pi }_{l, o}(\tilde{\omega }_{*}) \rightarrow 0\) and \(\overline{\lim }_l \tilde{s}_{h_{*}, l}(\tilde{\omega }_{*})\) is strictly positive. Then,
which shows that the \(h_{*}\)-th marginal of \(\mu _{0}^{(s)}\)—and hence also the \(h_{*}\)-th marginal of \(\mu _0\)—has finite second moment. To complete the proof, observe that \(h_{*}\) can be determined independently of \(\mu _0\) and that weak convergence of \(\mu (\cdot , t)\) entails weak convergence of \(\mu (\cdot , t) \circ f_{Q}^{-1}\), \(f_{Q}\) being the map \(\mathbf{v}\mapsto Q\mathbf{v}\) and \(Q\) an orthogonal matrix. Hence, since \(\mu (\cdot , t) \circ f_{Q}^{-1}\) turns out to be the solution of (1) with initial datum \(\mu _0 \circ f_{Q}^{-1}\) (cf. [31]), the above argument can be used to prove that \(\int _{{\mathbb {R}}^3}v_{h_{*}}^{2} \mu _0 \circ f_{Q}^{-1}(\text {d}\mathbf{v})\) is finite, where \(h_{*}\) is invariant w.r.t. \(Q\) and \(\mu _0\). Finally, choose \(f_Q\) equal first to \((v_1, v_2, v_3) \mapsto (v_2, v_3, v_1)\) and then to \((v_1, v_2, v_3) \mapsto (v_3, v_1, v_2)\).
Notes
It should be noted that condition (2) is tantamount to assuming that the counterpart of \(b\) in the \({\varvec{\sigma }}\)-representation is an even function.
References
Arkeryd, L.: Intermolecular forces of infinite range and the Boltzmann equation. Arch. Ration. Mech. Anal. 77, 11–21 (1981)
Aubin, T.: Nonlinear Analysis on Manifolds. Monge-Ampère Equations. Springer, New York (1982)
Bassetti, F., Ladelli, L.: Self-similar solutions in one-dimensional kinetic models: a probabilistic view. Ann. Appl. Probab. 22, 1928–1961 (2012)
Bassetti, F., Ladelli, L., Matthes, D.: Central limit theorem for a class of one-dimensional kinetic equations. Probab. Theory Relat. Fields 150, 77–109 (2010)
Bassetti, F., Ladelli, L., Regazzini, E.: Probabilistic study of the speed of approach to equilibrium for an inelastic Kac model. J. Stat. Phys. 133, 683–710 (2008)
Beurling, A.: Sur les intégrales de Fourier absolument convergentes et leur application à une transformation fonctionnelle. In: 9th Congr. Math. Scandinaves, Tryckeri, Helsinki, 1938, pp. 199–210, Helsinki (1939) [See also: The Collected Works of Arne Beurling, vol. 2. Harmonic Analysis (L. Carleson, P. Malliavin, V. Neuberger and J. Wermer, eds.). Birkhäuser, Boston (1989)]
Bhattacharya, R.N., Rao, R.R.: Normal Approximation and Asymptotic Expansions. Wiley, New York (1976)
Bobylev, A.V.: The theory of the nonlinear spatially uniform Boltzmann equation for Maxwell molecules. Math. Phys. Rev. 7, 111–233 (1988)
Bobylev, A.V., Cercignani, C.: On the rate of entropy production for the Boltzmann equation. J. Stat. Phys. 94, 603–618 (1999)
Carleman, T.: Sur la théorie de l’equation intégrodifferentielle de Boltzmann. Acta Math. 60, 91–146 (1932)
Carlen, E.A., Carvalho, M.C.: Strict entropy production bounds and stability of the rate of convergence to equilibrium for the Boltzmann equation. J. Stat. Phys. 67, 575–608 (1992)
Carlen, E.A., Carvalho, M.C.: Entropy production estimates for Boltzmann equation with physically realistic collision kernels. J. Stat. Phys. 74, 743–782 (1994)
Carlen, E.A., Carvalho, M.C., Gabetta, E.: Central limit theorem for Maxwellian molecules and truncation of the Wild expansion. Commun. Pure Appl. Math. 53, 370–397 (2000)
Carlen, E.A., Carvalho, M.C., Gabetta, E.: On the relation between rates of relaxation and convergence of Wild sums for solutions of the Kac equation. J. Funct. Anal. 220, 362–387 (2005)
Carlen, E.A., Carvalho, M.C., Loss, M.: Determination of the spectral gap for Kac’s master equation and related stochastic evolution. Acta Math. 191, 1–54 (2003)
Carlen, E.A., Gabetta, E., Regazzini, E.: On the rate of explosion for infinite energy solutions of the spatially homogeneous Boltzmann equation. J. Stat. Phys. 129, 699–723 (2007)
Carlen, E.A., Gabetta, E., Regazzini, E.: Probabilistic investigation on the explosion of solutions of the Kac equation with infinite energy initial distribution. J. Appl. Probab. 45, 95–106 (2008)
Carlen, E.A., Gabetta, E., Toscani, G.: Propagation of smoothness and the rate of exponential convergence to equilibrium for a spatially homogeneous Maxwellian gas. Commun. Math. Phys. 199, 521–546 (1999)
Carlen, E.A., Geronimo, J.S., Loss, M.: Determination of the spectral gap in the Kac model for physical momentum and energy-conserving collisions. SIAM J. Math. Anal. 40, 327–364 (2008)
Carlen, E.A., Lu, X.: Fast and slow convergence to equilibrium for Maxwellian molecules via Wild sums. J. Stat. Phys. 112, 59–134 (2003)
do Carmo, M.P.: Riemannian Geometry. Birkhäuser, Boston (1992)
Cercignani, C.: The Boltzmann Equation and its Applications. Springer, New York (1988)
Cercignani, C., Illner, R., Pulvirenti, M.: The Mathematical Theory of Dilute Gases. Springer, New York (1994)
Chow, Y.S., Teicher, H.: Probability Theory. Independence, Interchangeability, Martingales, 3rd edn. Springer, New York (1997)
Constantine, G.M., Savits, T.H.: A multivariate Faà di Bruno formula with applications. Trans. Am. Math. Soc. 348, 503–520 (1996)
Desvillettes, L., Villani, C.: On the trend to global equilibrium for spatially inhomogeneous kinetic systems: the Boltzmann equation. Invent. Math. 159, 245–316 (2005)
Dolera, E.: Rapidity of convergence to equilibrium of the solution of the Boltzmann equation for Maxwellian molecules. Ph.D. thesis, Università degli Studi di Pavia (2010)
Dolera, E.: On the computation of the spectrum of the linearized Boltzmann collision operator for Maxwellian molecules. Boll. Unione Mat. Ital. (9) 4, 47–68 (2011)
Dolera, E.: Spatially homogeneous Maxwellian molecules in a neighborhood of the equilibrium. Ist. Lombardo Accad. Sci. Lett. Rend. A. 145 (2011). arXiv:1206.3425
Dolera, E.: Estimates of the approximation of weighted sums of conditionally independent random variables by the normal law. J. Inequal. Appl. 2013, 320 (2013)
Dolera, E.: Mathematical treatment of the homogeneous Boltzmann equation for Maxwellian molecules in the presence of singular kernels. arXiv:1306.5133
Dolera, E., Gabetta, E., Regazzini, E.: Reaching the best possible rate of convergence to equilibrium for solutions of Kac’s equation via central limit theorem. Ann. Appl. Probab. 19, 186–209 (2009)
Dolera, E., Regazzini, E.: The role of the central limit theorem in discovering sharp rates of convergence to equilibrium for the solution of the Kac equation. Ann. Appl. Probab. 20, 430–461 (2010)
Drmota, M.: Random Trees. An interplay between Combinatorics and Probability. Springer, Wien (2009)
Fortini, S., Ladelli, L., Regazzini, E.: A central limit problem for partially exchangeable random variables. Theory Probab. Appl. 41, 224–246 (1996)
Fristedt, B., Gray, L.: A Modern Approach to Probability Theory. Birkhäuser, Boston (1997)
Gabetta, E., Regazzini, E.: Some new results for McKean’s graphs with applications to Kac’s equation. J. Stat. Phys. 125, 947–974 (2006)
Gabetta, E., Regazzini, E.: Central limit theorem for the solution of the Kac equation. Ann. Appl. Probab. 18, 2320–2336 (2008)
Gabetta, E., Regazzini, E.: Central limit theorems for the solutions of the Kac equation: speed of approach to equilibrium in weak metrics. Probab. Theory Relat. Fields 146, 451–480 (2010)
Gabetta, E., Toscani, G., Wennberg, B.: Metrics for probability distributions and the trend to equilibrium for solutions of the Boltzmann equation. J. Stat. Phys. 81, 901–934 (1995)
Grigoryan, A.: Heat Kernel and Analysis on Manifolds. American Mathematical Society, Providence (2009)
Grünbaum, A.: Linearization for the Boltzmann equation. Trans. Am. Math. Soc. 165, 425–449 (1972)
Hardy, G.H., Littlewood, J.E., Polya, G.: Inequalities, 2nd edn. Cambridge University Press, Cambridge (1952) (reprinted in 1994)
Hilbert, D.: Begründung der kinetischen Gastheorie. Math. Ann. 72, 562–577 (1912)
Hirsch, M.W.: Differential Topology. Springer, New York (1976)
Ikenberry, E., Truesdell, C.: On the pressures and the flux of energy in a gas according to Maxwell’s kinetic theory. I. J. Ration. Mech. Anal. 5, 1–54 (1956)
Kac, M.: Foundations of kinetic theory. In: Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, vol. 3, pp. 171–197. University of California Press, Berkeley (1956)
Kallenberg, O.: Foundations of Modern Probability, 2nd edn. Springer, New York (2002)
Maxwell, J.C.: On the dynamical theory of gases. Philos. Trans. R. Soc. Lond. Ser. A 157, 49–88 (1867)
McKean Jr., H.P.: Speed of approach to equilibrium for Kac’s caricature of a Maxwellian gas. Arch. Ration. Mech. Anal. 21, 343–367 (1966)
McKean Jr., H.P.: An exponential formula for solving Boltzmann’s equation for a Maxwellian gas. J. Combin. Theory 2, 358–382 (1967)
Merris, R.: Combinatorics, 2nd edn. Wiley, New York (2003)
Morgenstern, D.: General existence and uniqueness proof for the spatially homogeneous solutions of the Maxwell–Boltzmann equation in the case of Maxwellian molecules. Proc. Natl. Acad. Sci. USA 40, 719–721 (1954)
Morgenstern, D.: Analytical studies related to the Maxwell–Boltzmann equation. J. Ration. Mech. Anal. 4, 533–555 (1955)
Mouhot, C.: Rate of convergence to equilibrium for the spatially homogeneous Boltzmann equation with hard potentials. Commun. Math. Phys. 261, 629–672 (2006)
Murata, H., Tanaka, H.: An inequality for certain functional of multidimensional probability distributions. Hiroshima Math. J. 4, 75–81 (1974)
Parthasarathy, K.R.: Probability Measures on Metric Spaces. Academic Press, New York (1967) (reprinted in 2005 by AMS Chelsea, Providence)
Petrov, V.V.: Limit Theorems of Probability Theory. Sequences of Independent Random Variables. The Clarendon Press, Oxford University Press, New York (1995)
Sansone, G.: Orthogonal Functions. Interscience Publishers, New York (1959) (reprinted in 1991 by Dover Publications, New York)
Stroock, D.W.: Probability Theory. An Analytic View, 2nd edn. Cambridge University Press, Cambridge (2011)
Tanaka, H.: Probabilistic treatment of the Boltzmann equation of Maxwellian molecules. Z. Wahrsch. Verw. Gebiete 46, 67–105 (1978)
Toscani, G., Villani, C.: Probability metrics and uniqueness of the solution of the Boltzmann equation for a Maxwell gas. J. Stat. Phys. 94, 619–637 (1999)
Truesdell, C., Muncaster, R.: Fundamentals of Maxwell’s Kinetic Theory of a Simple Monatomic Gas. Academic Press, New York (1980)
Villani, C.: Fisher information estimates for Boltzmann’s collision operator. J. Math. Pures Appl. 77, 821–837 (1998)
Villani, C.: A review of mathematical topics in collisional kinetic theory. In: Friedlander, S., Serre, D. (eds.) Handbook of Mathematical Fluid Dynamics, vol. 1, pp. 71–305. North-Holland, Amsterdam (2002)
Villani, C.: Cercignani’s conjecture is sometimes true and always almost true. Commun. Math. Phys. 234, 455–490 (2003)
Wild, E.: On Boltzmann’s equation in kinetic theory of gases. Proc. Camb. Philos. Soc. 47, 602–609 (1951)
Acknowledgments
The authors would like to thank Professor Eric Carlen for his constructive and helpful comments and constant encouragement. They also acknowledge the advice of Professor Francesco Bonsante.
Additional information
Work partially supported by MIUR-2008MK3AFZ.
Appendix A
Gathered here are the proofs of unproved propositions and formulas scattered throughout Sects. 1 and 2.
1.1 A.1 Proof of (24), (106) and (114)
Fix \(s > 0\) and define
These functions satisfy the relations
for every \(n\) in \({\mathbb {N}}\), \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\) and for some suitable constant \(\alpha \). This claim is checked for each of them, following a common scheme of reasoning. First, \(\text {A}_{1}^{(s)}(1, {\mathfrak {t}}_1) = \text {A}_2(1, {\mathfrak {t}}_1) = \text {A}_3(1, {\mathfrak {t}}_1) = 1\) holds by definition. Then, to obtain the latter identity in (137) as regards \(\text {A}_{1}^{(s)}\), utilize (22) in the equality
Thus, \(\alpha = \int _{0}^{\pi } |\cos \varphi |^s \beta (\text {d}\varphi ) = \int _{0}^{\pi } |\sin \varphi |^s \beta (\text {d}\varphi ) = l_s(b)\), where the validity of the exchange of \(\cos \) with \(\sin \) is a consequence of (2). As to \(\text {A}_2\), use (22) and (105) in
to show that \(\text {A}_2\) satisfies the latter identity in (137) with \(\alpha = f(b)\). Passing to \(\text {A}_3\), consider (22) and (113) in conjunction with
to verify that \(\text {A}_3\) meets the latter identity in (137) with \(\alpha = g(b)\).
At this stage, since \(\delta _j({\mathfrak {t}}_n^l) + 1 = \delta _j({\mathfrak {t}}_n)\) for \(j = 1, \dots , n_l\) and \(\delta _j({\mathfrak {t}}_n^r) + 1 = \delta _{j+n_l}({\mathfrak {t}}_n)\) for \(j = 1, \dots , n_r\), an induction argument yields \(\text {A}(n, {\mathfrak {t}}_n) = \sum _{j = 1}^{n} \alpha ^{\delta _j({\mathfrak {t}}_n)}\), where \(\delta _j\) is the depth defined in Sect. 1.5. By the concept of germination explained in Sect. 1.5, \(\delta _j({\mathfrak {t}}_{n, k}) = \delta _j({\mathfrak {t}}_n) + \delta _{j, k} + \delta _{j, k+1}\) for \(j = 1, \dots , k+1\), with \(\delta _{r, s}\) standing for the Kronecker delta, and \(\delta _j({\mathfrak {t}}_{n, k}) = \delta _{j+1}({\mathfrak {t}}_n)\) for \(j = k+2, \dots , n\). Then, the specific form of \(\text {A}(n, {\mathfrak {t}}_n)\) shows that
holds for every \(n\) in \({\mathbb {N}}\) and \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\). Now, since \({\mathsf{E}}_t[\text {A}(n+1, \tau _{n+1})\ | \ \tau _n = {\mathfrak {t}}_n] = \sum _{k = 1}^{n}\text {A}(n+1, {\mathfrak {t}}_{n, k}){\mathsf{P}}_t[\tau _{n+1} = {\mathfrak {t}}_{n, k}\ | \ \tau _n = {\mathfrak {t}}_n]\), (19) and (138) imply that \(a_n := {\mathsf{E}}_t[\text {A}(n, \tau _n)]\) satisfies \(a_1 = 1\) and \(a_{n+1} = \left( 1 + \frac{2\alpha - 1}{n}\right) a_n\) for every \(n\) in \({\mathbb {N}}\). Hence, if \((1 - 2\alpha )\) does not belong to \({\mathbb {N}}\), \(a_n = \frac{\varGamma (n + 2\alpha - 1)}{\varGamma (n) \varGamma (2\alpha )}\) for every \(n\) in \({\mathbb {N}}\). Otherwise, if \((1 - 2\alpha ) = m\) for some \(m\) in \({\mathbb {N}}\), then \(a_n = (-1)^{n+1} \left( {\begin{array}{c}m-1\\ n-1\end{array}}\right) \) for \(n = 1, \dots , m\) and \(a_n = 0\) for \(n > m\). Finally, note that the expectations in (24), (106) and (114) coincide with \({\mathsf{E}}_t[\text {A}_{1}^{(s)}]\), \({\mathsf{E}}_t[\text {A}_2]\) and \({\mathsf{E}}_t[\text {A}_3]\) respectively, and that \({\mathsf{E}}_t[\text {A}(\nu , \tau _{\nu })\ | \ \nu ] = a_{\nu }\), in view of the stochastic independence of \(\nu \) and \(\{\tau _n\}_{n \ge 1}\). Therefore, conclude by observing that \({\mathsf{E}}_t[a_{\nu }] = \sum _{n = 1}^{\infty } a_n e^{-t} (1 - e^{-t})^{n-1} = e^{-(1 - 2\alpha )t}\).
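The closed form \({\mathsf{E}}_t[a_{\nu }] = e^{-(1 - 2\alpha )t}\) can be verified numerically; the sketch below (illustrative only, with arbitrary test values of \(\alpha \) and \(t\)) compares a truncation of the series with the stated exponential, generating the \(a_n\)’s through the recursion \(a_{n+1} = (1 + \frac{2\alpha - 1}{n}) a_n\) to avoid gamma-function overflow:

```python
import math

# Check sum_{n>=1} a_n e^{-t} (1 - e^{-t})^{n-1} = e^{-(1 - 2*alpha)*t},
# where a_1 = 1 and a_{n+1} = (1 + (2*alpha - 1)/n) * a_n,
# i.e. a_n = Gamma(n + 2*alpha - 1) / (Gamma(n) * Gamma(2*alpha)).
def series(t, alpha, terms=4000):
    x = 1.0 - math.exp(-t)
    a_n, total = 1.0, 0.0
    for n in range(1, terms + 1):
        total += a_n * math.exp(-t) * x ** (n - 1)
        a_n *= (n + 2 * alpha - 1) / n  # recursion for a_{n+1}
    return total

for alpha in (0.3, 0.45, 0.7):
    for t in (0.5, 1.0, 2.0):
        assert abs(series(t, alpha) - math.exp(-(1 - 2 * alpha) * t)) < 1e-8
```

The identity itself is just the negative binomial series \(\sum _{m \ge 0} \binom{m + 2\alpha - 1}{m} x^m = (1 - x)^{-2\alpha }\) evaluated at \(x = 1 - e^{-t}\).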
1.2 A.2 Probability law of \(\{\tau _n\}_{n \ge 1}\)
The aim is to show that the coefficient \(p_n({\mathfrak {t}}_n)\) in the Wild–McKean sum is equal to \({\mathsf{P}}_t[\tau _n = {\mathfrak {t}}_n]\) for every \(n\). Proceeding by mathematical induction, observe that the assertion is trivially true for \(n = 1, 2\). To treat the case \(n \ge 3\), introduce the symbol \({\mathbb {P}}({\mathfrak {t}}_n)\) to denote the subset of \({\mathbb {T}}(n-1)\) consisting of the trees which produce \({\mathfrak {t}}_n\) by germination. Whence,
the last equality being valid thanks to the inductive hypothesis. Now,
and, by (33), the RHS turns out to be equal to
At this stage, observe that
and that the same procedure yields \(\sum _{\begin{array}{c} {\mathfrak {s}}_{n-1}\in {\mathbb {P}}({\mathfrak {t}}_n)\\ {\mathfrak {s}}_{n-1}^r = {\mathfrak {t}}_n^r \end{array}} p_{n_l - 1}({\mathfrak {s}}_{n-1}^l) = (n_l - 1) p_{n_l}({\mathfrak {t}}_n^l)\). To complete the proof it is enough to combine the previous equations and to recall that \(n = n_l + n_r\).
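The identity \(p_n({\mathfrak {t}}_n) = {\mathsf{P}}_t[\tau _n = {\mathfrak {t}}_n]\) can also be checked by brute force for small \(n\). The sketch below is only an illustration: it assumes the standard Wild recursion \(p_1 = 1\), \(p_n({\mathfrak {t}}_n) = p_{n_l}({\mathfrak {t}}_n^l)\, p_{n_r}({\mathfrak {t}}_n^r)/(n - 1)\), and that under uniform germination \({\mathsf{P}}_t[\tau _n = {\mathfrak {t}}_n]\) equals the number of germination histories of \({\mathfrak {t}}_n\) (linear extensions of its internal-node poset) divided by \((n-1)!\):

```python
from math import comb, factorial
from fractions import Fraction

def trees(n):
    # all ordered binary trees with n leaves, as nested tuples
    if n == 1:
        return ["leaf"]
    return [(l, r) for k in range(1, n)
            for l in trees(k) for r in trees(n - k)]

def leaves(t):
    return 1 if t == "leaf" else leaves(t[0]) + leaves(t[1])

def wild(t):
    # Wild coefficient: p(leaf) = 1, p((l, r)) = p(l) * p(r) / (n - 1)
    if t == "leaf":
        return Fraction(1)
    return wild(t[0]) * wild(t[1]) / (leaves(t) - 1)

def extensions(t):
    # number of germination histories = interleavings of the two subtrees'
    # internal nodes (n_l - 1 and n_r - 1 of them) times their own counts
    if t == "leaf":
        return 1
    return comb(leaves(t) - 2, leaves(t[0]) - 1) * extensions(t[0]) * extensions(t[1])

for n in range(1, 8):
    for t in trees(n):
        assert wild(t) == Fraction(extensions(t), factorial(n - 1))
```

For instance, the two trees with three leaves each receive probability \(1/2\) under both descriptions.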
1.3 A.3 A few interesting characteristics of \({\mathcal {C}}[\zeta , \eta ; \varphi ]\)
The first point concerns the invariance of (38) w.r.t. the choice of \(\{\mathbf{a}(\mathbf{u}), \mathbf{b}(\mathbf{u}), \mathbf{u}\}\). Fix \(\varvec{\xi }\ne \mathbf{0}\) and let \(\{\mathbf{a}(\mathbf{u}), \mathbf{b}(\mathbf{u}), \mathbf{u}\}\) and \(\{\mathbf{a}^{'}(\mathbf{u}), \mathbf{b}^{'}(\mathbf{u}), \mathbf{u}\}\) be distinct positive bases. Then, write \(\varvec{\psi }^l\) and \(\varvec{\psi }^r\) in (37) with \(\{\mathbf{a}^{'}(\mathbf{u}), \mathbf{b}^{'}(\mathbf{u}), \mathbf{u}\}\) in the place of \(\{\mathbf{a}(\mathbf{u}), \mathbf{b}(\mathbf{u}), \mathbf{u}\}\). Since there exists some \(\theta ^{*}\) in \([0, 2\pi )\) such that \(\mathbf{a}^{'} = \cos \theta ^{*} \mathbf{a} - \sin \theta ^{*} \mathbf{b}\) and \(\mathbf{b}^{'} = \sin \theta ^{*} \mathbf{a} + \cos \theta ^{*} \mathbf{b}\), the change of basis gives
After substituting these expressions in (38), the desired conclusion follows from an obvious change of variable.
To prove the measurability of \((\varvec{\xi }, \varphi ) \mapsto I(\varvec{\xi }, \varphi )\), resort to Proposition 9 in Section 9.3 of [36], so that it is enough to verify the continuity of \(\varphi \mapsto I(\varvec{\xi }, \varphi )\) for each fixed \(\varvec{\xi }\) and the measurability of \(\varvec{\xi }\mapsto I(\varvec{\xi }, \varphi )\) for each fixed \(\varphi \). The former claim follows from the form of the dependence on \(\varphi \) in (37)–(38). To verify the latter, one can show that \(\varvec{\xi }\mapsto I(\varvec{\xi }, \varphi )\) is also continuous for each fixed \(\varphi \). Continuity at \(\varvec{\xi }= \mathbf{0}\) can be derived from the relation \(|\varvec{\psi }^l| = |\varvec{\psi }^r| = 1\) and an ensuing application of the dominated convergence theorem. To check continuity at \(\varvec{\xi }^{*} \ne \mathbf{0}\), take a sequence \(\{\varvec{\xi }_n\}_{n \ge 1}\) converging to \(\varvec{\xi }^{*}\) and observe that \(|\varvec{\xi }_n| \rightarrow |\varvec{\xi }^{*}|\) and \(\mathbf{u}_n := \varvec{\xi }_n/|\varvec{\xi }_n| \rightarrow \mathbf{u}^{*} := \varvec{\xi }^{*}/|\varvec{\xi }^{*}|\). Fix a small open neighborhood \(\Omega (\mathbf{u}^{*}) \subset S^2\) of \(\mathbf{u}^{*}\) in such a way that \(S^2 {\setminus }\overline{\Omega (\mathbf{u}^{*})}\) contains at least two antipodal points. In view of the first part of this appendix, choose a distinguished basis in such a way that the restrictions of \(\mathbf{u}\mapsto \mathbf{a}(\mathbf{u})\) and \(\mathbf{u}\mapsto \mathbf{b}(\mathbf{u})\) to \(\Omega (\mathbf{u}^{*})\) vary continuously.
As a consequence, \(\varvec{\psi }^l(\varphi , \theta , \mathbf{u}_n)\) converges to \(\varvec{\psi }^l(\varphi , \theta , \mathbf{u}^{*})\) and \(\varvec{\psi }^r(\varphi , \theta , \mathbf{u}_n)\) converges to \(\varvec{\psi }^r(\varphi , \theta , \mathbf{u}^{*})\) for every \(\varphi \) in \([0, \pi ]\) and \(\theta \) in \((0, 2\pi )\), and the convergence of \(I(\varvec{\xi }_n, \varphi )\) to \(I(\varvec{\xi }^{*}, \varphi )\) follows again from the dominated convergence theorem. To show that \(\varvec{\xi }\mapsto I(\varvec{\xi }, \varphi )\) is a c.f. for every \(\varphi \) in \([0, \pi ]\), resort to the multivariate version of the Bochner characterization. See Exercise 3.1.9 in [60]. The only point that requires some care is positivity. If this property were not in force, one could find a positive integer \(N\), two \(N\)-vectors \((\omega _1, \dots , \omega _N)\) and \((\varvec{\xi }_1, \dots , \varvec{\xi }_N)\) in \({\mathbb {C}}^N\) and \(({\mathbb {R}}^3)^N\) respectively, and some \(\varphi ^{*}\) in \([0, \pi ]\) in such a way that \(\sum _{j = 1}^{N} \sum _{k = 1}^{N} \omega _j \overline{\omega }_k I(\varvec{\xi }_j - \varvec{\xi }_k, \varphi ^{*}) < 0\). Note that the LHS of this inequality is a real number since \(I(-\varvec{\xi }, \varphi ) = \overline{I(\varvec{\xi }, \varphi )}\) for any \(\varvec{\xi }\) and \(\varphi \). Hence, by continuity of \(\varphi \mapsto I(\varvec{\xi }, \varphi )\), there exists an open interval \(J\) in \([0, \pi ]\) containing \(\varphi ^{*}\) such that \(\varphi \mapsto \sum _{j = 1}^{N} \sum _{k = 1}^{N} \omega _j \overline{\omega }_k I(\varvec{\xi }_j - \varvec{\xi }_k, \varphi )\) is strictly negative on \(J\). Now, choose a specific \(b_{*}\) for which the resulting p.m. in (20), say \(\beta _{*}\), is supported by \(\overline{J}\). 
By construction, \(L := \int _{0}^{\pi } \sum _{j = 1}^{N} \sum _{k = 1}^{N} \omega _j \overline{\omega }_k I(\varvec{\xi }_j - \varvec{\xi }_k, \varphi ) \beta _{*}(\text {d}\varphi )\) is a strictly negative number, a fact which immediately leads to a contradiction. Indeed, denote by \({\mathcal {Q}}_{*}[\zeta , \eta ]\) the RHS of (35) when \(b_{*}\) replaces \(b\) in the definition of \(Q[p, q]\). Observe that \({\mathcal {Q}}_{*}[\zeta , \eta ]\) is in any case a p.m. even if \(b_{*}\) does not meet (2). Now, \(L\) must be equal to \(\sum _{j = 1}^{N} \sum _{k = 1}^{N} \omega _j \overline{\omega }_k \hat{\mathcal {Q}}_{*}[\zeta , \eta ](\varvec{\xi }_j - \varvec{\xi }_k)\), thanks to (36), and this quantity must be non-negative, from the Bochner criterion again.
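As a sanity check on the positivity step, one can verify numerically that the Gram matrix built from a genuine characteristic function is positive semidefinite. The sketch below uses the standard trivariate Gaussian c.f. \(e^{-|\varvec{\xi }|^2/2}\) as a stand-in for \(I(\cdot , \varphi )\); the points \(\varvec{\xi }_1, \dots , \varvec{\xi }_8\) are arbitrary.

```python
import numpy as np

# Bochner positivity: for a characteristic function phi, the matrix
# G[j, k] = phi(xi_j - xi_k) is positive semidefinite for any xi_1, ..., xi_N.
def cf_gaussian(xi):
    # c.f. of the standard Gaussian on R^3, a stand-in for I(., phi)
    return np.exp(-0.5 * np.sum(xi**2, axis=-1))

rng = np.random.default_rng(0)
xis = rng.normal(size=(8, 3))                        # N = 8 arbitrary points in R^3
G = cf_gaussian(xis[:, None, :] - xis[None, :, :])   # Gram matrix G[j, k]
min_eig = np.linalg.eigvalsh(G).min()
assert min_eig > -1e-12   # no strictly negative quadratic form can occur
```

Any strictly negative value of the double sum in the text would contradict exactly this positive semidefiniteness.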
To prove (39), start by verifying that \(\varphi \mapsto {\mathcal {C}}[\zeta , \eta ; \varphi ]\) is measurable, which is tantamount to checking that \(\varphi \mapsto {\mathcal {C}}[\zeta , \eta ; \varphi ](K)\) is measurable for every \(K = \mathsf{X}_{i=1}^{3} (-\infty , x_i]\), in view of Lemma 1.40 of [48]. To this aim, fix such a \(K\) and use the Fubini theorem to show that
is measurable, since \((\varvec{\xi }, \varphi ) \mapsto \hat{\mathcal {C}}[\zeta , \eta ; \varphi ](\varvec{\xi })\) is. To complete the argument, invoke the inversion formula and note that \({\mathcal {C}}[\zeta , \eta ; \varphi ](K)\) is equal to the limit of the above expression as \(c \uparrow +\infty \), \(a_m \downarrow -\infty \) and \(b_m \downarrow x_m\) for \(m = 1, 2, 3\). This paves the way for writing the integral in (39), and the equality therein follows from (36) and (38) in view of the injectivity of the Fourier-transform operator.
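The inversion formula invoked here can be illustrated in one dimension. The following sketch, with the standard Gaussian c.f. \(e^{-\xi ^2/2}\) playing the role of the transform and a truncated midpoint quadrature replacing the limit \(c \uparrow +\infty \), recovers \(\mu ((a, b])\):

```python
import numpy as np
from statistics import NormalDist

# Levy inversion: mu((a, b]) = lim_{c->oo} (1/2pi) int_{-c}^{c}
#   (e^{-i xi a} - e^{-i xi b}) / (i xi) * phi(xi) d xi,
# checked with the standard Gaussian c.f. phi(xi) = e^{-xi^2/2}.
a, b = -1.0, 0.5
xi = np.linspace(-40.0, 40.0, 400001)
dxi = xi[1] - xi[0]
xi = xi + dxi / 2                                   # midpoints, avoids xi = 0
integrand = (np.exp(-1j*xi*a) - np.exp(-1j*xi*b)) / (1j*xi) * np.exp(-xi**2/2)
approx = integrand.real.sum() * dxi / (2*np.pi)     # truncated integral
exact = NormalDist().cdf(b) - NormalDist().cdf(a)   # = mu((a, b])
assert abs(approx - exact) < 1e-5
```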
A.4 Proof of (40)
The first step is to show that \(\varvec{\varphi } \mapsto {\mathcal {C}}_{{\mathfrak {t}}_n}[\mu _0; \varvec{\varphi }]\) is measurable as a map from \([0, \pi ]^{n-1}\) into \(\mathcal {P}({\mathbb {R}}^3)\). Mimicking the argument in the last part of A.3, it suffices to verify the measurability of \((\varvec{\xi }, \varvec{\varphi }) \mapsto \hat{\mathcal {C}}_{{\mathfrak {t}}_n}[\mu _0; \varvec{\varphi }](\varvec{\xi })\) by means of Proposition 9 in Section 9.3 of [36]. On the one hand, the function \(\varvec{\xi }\mapsto \hat{\mathcal {C}}_{{\mathfrak {t}}_n}[\mu _0; \varvec{\varphi }](\varvec{\xi })\) is continuous for every fixed \(\varvec{\varphi }\). On the other hand, measurability of \(\varvec{\varphi } \mapsto \hat{\mathcal {C}}_{{\mathfrak {t}}_n}[\mu _0; \varvec{\varphi }](\varvec{\xi })\), for every fixed \(\varvec{\xi }\), can be proved by induction. When \(n = 1\), \(\hat{\mathcal {C}}_{{\mathfrak {t}}_1}[\mu _0; \emptyset ](\varvec{\xi })\) is independent of \(\varvec{\varphi }\) and the claim is obvious. When \(n \ge 2\), it suffices to recall (44) and to exploit the inductive hypothesis. To conclude, the equality
for \(n = 2, 3, \dots \) will be proved by mathematical induction. First, when \(n = 2\), (139) is valid since it coincides with (36). When \(n \ge 3\), combine the definition of \({\mathcal {Q}}_{{\mathfrak {t}}_n}\) with (36) to obtain
and the argument is completed by invoking the inductive hypothesis, the definition of \({\mathcal {C}}_{{\mathfrak {t}}_n}\) and (36). Therefore, (139) entails (40) in view of the injectivity of the Fourier-transform operator.
A.5 Proof of Proposition 4
Put \(k := \lceil 2/p \rceil \) with \(p\) as in (15) and consider the random vector \(\mathbf{S} = (S_1, S_2, S_3) := \sum _{j=1}^{2k} (-1)^{j} \mathbf{V}_j\), whose c.f. \(\phi \) is given by \(\phi (\varvec{\xi }) = |\hat{\mu }_{0}(\varvec{\xi })|^{2k}\). The assumptions (47)–(48) plainly entail \({\mathsf{E}}_t\left[ \mathbf{S}\right] = \mathbf{0}\), \({\mathsf{E}}_t\left[ S_i S_j\right] = 0\) for \(i \ne j\), and \({\mathsf{E}}_t\left[ S_{i}^{2}\right] = 2k \sigma _{i}^{2}\) for \(i = 1, 2, 3\). Note also that \(\sigma _{i}^{2} > 0\) for \(i = 1, 2, 3\) as a consequence of (15). Moreover, thanks to the Lyapunov inequality, \({\mathsf{E}}_t\left[ |\mathbf{S}|^3\right] \le (2k)^3 {\mathfrak {m}}_3\). Now, standard arguments explained, e.g., in Section 8.4 of [24] show that
with \(\sigma _{*}^{2} := \min \{\sigma _{1}^{2}, \sigma _{2}^{2}, \sigma _{3}^{2}\}\). Thus, \(\phi (\varvec{\xi }) \le 1 - \frac{k}{2} \sigma _{*}^{2} |\varvec{\xi }|^2\) whenever \(|\varvec{\xi }| \le (3 \sigma _{*}^{2})/(8 k^2 {\mathfrak {m}}_3)\), and elementary algebra entails \(1 - \frac{k}{2} \sigma _{*}^{2} |\varvec{\xi }|^2 \le \frac{\lambda ^2}{\lambda ^2 + |\varvec{\xi }|^2}\) for every \(\varvec{\xi }\), provided that \(\lambda ^2 \ge 2/(k\sigma _{*}^{2})\). Now, (15) gives \(|\phi (\varvec{\xi })| \le L|\varvec{\xi }|^{-4}\) for every \(\varvec{\xi }\ne \mathbf{0}\), with \(L := (\sup _{\varvec{\xi }\in {\mathbb {R}}^3} |\varvec{\xi }|^p |\hat{\mu }_0(\varvec{\xi })|)^{4/p}\), and again some algebra entails \(L|\varvec{\xi }|^{-4} \le \frac{\lambda ^2}{\lambda ^2 + |\varvec{\xi }|^2}\) if \(|\varvec{\xi }|^2 \ge B(\lambda ) := (L + \sqrt{L^2 + 4L\lambda ^4})/(2\lambda ^2)\). Note that \(B(\lambda ) \le 2\sqrt{L}\) holds true when \(\lambda ^2 \ge (2\sqrt{L})/3\). At this stage, choosing any \(\lambda \) satisfying \(\lambda ^2 \ge \max \{2/(k\sigma _{*}^{2}), (2\sqrt{L})/3\}\) yields
for every \(\varvec{\xi }\) such that either \(|\varvec{\xi }| \le (3 \sigma _{*}^{2})/(8 k^2 {\mathfrak {m}}_3)\) or \(|\varvec{\xi }|^2 \ge 2\sqrt{L}\). Therefore, the proof is completed if \((4L)^{1/4} \le (3 \sigma _{*}^{2})/(8 k^2 {\mathfrak {m}}_3)\). Otherwise, if \((4L)^{1/4} > (3 \sigma _{*}^{2})/(8 k^2 {\mathfrak {m}}_3)\), define
and resort to Corollary 2 in Section 8.4 of [24] to state that \(M < 1\). Then, (140) holds true also when \((3 \sigma _{*}^{2})/(8 k^2 {\mathfrak {m}}_3) \le |\varvec{\xi }| \le (4L)^{1/4}\) if \(M \le \inf _{(3 \sigma _{*}^{2})/(8 k^2 {\mathfrak {m}}_3) \le |\varvec{\xi }| \le (4L)^{1/4}}\left( \frac{\lambda ^2}{\lambda ^2 + |\varvec{\xi }|^2}\right) \), the last inequality being equivalent to \(\lambda ^2 \le 2\sqrt{L} M/(1 - M)\). In conclusion, taking
shows that (140) is valid for every \(\varvec{\xi }\), and (49) follows.
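The elementary comparison \(1 - \frac{k}{2}\sigma _{*}^{2}|\varvec{\xi }|^2 \le \lambda ^2/(\lambda ^2 + |\varvec{\xi }|^2)\) under \(\lambda ^2 \ge 2/(k\sigma _{*}^{2})\) is easy to confirm numerically; the values of \(k\) and \(\sigma _{*}^{2}\) below are illustrative, not taken from the paper:

```python
import numpy as np

# Check: 1 - (k/2)*sigma2*r^2 <= lam2/(lam2 + r^2) for all r >= 0,
# provided lam2 >= 2/(k*sigma2).  (Equivalent to k*sigma2*lam2/2 >= 1.)
k, sigma2 = 3, 0.4            # illustrative values
lam2 = 2.0 / (k * sigma2)     # the borderline choice of lambda^2
r = np.linspace(0.0, 100.0, 200001)
lhs = 1.0 - 0.5 * k * sigma2 * r**2
rhs = lam2 / (lam2 + r**2)
assert np.all(lhs <= rhs + 1e-12)   # equality only at r = 0
```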
A.6 Proof of Proposition 5
Initially, suppose that \(\chi (\text {d} \mathbf{x}) = f(\mathbf{x}) \text {d} \mathbf{x}\) for some \(f\) in \(\text {L}^1({\mathbb {R}}^3)\). Then \(\varDelta \hat{\chi }(\varvec{\xi }) = -\int _{{\mathbb {R}}^3}|\mathbf{x}|^2 f(\mathbf{x}) e^{i \mathbf{x}\cdot \varvec{\xi }} \text {d} \mathbf{x}\) and, by the Plancherel identity,
Now, note that \(|\chi |({\mathbb {R}}^3) = \int _{{\mathbb {R}}^3}|f(\mathbf{x})| \text {d} \mathbf{x}\) and apply the Cauchy–Schwarz inequality to get
where \(\int _{{\mathbb {R}}^3}\frac{\text {d} \mathbf{x}}{1 + |\mathbf{x}|^4} = \sqrt{2}\pi ^2\). For a general \(\chi \), consider the convolution \(\chi _{\epsilon }\) of \(\chi \) with the Gaussian distribution of zero mean and covariance matrix \(\epsilon ^2 \text {I}\). Since \(\chi _{\epsilon }\) is absolutely continuous, the first part of the proof gives
and thereby, taking account of \(|\chi |({\mathbb {R}}^3) \le \liminf _{\epsilon \downarrow 0} |\chi _{\epsilon }|({\mathbb {R}}^3)\) and letting \(\epsilon \downarrow 0\),
To complete the argument, observe that \(\sup _{B \in {\fancyscript{B}}({{\mathbb {R}}}^3)} |\chi (B)| \le |\chi |({\mathbb {R}}^3)\).
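The constant \(\sqrt{2}\pi ^2\) comes from passing to spherical coordinates, \(\int _{{\mathbb {R}}^3} \text {d}\mathbf{x}/(1 + |\mathbf{x}|^4) = 4\pi \int _{0}^{\infty } r^2/(1 + r^4)\, \text {d}r\); a quick numerical confirmation:

```python
import math
from scipy.integrate import quad

# In spherical coordinates:
#   int_{R^3} dx/(1 + |x|^4) = 4*pi * int_0^oo r^2/(1 + r^4) dr = sqrt(2)*pi^2
radial, _ = quad(lambda r: r**2 / (1 + r**4), 0, math.inf)
value = 4 * math.pi * radial
assert abs(value - math.sqrt(2) * math.pi**2) < 1e-6
```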
A.7 Proof of Proposition 6
Taking account of (47), note that
with \({\mathfrak {m}}_2 = 3\). The equality is due to the stochastic independence of the \(\mathbf{V}_j\)’s, while the inequality follows from the combination of the Cauchy–Schwarz inequality with (23) and the identity \(|\varvec{\psi }_{j, n}(\mathbf{u})| = 1\). Thus, (55) holds true for \(h = 2\) with \(g_2 = {\mathfrak {m}}_2\). The case \(h = 1\) can be derived from the case \(h = 2\) thanks to the conditional Lyapunov inequality after putting \(g_1 = \sqrt{g_2}\). When \(h \ge 3\), an inequality due to Rosenthal (see Section 2.3 in [58]) yields
where \(c(h)\) is a positive constant depending only on \(h\). An additional application of the Cauchy–Schwarz inequality, combined with (23) and \(|\varvec{\psi }_{j, n}(\mathbf{u})| = 1\), gives
which entails (55) with \(g_h = c(h) \{{\mathfrak {m}}_h + {\mathfrak {m}}_{2}^{h/2}\}\). Now, \(\frac{\partial ^h}{\partial \rho ^h}{\mathcal {N}}(\rho ; \mathbf{u})\) exists and is uniformly bounded by \(g_h\), for \(h = 1, \dots , 2k\). Then, since \(\hat{{\mathcal {M}}}(\rho \mathbf{u}) = {\mathsf{E}}_t[\hat{{\mathcal {N}}}(\rho ; \mathbf{u}) \ | \ {\fancyscript{G}} ]\) and interchanging the derivative with the expectation is valid here, one gets (56). Finally, taking \(\mathbf{u}= \mathbf{e}_i\) in (56) yields \(\int _{{\mathbb {R}}^3}v_{i}^{2k}{\mathcal {M}}(\text {d}\mathbf{v}) < +\infty \) for \(i = 1, 2, 3\) which, in turn, entails (57).
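The moment–derivative link used in the last step (bounded derivatives of a c.f. at the origin controlling moments) can be illustrated symbolically via the identity \(\phi ^{(2k)}(0) = (-1)^k \mathsf{E}[X^{2k}]\). A minimal sketch with a Rademacher variable, whose c.f. is \(\cos \rho \) and whose even moments all equal 1:

```python
import sympy as sp

# For a c.f. phi(rho) = E[e^{i rho X}], one has phi^{(2k)}(0) = (-1)^k E[X^{2k}].
# Illustration: X Rademacher (P(X = 1) = P(X = -1) = 1/2), c.f. cos(rho).
rho = sp.symbols('rho')
phi = sp.cos(rho)
for k in range(1, 5):
    deriv_at_0 = sp.diff(phi, rho, 2*k).subs(rho, 0)
    assert deriv_at_0 == (-1)**k   # equals (-1)^k * E[X^{2k}] with E[X^{2k}] = 1
```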
A.8 Proof of Proposition 7
The definition of the \(k\)-th Hermite polynomial shows that \(\frac{\text {d}^k}{\text {d}\rho ^k} e^{-\rho ^2/2} = 2^{-k/2} H_k\left( \frac{\rho }{\sqrt{2}}\right) e^{-\rho ^2/2}\) for every \(k\) in \({\mathbb {N}}_0\) and \(\rho \) in \({\mathbb {R}}\). See, for example, (1) in Section 2.IV of [59]. Moreover, according to (\(9_2\)) therein,
where \([n]\) stands for the integral part of \(n\), and hence
with \(\gamma _{k, h, l} := \frac{(-2)^{-h -l}(k!)^2}{h! l! (k - 2h)! (k - 2l)!}\). Now, take account of the following elementary inequalities: \(\int _{x}^{+\infty } e^{- \rho ^2/2} \text {d}\rho \le \ \frac{1}{x} e^{- x^2/2}\) for \(x > 0\), and \(\rho ^t e^{-\rho ^2/2} \le \ (t/e)^{t/2}\) for \(\rho \ge 0\) and \(t \ge 0\), with the proviso that \(0^0 := 1\) when \(t = 0\). Whence,
with \(c(m, s, k) := \sum _{h =0}^{[k/2]} \sum _{l =0}^{[k/2]} |\gamma _{k, h, l}| \left( \frac{m + 2(k - h -l)}{e}\right) ^{m/2 + k - h -l} \left( \frac{s - 1}{e}\right) ^{(s - 1)/2}\).
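Both elementary inequalities used above admit a quick numerical spot check (the grids and exponents below are arbitrary):

```python
import math
from statistics import NormalDist

# (i) Gaussian tail bound: int_x^oo e^{-rho^2/2} d rho <= e^{-x^2/2}/x for x > 0
for x in (0.3, 1.0, 3.0):
    tail = math.sqrt(2 * math.pi) * (1 - NormalDist().cdf(x))
    assert tail <= math.exp(-x*x/2) / x

# (ii) rho^t e^{-rho^2/2} <= (t/e)^{t/2} for rho, t >= 0 (maximum at rho = sqrt(t))
for t in (0.5, 2.0, 7.0):
    bound = (t / math.e) ** (t / 2)
    assert all(
        (i*0.01)**t * math.exp(-(i*0.01)**2/2) <= bound + 1e-12
        for i in range(1, 2001)
    )
```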
A.9 Proof of Propositions 8 and 10
The main task is to prove (64), (66) and (93). The remaining inequalities (65) and (67) can be derived by interchanging derivative with expectation in the equality \(\hat{{\mathcal {M}}}(\rho \mathbf{u}) = {\mathsf{E}}_t[\hat{{\mathcal {N}}}(\rho ; \mathbf{u}) \ | \ {\fancyscript{G}}]\), since \(\varPsi (\rho )\) is a \({\fancyscript{G}}\)-measurable random variable for every fixed \(\rho \). To start, (64) follows from the combination of (32), (49) and (63), upon recalling that \(|\varvec{\psi }_{j, \nu }| = 1\). With a view to proving (66) and (93), it is worth noting that \(0 \le \rho \le \text {R}\) entails \(\sup _{\mathbf{u}\in S^2} \big |\hat{\mu }_0(\rho \pi _{j, \nu } \varvec{\psi }_{j, \nu }(\mathbf{u})) - 1\big | \le 19/128\) for \(j = 1, \dots , \nu \) and for every choice of \(\text {B}\) in (29), as shown in [30]. This paves the way for considering the principal value of the logarithm and then for writing
The next step concerns the computation of certain derivatives of \(\hat{{\mathcal {N}}}(\rho ; \mathbf{u})\) by means of the above identity. To this aim, the system of coordinates introduced in Sect. 2.2.2 now proves useful. Then, for \(k = 1, \dots , 4\),
where \(x\) can be \(\rho \), \(u\) or \(v\). To bound each of these products, use (64) as far as \(\hat{{\mathcal {N}}}(\rho ; \mathbf{h}_k)\) is concerned, and proceed with the detailed computation of bounds for the derivatives of the logarithms. As a starting point for all these calculations, consider the following equalities from [30]:
and
Here, \(w_j(\rho , \mathbf{u}) := \hat{\mu }_0(\rho \pi _{j, \nu } \varvec{\psi }_{j, \nu }(\mathbf{u})) - 1\),
and the remainder \(R_j(\rho , \mathbf{u})\) can assume one of the following forms:
The aim is now to show that \(\sum _{j=1}^{\nu }\frac{\partial ^l}{\partial x^l} \text {Log}[ \hat{\mu }_0(\rho \pi _{j, \nu } \varvec{\psi }_{j, \nu })]\) admits, for \(l = 1, 2\), an upper bound expressible as a non-random polynomial in \(\rho \), independent of \(\mathbf{u}\).
As far as the derivatives w.r.t. \(\rho \) are concerned, for the first two terms on the RHS of (145) one gets
for \(l = 0, 1, 2\), thanks to the fact that \(|\varvec{\psi }_{j, \nu }| = 1\). Moreover, recall that \({\mathfrak {m}}_2 = 3\) in view of (47). Standard manipulations of the above expressions of \(R_j(\rho , \mathbf{u})\) lead to
for \(l = 0, 1, 2\), with \(c_0(R) = 1/24\), \(c_1(R) = 1/6\) and \(c_2(R) = 1/3\). See [30] for the details. After recalling (60), this last inequality plainly entails
As to the last term in (145), one has
and
Since \(\Phi \) is completely monotone, \(|w_j| \le \frac{19}{128}\) yields \(|\Phi ^{(l)}(w_j)| \le |\Phi ^{(l)}(-\frac{19}{128})|\) for every \(l\) in \({\mathbb {N}}\). Then, combining (144) with (146)–(147) gives
for \(l = 0, 1, 2\). By virtue of the Lyapunov inequality and Theorem 19 in [43], (152) entails
for \(l = 0, 1, 2\), with
Thus, starting from (142)–(143) and utilizing (146)–(148), (149)–(150) and (153) with \(x = \rho \), one can define the \(\wp _k\)’s in (66)–(67) as follows:
and
This completes the proof of (66), showing also that the bound therein is independent of the choice of \(\text {B}\) in (29).
To prove (93), one begins by considering \((u, v)\) in \(D_k\) and taking \(\text {B}\) in (29) equal to \(\text {B}_k\) according to (90)–(91). In this way, every map \(\varvec{\psi }_{j, \nu ; k} : (u, v) \mapsto \varvec{\psi }_{j, \nu }(\mathbf{h}_k(u, v))\), and hence the map \(\hat{{\mathcal {N}}}_k : (u, v) \mapsto \hat{{\mathcal {N}}}(\rho ; \mathbf{h}_k(u, v))\), turns out to belong to \(\text {C}^4(D_k)\) for \(k = 1, \dots , 4\). Then, one resorts to (142)–(143), with \(x\) standing either for \(u\) or \(v\), and uses (64) to bound the common factor \(\hat{{\mathcal {N}}}_k\). As to the derivatives w.r.t. \(x\), one evaluates the expression of \(\frac{\partial ^l}{\partial x^l} [\varvec{\psi }_{j, \nu ; k} \cdot \mathbf{v}]^m\) for \(l = 1, 2\), and applies the Cauchy–Schwarz inequality. Whence, after recalling (29) and introducing the \(\text {L}^2\) norm \(\Vert \cdot \Vert _{*}\) of matrices, one gets
when \(l = 1, 2\) and \(m \ge l\). Since \(\Vert \frac{\partial ^s}{\partial x^s} \text {B}_k \Vert _{*} \le \sqrt{3}\) for every \(s\) in \({\mathbb {N}}\), one has
for \(l = 1, 2\). Then, one proceeds with the study of the derivatives of the third term in the RHS of (145). As far as the first order derivative is concerned, one resorts to the second of the expressions of \(R_j\), given in the first part of this Appendix, to write
By virtue of (154), one gets
which, recalling (60), entails
To compute the second order derivatives of \(R_j\) one employs the third of its expressions to write
From (154) and the inequality \(|e^{i x} - \sum _{r = 0}^{N - 1} (i x)^r/r!| \le |x|^N/N!\) one obtains the bound
which, taking account of (60), becomes
Finally, as to the remaining term in the RHS of (145), one utilizes (149)–(150) with \(x = u, v\). Then, combining (144) with (155) and (156) gives
By virtue of the Lyapunov inequality and Theorem 19 in [43], (161) yields
As for the second order derivatives, from the combination of (144) with (155) and (158) one gets
and hence
where
In view of (149), (151), (153), and (162),
and, utilizing (145), (155), (156) and (165), one concludes that
for \(\rho \) in \([0, \text {R}]\). To obtain a bound of the same type for the second derivative, one can first combine (150), (151), (153) and (161)–(164) to get
and, then, utilize (145), (155), (158) and (167), to conclude that
At this stage, one observes that the RHSs of (166) and (168) can be written as \(\rho ^2 \wp _{L, 1}(\rho )\) and \(\rho ^2 \wp _{L, 2}(\rho )\) respectively, for specific non-random polynomials \(\wp _{L, 1}\) and \(\wp _{L, 2}\) with positive coefficients. As a final step of the proof, expressing \(\varDelta _{S^2}\) in local coordinates leads to
where \(4(2 + \sqrt{3}) = \max _{u \in [\frac{1}{12}\pi , \frac{11}{12}\pi ]} \max \{|\cot u|, 1/\sin ^2u\}\) and hence
A.10 Proof of Proposition 9
Fix the sample point \(\omega \) in \(U^c\) and denote by \(n\) the value of \(\nu \) at \(\omega \). Then, designate the values of \(\pi _{1, \nu }^{2}, \dots , \pi _{\nu , \nu }^{2}\) at \(\omega \) by \(a_1, \dots , a_n\) respectively, so that each \(a_j\) belongs to \([0, 1]\) and \(\sum _{j = 1}^{n} a_j = 1\) in view of (23). The argument continues by resorting to the following combinatorial tools:
-
(i)
The \(k\)-th elementary symmetric function \(S_k(a_1, \dots , a_n)\) defined by
$$\begin{aligned} S_k(a_1, \dots , a_n) := \sum _{1 \le i_1 < \dots < i_k \le n} a_{i_1} \dots a_{i_k} \end{aligned}$$for \(k\) in \(\{1, \dots , n\}\).
-
(ii)
The \(k\)-th Newton symmetric function given by
$$\begin{aligned} N_k(a_1, \dots , a_n) := \sum _{j = 1}^{n} a_{j}^{k}. \end{aligned}$$ -
(iii)
The group of relations, known as Newton’s identities, which read
$$\begin{aligned} k S_k = \sum _{j = 1}^{k} (-1)^{j + 1} N_j S_{k - j} \end{aligned}$$for \(k\) in \(\{1, \dots , n\}\), with the proviso that \(S_0(a_1, \dots , a_n) := 1\).
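Both Newton's identities and the bound \(S_m \le 1/m!\) under \(N_1 = 1\), used below via the multinomial identity, can be tested numerically on random normalized weights; a sketch:

```python
import math
import random
from itertools import combinations

random.seed(1)
n = 7
a = [random.random() for _ in range(n)]
a = [x / sum(a) for x in a]          # normalize so that N_1 = sum(a) = 1

def S(k):
    # k-th elementary symmetric function, with the proviso S_0 := 1
    return sum(math.prod(c) for c in combinations(a, k)) if k else 1.0

def N(k):
    # k-th Newton power sum
    return sum(x**k for x in a)

for k in range(1, n + 1):
    # Newton's identity: k S_k = sum_{j=1}^{k} (-1)^{j+1} N_j S_{k-j}
    rhs = sum((-1)**(j+1) * N(j) * S(k - j) for j in range(1, k + 1))
    assert abs(k * S(k) - rhs) < 1e-12
    # multinomial bound: N_1 = 1 entails S_k <= 1/k!
    assert S(k) <= 1.0 / math.factorial(k) + 1e-12
```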
See Section 1.9 of [52] for details. The way is now paved to prove that, if \(a_{*}\in (0, 1)\), \(N_1 = 1\) and \(N_2 \le a_{*}\), then
holds for each \(k\) in \(\{1, \dots , n\}\). Proceeding by mathematical induction, when \(k = 1\), one has \(S_1 = N_1 = 1\) and (169) follows. When \(k \ge 2\), combine the Newton identities with the inductive hypothesis to get
At this stage, note that \(N_j \le a_{*}\) for each \(j\) in \(\{2, \dots , k\}\). Moreover, thanks to the multinomial identity (see, e.g., 1.7.2 in [52]), \(N_1 = 1\) entails \(S_m \le 1/m! \le 1\) for each \(m\) in \(\{0, \dots , n\}\). Hence,
which concludes the proof of (169), after noting that \(2^{k-2} + (k-1) \le k2^{k-1}\) for every \(k\) in \({\mathbb {N}}\). To complete the proof of (68), observe that \(\omega \in U^c\) entails \(n \ge r\) by virtue of (51), so that
Finally, recall the relation between \(r\) and \(a_{*}\) given by (51) and apply (169) to obtain
To prove (69), perform an obvious change of variable to obtain
and conclude by using (68).
A.11 Proof of Proposition 11
The analysis to be developed is concerned with each of the charts \(\Omega _1, \dots , \Omega _4\), but it is of the same kind for all of them. Therefore, even if the notation agrees with that introduced in A.9, the subscript \(k\) referring to the \(k\)-th chart will be dropped. The computation of the Laplacian in (141) yields
for every fixed \(\rho \). It is worth mentioning that here any differential operator \({\fancyscript{D}}\), applied to the complex-valued function \(h = f + ig\), must be understood as \({\fancyscript{D}}h := {\fancyscript{D}}f + i {\fancyscript{D}}g\) and that the scalar product \(\langle U_1 + i U_2, V_1 + i V_2\rangle \) is defined, by linearity, as \(\langle U_1 , V_1\rangle - \langle U_2 , V_2\rangle + i \langle U_1 , V_2\rangle + i \langle U_2 , V_1\rangle \) for the vector fields \(U_1, U_2, V_1, V_2\). Now, after observing that \(1\!\!1_{T^c} = 1 - 1\!\!1_{T}\), one gets
which represents the starting point for the proof at issue. To bound the term \(|\hat{{\mathcal {N}}}|\), use (141), (145), (80), (148) and (153) to write
for \(\rho \) in \([0, \text {R}]\), with \(c(N) := \exp \{\frac{1}{16}c_0(R) + |\Phi (-\frac{19}{128})| k_0(w)\}\). As to the term containing the gradient, an application of the triangle inequality in (145) shows that
for \((\rho , \mathbf{u})\) in \([0, \text {R}] \times \Omega \), where \(E_1\) and \(E_2\) are conditional expectations, to be specified below, involving the remainders \(R_j\) and \(\Phi (w_j)w_{j}^2\) respectively. As to the second summand, the convexity of the square of the Riemannian length entails
To evaluate the last integral, recall that \(\varvec{\psi }_{j, \nu }\) is given by (29) with the proper choice of \(\text {B}\) as in (90)–(91), which makes \(\varvec{\psi }_{j, \nu } : \Omega \rightarrow S^2\) smooth. Writing the gradient in coordinates yields
Since \(1/\sin ^2 u \le 4(2 + \sqrt{3})\) for every \((u, v)\) in \(D\), (154) leads to
The terms \(E_1\) and \(E_2\) in (172) can be derived as uniform bounds w.r.t. \(\mathbf{u}\) by writing \(\Vert \nabla _{S^2}\text {Log}\hat{{\mathcal {N}}}(\rho ; \mathbf{u}) \Vert _{S^2}^2\) in coordinates, according to
As for \(E_1\), (156) and (157) give
for every \(\rho \) in \([0, \text {R}]\), the RHS being a \({\fancyscript{G}}\)-measurable function. Apropos of \(E_2\), start from (149) and notice that, in view of (153) and the complete monotonicity of \(\Phi \),
holds for every \(j = 1, \dots , \nu \), and \((\rho , \mathbf{u})\) in \([0, \text {R}] \times \Omega \). Then, by virtue of Lyapunov’s inequality and Theorem 19 in [43], inequalities (152), with \(l= 0\), (161) and (163) become
respectively. These last inequalities, in combination with (153) and (162), entail
for every \(\rho \) in \([0, \text {R}]\) with
This concludes the analysis of the first summand in the RHS of (170), after noting that the upper bound provided by (177) is \({\fancyscript{G}}\)-measurable. To proceed, for notational simplicity put
so that (141) can be re-written as \(\hat{{\mathcal {N}}} = \exp \{-\rho ^2/2 + A\rho ^2 + iB\rho ^3 + H\}\). Observe that \({\mathsf{E}}_t[A^2\ | \ {\fancyscript{G}}] = \frac{1}{4} \text {Z}(\mathbf{u})\) and \({\mathsf{E}}_t[(\varDelta _{S^2} A)^2\ | \ {\fancyscript{G}}] = \frac{1}{4} \text {Z}_L(\mathbf{u})\) hold by definition, and invoke (95)–(96) to write
for every \(\mathbf{u}\) in \(\Omega \). To deal with the Laplacian of \(H\), start from the sum of the \(R_j\)’s. After writing the Laplacian in coordinates, the combination of (156) with (158) gives
Analogously, combining (174)–(176) with (153), (162) and (164) yields
with
Hence, the second summand in the RHS of (170) admits the following bound:
for every \((\rho , \mathbf{u})\) in \([0, \text {R}] \times \Omega \). Apropos of the third summand in the RHS of (170), recall from A.9 that \(\sup _{\mathbf{u}} |\varDelta _{S^2} \text {Log}\hat{{\mathcal {N}}}| \le \rho ^2 \wp _L(\rho )\) holds for every \(\rho \) in \([0, \text {R}]\). Therefore, in view of (86), one has
To deal with the last summand in the RHS of (170), a bound for \(\big |\hat{{\mathcal {N}}} - e^{-\rho ^2/2}\big |\) can be derived from the elementary inequalities \(|e^{i x} - 1| \le |x|\) and \(|e^z - 1| \le |z|e^{|z|}\), valid for every \(x\) in \({\mathbb {R}}\) and \(z\) in \({\mathbb {C}}\), respectively. Whence, one gets
which, in turn, yields
At this stage, note that \(\sup _{\mathbf{u}} |A| \le \frac{1}{2}(1 + {\mathfrak {m}}_2)\), \(\sup _{\mathbf{u}} |B| \le \frac{1}{6}{\mathfrak {m}}_3\) and \(\sup _{\mathbf{u}} e^{|H|} \le c(N)\) for every \(\rho \) in \([0, \text {R}]\), in view of (148) and (153). Thus, taking account of (147), (174) and (178)–(179) gives
with
For the remaining terms in (182), take the conditional expectation and write
Then, an application of the Lyapunov inequality shows that
the two RHSs being \({\fancyscript{G}}\)-measurable. To evaluate the integral in (186), it is enough to write the Laplacian w.r.t. the coordinates \((u, v)\) and to recall that \(\max \{1/\sin ^2 u, |\cot u|\} \le 4(2 + \sqrt{3})\) for every \((u, v)\) in \(D\), so that (154) leads to
All the elements are now in place to complete the proof of Proposition 11 by setting, in view of (170)–(173), (177) and (180)–(186),
and
A.12 Proof of (103) and (110)–(111)
The identities at issue are proved by induction on \(n\). They hold true for \(n = 1\) in view of the following remarks. As to (103), it suffices to observe that \(\zeta _{1, 1} \equiv 1\), \(\varvec{\psi }_{1, 1}(\mathbf{u}) = \mathbf{u}\) and to exploit (47)–(48). Identity (110) holds thanks to \(\eta _{1, 1} \equiv 1\), \(\varvec{\psi }_{1, 1}(\mathbf{u}) = \mathbf{u}\) and the definition of \(l_3(\mathbf{u})\). As far as (111) is concerned, it is enough to observe that \(\pi _{1, 1} \equiv 1\) and \(\varvec{\psi }_{1, 1}(\mathbf{u}) = \mathbf{u}\). When \(n \ge 2\), one has to verify the identities
for every \(s = 1, 2, 3\), \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\), \(\varvec{\varphi }\) in \([0, \pi ]^{n-1}\), \(\mathbf{u}\) in \(S^2\) and every choice of \(\text {B}\) as in (29). After recalling the definition of the \(k\)-th Legendre polynomial \(P_k\), all the above equalities can be deduced from the common formula
Here, \(\varvec{\xi }\) denotes any unit vector while, for any \(k\) in \({\mathbb {N}}\), \(f_{1,1}^{(k)}({\mathfrak {t}}_1, \emptyset ) \equiv 1\) and
It is worth noting that \(f_{j, n}^{(1)} = \pi _{j, n}^{*}\), \(f_{j, n}^{(2)} = \zeta _{j, n}^{*}\) and \(f_{j, n}^{(3)} = \eta _{j, n}^{*}\). Now, in view of the same argument used in Sect. 2.1 to verify that (41) and (42) are equal, one gets
for any unit vector \(\varvec{\xi }\) and \(m\) in \({\mathbb {N}}\), which implies that the LHS of (187) can be written as \(\int _{(0, 2\pi )^{n - 1}} P_k(\mathbf{q}_{j, n}({\mathfrak {t}}_n, \varvec{\varphi }, \varvec{\theta }, \mathbf{u}) \cdot \varvec{\xi }) u_{(0, 2\pi )}^{\otimes _{n - 1}}(\text {d}\varvec{\theta })\). Taking \(j\) in \(\{1, \dots , n_l\}\), (43) and the inductive hypothesis yield
Then, one can write \(\varvec{\xi }\) as \(\cos \beta \sin \alpha \mathbf{a}(\mathbf{u}) + \sin \beta \sin \alpha \mathbf{b}(\mathbf{u}) + \cos \alpha \mathbf{u}\) for a suitable \((\alpha , \beta )\) in \([0, \pi ] \times [0, 2\pi )\), so that \(\varvec{\psi }^l \cdot \varvec{\xi }= \sin \varphi \sin \alpha \cos (\theta - \beta ) + \cos \varphi \cos \alpha \). Then, in view of the well-known addition theorem for the Legendre polynomials (see, e.g., (VII’) on page 268 of [59]),
which completes the proof of (187) for \(j \le n_l\), thanks to the definition of \(f_{j, n}^{(k)}\). The proof is completed by applying, mutatis mutandis, this very same argument to the case \(j > n_l\).
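The averaging step behind (187), namely that \((2\pi )^{-1}\int _{0}^{2\pi } P_k(\cos \varphi \cos \alpha + \sin \varphi \sin \alpha \cos (\theta - \beta ))\, \text {d}\theta = P_k(\cos \varphi ) P_k(\cos \alpha )\) by the addition theorem, can be checked numerically (the sample angles below are arbitrary):

```python
import numpy as np
from scipy.special import eval_legendre

# Averaged addition theorem for Legendre polynomials:
# (1/2pi) int_0^{2pi} P_k(cos(phi)cos(alpha) + sin(phi)sin(alpha)cos(theta - beta)) d theta
#   = P_k(cos(phi)) * P_k(cos(alpha))
phi, alpha, beta = 0.7, 1.2, 0.4                    # arbitrary sample angles
theta = np.linspace(0.0, 2*np.pi, 4096, endpoint=False)
for k in range(8):
    avg = eval_legendre(k, np.cos(phi)*np.cos(alpha)
                        + np.sin(phi)*np.sin(alpha)*np.cos(theta - beta)).mean()
    target = eval_legendre(k, np.cos(phi)) * eval_legendre(k, np.cos(alpha))
    assert abs(avg - target) < 1e-10   # exact up to rounding: trig poly of degree k
```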
A.13 Proof of (124)
The main aim is to find a recursive inequality—reminiscent of (138)—for the conditional expectation
where \(\lambda \) is a positive parameter. For the sake of notational simplicity, the following devices will be adopted: omission of the asterisks appearing in (22) and (27); removal of the indices \((k, s)\) in \(\text {A}_{\lambda }(\nu , \tau _{\nu }; k, s)\) and of the subscript \(k\) in \(\Omega _k\) and \(\text {B}_k\); introduction of the symbols \(\varvec{\varphi }, \varvec{\theta }, \overline{\varvec{\varphi }}, \overline{\varvec{\theta }}\) to indicate \((\varphi _1, \dots , \varphi _n), (\theta _1, \dots , \theta _n), (\varphi _1, \dots , \varphi _{n-1}), (\theta _1, \dots , \theta _{n-1})\), respectively. In this notation one can write
The concept of germination explained in Sect. 1.5 is used to express the \(\pi \)’s and the \(\text {O}\)’s relative to \({\mathfrak {t}}_{n, k}\) in terms of the \(\pi \)’s and the \(\text {O}\)’s associated with \({\mathfrak {t}}_n\) according to
and
where \(\Sigma [{\mathfrak {t}}_n, k] : \{1, \dots , n-1\} \rightarrow \{1, \dots , n\}\) is an injection depending on \({\mathfrak {t}}_n\) and \(k\), while \(h\) is the element of \(\{1, \dots , n\}\) excluded from the range of \(\Sigma [{\mathfrak {t}}_n, k]\). If \(k = 1\) (\(n\), respectively) the first line (the last line, respectively) in (189)–(190) must be omitted. Therefore, the terms in (188) become
and
At this stage, apply the operator \({\fancyscript{D}}^{'}\) to the RHS of (191) and then consider the square of the corresponding norm. With a view to this application, it is useful to introduce the symbol \(\bullet \) to indicate: the product when \({\fancyscript{D}}^{'}\) is either \(\text {Id}\) or \(\varDelta _{S^2}\); the scalar product when \({\fancyscript{D}}^{'}\) is either \(\nabla _{S^2}\) or \(\nabla _{S^2}\varDelta _{S^2}\); and, when \({\fancyscript{D}}^{'}\) is the Hessian, for any pair of symmetric 2-forms \((\omega _1, \omega _2)\), \(\omega _1 \bullet \omega _2\) stands for \(\sum _{ij} \omega _1(V_i, V_j) \omega _2(V_i, V_j)\), where \(\{V_1, V_2\}\) is any orthonormal basis of vector fields. This procedure leads to the sum of the following three terms
Following (188), one has to consider the integral \(\int _{[0, \pi ]^n} \int _{(0, 2\pi )^n} \int _{\Omega } \) of \(T1\), \(T2\) and \(T3\) respectively, as well as the integral \(\int _{[0, \pi ]^n}\) of the RHS of (192). Then, observe that
holds since the measures \(u_{(0, 2\pi )}^{\otimes _n}\) and \(\beta ^{\otimes _n}\) are exchangeable, i.e. invariant under permutation of the coordinates. As to the integral of \(T2\), it is worth remarking that \(T2\) depends on \((\varphi _h, \theta _h)\) only through \(\cos ^2\varphi _h\), \(\sin ^2\varphi _h\), \(\text {M}^{l}(\varphi _h, \theta _h)\) and \(\text {M}^{r}(\varphi _h, \theta _h)\). Therefore, since \(\bullet \) behaves like the scalar product, one is led to consider the integral
which, after putting \(\varvec{\xi }:= (\text {B}\text {O}_{k, n})^{t}\ \mathbf{e}_s\), becomes
when \(e = l\) and
when \(e = r\). At this stage, the identities
and \(\int _{0}^{\pi } (-6 \cos ^2\varphi _h \sin ^2\varphi _h) \beta (\text {d}\varphi _h) = 3 \varLambda _b\) show that
Whence, thanks to (193), one gets
Moreover, for the term \(- 2\cos ^2\varphi _h \sin ^2\varphi _h \pi _{k, n}^{4}({\mathfrak {t}}_n, \Sigma [{\mathfrak {t}}_n, k] \overline{\varvec{\varphi }})\) in (192), one has
Then, it remains to consider the term containing \(T3\) by showing that there exists a value \(\lambda _0 = \lambda _0({\fancyscript{D}}^{'})\) such that
for every \(\lambda \ge \lambda _0\), \(n \ge 2\), \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\) and \(s = 1, 2, 3\). In fact, the LHS can be written as
where
Then, after putting \(R = (r_{ij})_{ij} := \text {O}_{k, n} \text {K} \text {O}_{k, n}^t\) and \(f_{ij}^{(s)}(\mathbf{u}) := \big (\mathbf{e}_s \cdot \text {B}(\mathbf{u})\mathbf{e}_i\big ) \big (\mathbf{e}_s \cdot \text {B}(\mathbf{u})\mathbf{e}_j\big )\), one notes that
and that \(\max _{ij} |r_{ij}|^2 \le \Vert \text {O}_{k, n} \Vert ^{4}_{*} \cdot \Vert \text {K} \Vert ^{2}_{*} \le 36\). Whence,
and the RHS is zero when \(\lambda = \lambda _0({\fancyscript{D}}^{'}) := 81\int _{\Omega } \sum _{ij} \big | {\fancyscript{D}}^{'} f_{ij}^{(s)}(\mathbf{u}) \big |^2 u_{S^2}(\text {d}\mathbf{u})\), thanks to the fact that \(\int _{0}^{\pi } \cos ^2\varphi _h \sin ^2\varphi _h \beta (\text {d}\varphi _h) = -\frac{1}{2}\varLambda _b\). Therefore, in view of (196)–(197),
holds for any \(n \ge 2\). This inequality entails \(a_{\lambda _0}(n+1) \le (1 + \frac{3 \varLambda _b}{n}) a_{\lambda _0}(n)\), where \(a_{\lambda }(\nu ) := {\mathsf{E}}_t\Big [\text {A}_{\lambda }(\nu , \tau _{\nu })\ \big | \ \nu \Big ]\). At this stage, the same argument developed in A.1 shows that
holds for every \(n \ge 2\), since \(2 + 3\varLambda _b > 0\) for any choice of \(b\) satisfying (2)–(3). Whence,
is valid for every \(t \ge 0\) with the proviso that \(\frac{e^{(1 + 3\varLambda _b)t} - 1}{1 + 3\varLambda _b} := t\) when \(\varLambda _b = -1/3\), concluding the proof.
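The final step rests on the classical estimate \(\prod _{j=2}^{n-1}(1 + c/j) = \Gamma (n + c)/(\Gamma (n)\Gamma (2 + c)) \sim n^{c}/\Gamma (2 + c)\), applied with \(c = 3\varLambda _b\). A numerical sketch; the value \(c = -0.9\) below is a hypothetical stand-in for \(3\varLambda _b\) (recall \(-2 < 3\varLambda _b < 0\)):

```python
import math

# Iterating a(n+1) <= (1 + c/n) a(n) from n = 2 gives
#   a(n) <= a(2) * prod_{j=2}^{n-1} (1 + c/j),
# and the product equals Gamma(n+c)/(Gamma(n)*Gamma(2+c)) ~ n^c/Gamma(2+c).
def prod_form(n, c):
    return math.prod(1 + c/j for j in range(2, n))

def gamma_form(n, c):
    return math.exp(math.lgamma(n + c) - math.lgamma(n) - math.lgamma(2 + c))

c = -0.9                       # hypothetical stand-in for 3*Lambda_b
for n in (5, 50, 500):
    assert abs(prod_form(n, c) - gamma_form(n, c)) < 1e-9 * gamma_form(n, c)
# polynomial decay ~ n^c: the normalized ratio stabilizes as n grows
r1, r2 = prod_form(500, c) / 500**c, prod_form(5000, c) / 5000**c
assert abs(r1 - r2) / r1 < 1e-2
```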
Dolera, E., Regazzini, E. Proof of a McKean conjecture on the rate of convergence of Boltzmann-equation solutions. Probab. Theory Relat. Fields 160, 315–389 (2014). https://doi.org/10.1007/s00440-013-0530-z
Keywords
- Berry–Esseen inequalities
- Central limit theorem
- Global analysis on \(S^2\)
- Maxwellian molecules
- Random measure
- Wild–McKean sum