1 Introduction and new results

On the basis of an analogy pointed out by McKean in [50, 51], a few years ago we started a program aimed at studying the long-time behavior of solutions of certain kinetic equations, by means of representations which connect these solutions to probability laws of suitable weighted sums of independent and identically distributed (i.i.d.) random variables. Finding the right representation is comparatively simple for the solution of the spatially homogeneous one-dimensional Kac equation, and this has produced both new results on the Kac equation and improvements of existing ones. See [35, 17, 32, 33, 38, 39]. Our goal in the present paper is to go back to the original kinetic model, the spatially homogeneous Boltzmann equation for Maxwellian molecules (SHBEMM), which inspired the aforesaid one-dimensional model. The reason for deferring its treatment is connected, on the one hand, with the mathematical complexity of the subject and, on the other, with the hope that useful insights could be derived from the study of simpler allied cases. More specifically, we discuss here the problem of quantifying the “best” rate of relaxation to equilibrium. The starting point of the argument is the new probabilistic representation exhibited in Sect. 1.5 of the present paper.

The last part of the program, to be developed in forthcoming papers, is concerned with the inhomogeneous Boltzmann equation for Maxwellian molecules. Although the assumption of spatial homogeneity adopted here may seem a strong restriction, it has nonetheless proved an interesting and inspiring basis for studying qualitative properties of the complete model.

1.1 The equation

In classical kinetic theory, a gas is thought of as a system of a very large number \(N\) of like particles, described by means of a time-dependent statistical distribution \(\mu (\cdot , t)\) on the phase space \(X \times {\mathbb {R}}^3\), where \(X\) stands for the spatial domain. Then, for any subset \(A\) of \(X \times {\mathbb {R}}^3\), \(\mu (A, t)\) provides an approximation, independent of \(N\), of the statistical frequency of particles in \(A\) at time \(t\). It is worth noting that \(\mu (\cdot , t)\) can also be interpreted, consistently with its statistical meaning, as a genuine probability distribution (p.d.), by reading \(\mu (A, t)\) as the probability that the position-velocity pair of a randomly selected particle belongs to \(A\) at time \(t\). See the discussion in Subsection 2.1 in Chapter 2A of [65]. The basic assumptions for the derivation of the classical equation which governs the evolution of \(\mu (\cdot , t)\) are that the gas is dilute (Boltzmann-Grad limit) and that the particles interact via binary, elastic and microscopically reversible collisions. Particles which are just about to collide are viewed as stochastically independent (Boltzmann’s Stosszahlansatz). See [22, 23, 63] for a comprehensive treatment. In this work, we also assume spatial homogeneity, so that the phase space reduces to \({\mathbb {R}}^3\) and the SHBEMM can be written as

$$\begin{aligned} \frac{\partial }{\partial t} f(\mathbf{v}, t)&= \int \limits _{{\mathbb {R}}^3}\int \limits _{S^2} [f(\mathbf{v}_{*}, t) f(\mathbf{w}_{*}, t) \ - \ f(\mathbf{v}, t) f(\mathbf{w}, t)] \nonumber \\&\times \, b\left( \frac{\mathbf{w}- \mathbf{v}}{|\mathbf{w}- \mathbf{v}|} \cdot \varvec{\omega }\right) u_{S^2}(\text {d}\varvec{\omega })\text {d}\mathbf{w} \end{aligned}$$
(1)

where \((\mathbf{v}, t)\) varies in \({\mathbb {R}}^3\times (0, +\infty )\), \(f(\cdot , t)\) stands for a density function of \(\mu (\cdot , t)\) and \(u_{S^2}\) for the uniform p.d. (normalized Riemannian measure) on the unit sphere \(S^2\), embedded in \({\mathbb {R}}^3\). The symbols \(\mathbf{v}_{*}\) and \(\mathbf{w}_{*}\) denote post-collisional velocities which, according to the conservation laws of momentum and kinetic energy, must satisfy

$$\begin{aligned} \mathbf{v}+ \mathbf{w}= \mathbf{v}_{*} + \mathbf{w}_{*}\quad \text {and}\quad |\mathbf{v}|^2 + |\mathbf{w}|^2 = \ |\mathbf{v}_{*}|^2 + |\mathbf{w}_{*}|^2. \end{aligned}$$

Throughout the paper, \(\mathbf{v}_{*}\) and \(\mathbf{w}_{*}\) are parametrized according to the \(\varvec{\omega }\)-representation, i.e.

$$\begin{aligned} \mathbf{v}_{*} = \mathbf{v}+ [(\mathbf{w}- \mathbf{v}) \cdot \varvec{\omega }] \varvec{\omega }, \quad \quad \mathbf{w}_{*} = \mathbf{w}- [(\mathbf{w}- \mathbf{v}) \cdot \varvec{\omega }] \varvec{\omega }\end{aligned}$$

where \(\cdot \) denotes the standard scalar product. The angular collision kernel \(b\) is a non-negative measurable function on \([-1, 1]\). Henceforth, for the sake of mathematical convenience, it will be assumed that \(b\) meets the symmetry condition

$$\begin{aligned} b(x) = b(\sqrt{1 - x^2}) \frac{|x|}{\sqrt{1 - x^2}} = b(-x) \end{aligned}$$
(2)

for all \(x\) in \((-1, 1)\), an assumption which does not reduce the generality of (1), as explained in Subsection 4.1 of Chapter 2A of [65]. In the presence of a general interaction potential governing the mechanism of binary collisions, \(b\) is replaced by a more complex function called collision kernel. See Section 3 of Chapter 2A of [65]. Maxwell [49] was the first to study particles which repel each other with a force inversely proportional to the fifth power of their distance, named Maxwellian molecules after him. In this particular circumstance, the resulting collision kernel turns out to be a specific function only of \(\frac{\mathbf{w}- \mathbf{v}}{|\mathbf{w}- \mathbf{v}|} \cdot \varvec{\omega }\), as in (1), with a non-summable singularity near 0. It is customary, as we do here, to call Maxwellian any collision kernel which is a function only of \(\frac{\mathbf{w}- \mathbf{v}}{|\mathbf{w}- \mathbf{v}|} \cdot \varvec{\omega }\), and to distinguish Maxwellian kernels according to whether they are summable or not. The former case corresponds to a SHBEMM with Grad (angular) cutoff. Without any loss of generality, this condition can be formalized by assuming that

$$\begin{aligned} \int \limits _{0}^{1} b(x)\,\text {d}x = 1 \end{aligned}$$
(3)

since any SHBEMM with cutoff can be reduced, via a time-scaling, to a SHBEMM with a kernel satisfying (3). The case when \(b\) is not summable corresponds to a SHBEMM of the non-cutoff type. We shall confine ourselves to considering the weak (angular) cutoff, i.e.

$$\begin{aligned} \int \limits _{0}^{1} x b(x)\,\text {d}x < +\infty . \end{aligned}$$
(4)

This condition is actually satisfied by the explicit form of \(b\) given by Maxwell, namely the only form of \(b\) that has been justified from a physical standpoint.
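Returning for a moment to the \(\varvec{\omega }\)-representation: although not part of the original argument, it is easy to sanity-check numerically that the post-collisional velocities preserve momentum and kinetic energy for every unit vector \(\varvec{\omega }\). A minimal sketch (the function name `collide` is ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def collide(v, w, omega):
    """Post-collisional velocities in the omega-representation."""
    d = np.dot(w - v, omega)
    return v + d * omega, w - d * omega

v, w = rng.normal(size=3), rng.normal(size=3)
omega = rng.normal(size=3)
omega /= np.linalg.norm(omega)          # a random unit vector on S^2

v_s, w_s = collide(v, w, omega)

# momentum and kinetic energy are conserved
assert np.allclose(v + w, v_s + w_s)
assert np.isclose(np.dot(v, v) + np.dot(w, w),
                  np.dot(v_s, v_s) + np.dot(w_s, w_s))
```

The check reduces to \(|\mathbf{v}_*|^2 + |\mathbf{w}_*|^2 = |\mathbf{v}|^2 + |\mathbf{w}|^2 + 2d\,[(\mathbf{v}-\mathbf{w})\cdot \varvec{\omega } + d] = |\mathbf{v}|^2 + |\mathbf{w}|^2\) with \(d = (\mathbf{w}-\mathbf{v})\cdot \varvec{\omega }\).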

The first rigorous results on existence and uniqueness, given a probability density function \(f_0\) on \({\mathbb {R}}^3\) as initial datum, were obtained in [53, 67] under the validity of (3). To discuss this question for the SHBEMM with or without cutoff within a unified framework, one needs a reformulation of the problem. Accordingly, the weak version of (1) used throughout this paper reads

$$\begin{aligned} \frac{\text {d}}{\text {d}t} \int \limits _{{\mathbb {R}}^3}\psi (\mathbf{v}) \mu (\text {d}\mathbf{v}, t)&= \int \limits _{{\mathbb {R}}^3}\int \limits _{{\mathbb {R}}^3}\int \limits _{S^2} [\psi (\mathbf{v}_{*}) - \psi (\mathbf{v})] \nonumber \\&\times \, b\left( \frac{\mathbf{w}- \mathbf{v}}{|\mathbf{w}- \mathbf{v}|} \cdot \varvec{\omega }\right) u_{S^2}(\text {d}\varvec{\omega })\mu (\text {d}\mathbf{v}, t) \mu (\text {d}\mathbf{w}, t) \end{aligned}$$
(5)

where \(\psi \) varies in \(\text {BL}({\mathbb {R}}^3)\), the space of all bounded and Lipschitz-continuous functions defined on \({\mathbb {R}}^3\). This formulation enables us to consider any p.d. \(\mu _0\) on \(({\mathbb {R}}^3, {\fancyscript{B}}({\mathbb {R}}^3))\) as initial datum, \({\fancyscript{B}}({{\mathbb {R}}}^3)\) standing for the Borel class on \({\mathbb {R}}^3\). The term weak solution designates any family \(\{\mu (\cdot , t)\}_{t \ge 0}\) of p.d.’s on \(({\mathbb {R}}^3, {\fancyscript{B}}({\mathbb {R}}^3))\) such that

  (i)

    \(\mu (\cdot , 0) = \mu _0(\cdot )\);

  (ii)

    \(t \mapsto \int _{{\mathbb {R}}^3}\psi (\mathbf{v}) \mu (\mathrm{d}\mathbf{v}, t)\) belongs to \(\text {C}([0, +\infty )) \cap \text {C}^1((0, +\infty ))\) for all \(\psi \) in \(\text {BL}({\mathbb {R}}^3)\);

  (iii)

    \(\int _{{\mathbb {R}}^3}|\mathbf{v}| \mu (\mathrm{d}\mathbf{v}, t) < +\infty \) for all \(t \ge 0\), if \(b\) is not summable but obeys (4);

  (iv)

    \(\mu (\cdot , t)\) satisfies (5) for all \(t > 0\) and for all \(\psi \) in \(\text {BL}({\mathbb {R}}^3)\).

From now on, the term solution of (1) is to be understood as a weak solution, in the sense of the above definition, of the Cauchy problem with initial datum \(\mu _0\). Tanaka [61] gave a rigorous proof of existence and uniqueness of weak solutions by probabilistic arguments. The sole assumption required on \(\mu _0\), and only when \(b\) is not summable but obeys (4), is the finiteness of the first absolute moment. Concerning uniqueness, see also [62].

It should be recalled that, in the non-cutoff case, existence can be recovered from the existence of the solution to the SHBEMM with cutoff, via a truncation procedure originally introduced in [1]. More precisely, given a non-summable \(b\) satisfying (4) and a p.d. \(\mu _0\) on \(({\mathbb {R}}^3, {\fancyscript{B}}({\mathbb {R}}^3))\) with finite first absolute moment, consider the sequence of collision kernels \(\{[b(x) \wedge n]/B_n\}_{n \ge 1}\), with \(B_n := \int _{0}^{1} [b(x) \wedge n] \text {d}x\). Since \([b(x) \wedge n]/B_n\) satisfies (3), one can find the solution \(\mu _n(\cdot , t)\) to (1), with \(b\) replaced by \([b(x) \wedge n]/B_n\) and initial datum \(\mu _0\). Following [1, 31], it can be shown that \(\mu _n(\cdot , B_n t)\) converges weakly, as \(n\) goes to infinity, to some limit \(\mu (\cdot , t)\) for every \(t \ge 0\), and that \(\mu (\cdot , t)\) turns out to be the solution to the original Cauchy problem. Recall that a sequence \(\{P_n\}_{n \ge 1}\) of p.d.’s on some topological space \(\text {T}\), endowed with its Borel \(\sigma \)-algebra, converges weakly to a p.d. \(P\) on the same space if and only if \(\lim _{n \rightarrow \infty } \int _{\text {T}} h \text {d}P_n = \int _{\text {T}} h \text {d}P\) for every bounded and continuous function \(h\) on \(\text {T}\). Henceforth, this kind of convergence will be denoted by \(P_n \Rightarrow P\).
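The truncation can be illustrated numerically with a hypothetical non-cutoff kernel of our own choosing, say \(b(x) = x^{-3/2}\), which is non-summable on \((0,1)\) yet satisfies (4), since \(\int _0^1 x \cdot x^{-3/2}\,\text {d}x = 2\). For this kernel a short computation gives \(B_n = 3 n^{1/3} - 2\), so the normalizers grow without bound (a sketch, not part of the original argument):

```python
from scipy.integrate import quad

# hypothetical kernel: non-summable near 0, yet int_0^1 x b(x) dx = 2 < inf
b = lambda x: x ** (-1.5)

def B(n):
    """Normalizer B_n = int_0^1 min(b(x), n) dx of the truncated kernel."""
    bp = n ** (-2.0 / 3.0)          # b(x) >= n exactly when x <= n^{-2/3}
    capped = n * bp                 # region where the kernel is capped at level n
    rest = quad(b, bp, 1.0)[0] if bp < 1.0 else 0.0
    return capped + rest

for n in (1, 8, 1000):
    print(n, B(n))                  # B_n = 3 n^{1/3} - 2: 1.0, 4.0, 28.0
```

The divergence of \(B_n\) is exactly what the time-rescaling \(t \mapsto B_n t\) compensates for in the limit procedure described above.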

Apropos of the long-time behavior of \(\mu (\cdot , t)\), a well-known fact is the macroscopic conservation of momentum and kinetic energy, i.e.

$$\begin{aligned} \int \limits _{{\mathbb {R}}^3}\mathbf{v}\mu (\text {d}\mathbf{v}, t) = \int \limits _{{\mathbb {R}}^3}\mathbf{v}\mu _0(\text {d}\mathbf{v}) \quad \text {and} \quad \int \limits _{{\mathbb {R}}^3}|\mathbf{v}|^2 \mu (\text {d}\mathbf{v}, t) = \int \limits _{{\mathbb {R}}^3}|\mathbf{v}|^2 \mu _0(\text {d}\mathbf{v})\quad \end{aligned}$$
(6)

for every \(t \ge 0\), which hold true when the hypothesis

$$\begin{aligned} \int \limits _{{\mathbb {R}}^3}|\mathbf{v}|^2 \mu _0(\text {d}\mathbf{v}) < +\infty \end{aligned}$$
(7)

is in force. Section 8 of [61] is a reference also for the non-cutoff case. Another fundamental fact is that the equilibrium corresponds to the so-called Maxwellian distribution

$$\begin{aligned} \gamma _{\mathbf{v}_0, \sigma ^2}(\text {d}\mathbf{v}) = M_{\mathbf{v}_0, \sigma ^2}(\mathbf{v}) \text {d}\mathbf{v}= \left( \frac{1}{2\pi \sigma ^2}\right) ^{3/2} \exp \left\{ -\frac{1}{2 \sigma ^2} |\mathbf{v}- \mathbf{v}_0|^2\right\} \,\text {d}\mathbf{v}\end{aligned}$$
(8)

which is characterized by the first two moments \(\mathbf{v}_0 = \int _{{\mathbb {R}}^3}\mathbf{v}\mu _0(\text {d}\mathbf{v})\) and \(\sigma ^2 = \frac{1}{3} \int _{{\mathbb {R}}^3}|\mathbf{v}- \mathbf{v}_0|^2 \mu _0(\text {d}\mathbf{v})\). Note that \(\gamma _{\mathbf{v}_0, 0}\) stands for the unit mass \(\delta _{\mathbf{v}_0}\) at \(\mathbf{v}_0\). The already quoted paper [61] proves that, under (4) and (7), \(\mu (\cdot , t)\Rightarrow \gamma _{\mathbf{v}_0, \sigma ^2}\) as \(t\) goes to infinity.
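Since the Maxwellian (8) is the product of three independent one-dimensional normal densities, the moment identities defining \(\mathbf{v}_0\) and \(\sigma ^2\) are easy to confirm by Monte Carlo (an illustrative sketch, with parameters chosen by us):

```python
import numpy as np

rng = np.random.default_rng(1)
v0, sigma = np.array([1.0, -2.0, 0.5]), 0.7

# gamma_{v0, sigma^2} is the law of v0 + sigma * Z, Z standard normal in R^3
V = v0 + sigma * rng.standard_normal((200_000, 3))

print(V.mean(axis=0))                          # ~ v0
print(np.mean(np.sum((V - v0) ** 2, axis=1)))  # ~ 3 sigma^2 = 1.47
```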

1.2 The conjecture and its motivations

Relaxation to equilibrium of solutions to the Boltzmann equation has been at the core of kinetic theory ever since the work of Boltzmann himself. The importance of accurate estimates of the rate of convergence is tightly connected with the question of the physical value of any convergence statement for Boltzmann-equation solutions, relative to the time scale on which the Boltzmann description is relevant. See, for example, Section 2 of Chapter 2C of [65]. Within this framework, a first preliminary question concerns the choice of the topology in which this convergence ought to take place, keeping in mind that one is dealing with convergence of probability measures (p.m.’s). The literature has dealt with a variety of probability metrics, but the total variation distance (t.v.d.) undoubtedly remains a formidable reference for the study of relaxation to equilibrium in kinetic models. Recall that, for any pair \((\alpha , \beta )\) of p.d.’s on some measurable space \((S, {\fancyscript{S}})\), such a distance is defined by

$$\begin{aligned} \text {d}_{\text {TV}}(\alpha , \beta ) := \sup _{B \in {\fancyscript{S}}} | \alpha (B) - \beta (B)| \end{aligned}$$

and that it can be written as

$$\begin{aligned} \text {d}_{\text {TV}}(\alpha , \beta ) = \frac{1}{2} \int \limits _S |p(x) - q(x)| \lambda (\text {d}x) \end{aligned}$$

when \(\lambda \) is any \(\sigma \)-finite measure dominating both \(\alpha \) and \(\beta \), and \(p\), \(q\) are probability density functions w.r.t. \(\lambda \) of \(\alpha \) and \(\beta \), respectively. See Chapter III of [60] for more information. Once the metric has been singled out, a convergence result gains considerable value from knowledge of the rate of approach to the limiting distribution, and even more from a precise bound, at each fixed instant, on the error made in approximating the limit. To introduce the reader to the essential part of the problem, we recall that, for an entire class \({\mathcal {I}}\) of initial data \(\mu _0\), one can prove that

$$\begin{aligned} \text {d}_{\text {TV}}(\mu (\cdot , t), \gamma _{\mathbf{v}_0, \sigma ^2}) \ge C_{*} e^{\varLambda _b t} \end{aligned}$$
(9)

is met with a suitable constant \(C_{*}\) and

$$\begin{aligned} \varLambda _b := -2\int \limits _{0}^{1} x^2 (1 - x^2) b(x)\, \text {d}x. \end{aligned}$$
(10)

This result can be reached from a well-known statement by Ikenberry and Truesdell [46], according to which

$$\begin{aligned} \left| \,\, \int \limits _{{\mathbb {R}}^3}\mathbf{v}^{\varvec{\alpha }} \mu (\text {d}\mathbf{v}, t) - \int \limits _{{\mathbb {R}}^3}\mathbf{v}^{\varvec{\alpha }} \gamma _{\mathbf{v}_0, \sigma ^2}(\text {d}\mathbf{v}) \right| \le C_{\varvec{\alpha }} e^{\varLambda _b t} \end{aligned}$$
(11)

holds true with suitable constants \(C_{\varvec{\alpha }}\), for any multi-index \({\varvec{\alpha }}\) such that \(\int _{{\mathbb {R}}^3}|\mathbf{v}|^{|{\varvec{\alpha }}|} \mu _0(\text {d}\mathbf{v}) < +\infty \). Recently, it has been proved that \({\mathcal {I}}\) contains all the p.d.’s \(\mu _0\) satisfying \(\int _{{\mathbb {R}}^3}e^{i \varvec{\xi }\cdot \mathbf{v}} \mu _0(\text {d}\mathbf{v}) = \int _{{\mathbb {R}}} e^{i |\varvec{\xi }| x} \zeta _0(\text {d}x)\) for every \(\varvec{\xi }\) in \({\mathbb {R}}^3\), where \(\zeta _0\) is a symmetric p.d. on \(({\mathbb {R}}, {\fancyscript{B}}({\mathbb {R}}))\) with non-zero kurtosis coefficient. See [33]. In view of (9), it is then natural to ask whether the reverse relation

$$\begin{aligned} \text {d}_{\text {TV}}(\mu (\cdot , t), \gamma _{\mathbf{v}_0, \sigma ^2}) \le C^{*} e^{\varLambda _b t}. \end{aligned}$$
(12)

Actually, when (9) and (12) are in force simultaneously, \(\varLambda _b\) can be viewed as the best rate of exponential convergence of \(\mu (\cdot , t)\) to equilibrium. The characterization of the largest class of initial data for which (12) is valid is commonly referred to as McKean’s conjecture. The reference to McKean is due to the fact that, relative to the solution \(\mu (\cdot , t)\) of Kac’s well-known simplification of the SHBEMM, he was the first to prove rigorously, in [50], that \(\text {d}_{\text {TV}}(\mu (\cdot , t), \gamma _{\mathbf{v}_0, \sigma ^2}) \le C^{'} e^{\lambda t}\) holds true with \(\lambda \approx -0.016\) and for a suitable constant \(C^{'}\). However, this value of \(\lambda \) is strictly greater than \(\varLambda _b\), equal to \(-1/4\) in the case of Kac’s equation. See [32].
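Given the central role of \(\text {d}_{\text {TV}}\) throughout the paper, a minimal numerical illustration of the density formula above may help; our example (not from the original text) takes two one-dimensional Gaussian laws with Lebesgue measure as the dominating \(\lambda \). For equal variances the known closed form \(2\Phi (|m_1 - m_2|/(2s)) - 1\) serves as a check:

```python
import math
from scipy.integrate import quad

def tv_gauss(m1, m2, s=1.0):
    """(1/2) * int |p - q| dx for N(m1, s^2) versus N(m2, s^2)."""
    p = lambda x, m: math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
    return 0.5 * quad(lambda x: abs(p(x, m1) - p(x, m2)), -12 * s, 12 * s)[0]

# standard normal c.d.f.
Phi = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

print(tv_gauss(0.0, 1.0))      # ~ 0.3829 = 2 * Phi(0.5) - 1
```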

As a completion of the argument, it is interesting to point out the meaning of \(\varLambda _b\) w.r.t. the asymptotic behavior of \(\mu (\cdot , t)\). Besides the important role it plays in (11), \(\varLambda _b\) also represents the least negative eigenvalue of the linearized collision operator

$$\begin{aligned} L_b[h](\mathbf{v})&:= \int \limits _{{\mathbb {R}}^3}\int \limits _{S^2} \ [h({\mathbf{v}_{*}}) + h({\mathbf{w}_{*}}) \ - \ h(\mathbf{v}) - h(\mathbf{w})] \\&\times \, b\left( \frac{\mathbf{w}- \mathbf{v}}{|\mathbf{w}- \mathbf{v}|} \cdot \varvec{\omega }\right) u_{S^2}(\text {d}\varvec{\omega })\gamma _{\mathbf{v}_0, \sigma ^2}(\text {d}\mathbf{w}) \end{aligned}$$

defined on \({\mathcal {H}} := \text {L}^2({\mathbb {R}}^3, \gamma _{\mathbf{v}_0, \sigma ^2}(\text {d}\mathbf{x}))\). Hilbert [44] was the first to derive this operator from a linearization of (1) and to point out the advantage of choosing the domain \({\mathcal {H}}\) with a view to carrying out the spectral analysis. In this Hilbert setting, \(L_b\) turns out to be self-adjoint and negative with discrete spectrum, and \(|\varLambda _b|\) represents the spectral gap. See [28]. Finally, it is worth recalling that \(\varLambda _b\) arises also in Kac-like derivations of the SHBEMM [47], based on a stochastic evolution of an \(N\)-particle system. See [15, 19].
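To get a concrete feel for \(\varLambda _b\) and the spectral gap \(|\varLambda _b|\), take the toy kernel \(b(x) = 2x\) on \((0,1)\) (our choice, used for illustration only; it is compatible with the symmetry condition (2) and satisfies the normalization (3)). Then (10) gives \(\varLambda _b = -4(1/4 - 1/6) = -1/3\):

```python
from scipy.integrate import quad

b = lambda x: 2.0 * x                    # satisfies (3): int_0^1 2x dx = 1

# formula (10): Lambda_b = -2 * int_0^1 x^2 (1 - x^2) b(x) dx
Lambda_b = -2.0 * quad(lambda x: x ** 2 * (1.0 - x ** 2) * b(x), 0.0, 1.0)[0]
print(Lambda_b)                          # -1/3
```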

1.3 A glance at the literature on McKean’s conjecture

The formulation of the Boltzmann H-theorem gave rise to a significant body of mathematical research, aimed at studying the convergence to equilibrium in total variation, whose first rigorous outcomes are in [10, 54]. Nevertheless, in spite of the huge literature on this subject, the number of works which expressly pursued the validation of the conjecture is small. Essentially, four lines of research have been followed to achieve the goal, based on: (1) use of contractive functionals or probability metrics; (2) entropy methods; (3) linearization; (4) the central limit theorem. (1) As for the first line of research, the papers [18, 40, 56, 61, 64] are worth mentioning. In particular, Theorem 1.1 in [18] constitutes the closest result to the McKean conjecture obtained so far. It is valid only under the cutoff condition (3) and states that, for every \(\varepsilon > 0\), there is \(C_{\varepsilon }(\mu _0, b)\) such that

$$\begin{aligned} \text {d}_{\text {TV}}(\mu (\cdot , t), \gamma _{\mathbf{v}_0, \sigma ^2}) \le C_{\varepsilon }(\mu _0, b) e^{(1 - \varepsilon )\varLambda _b t} \end{aligned}$$
(13)

holds for every \(t \ge 0\), but this \(C_{\varepsilon }\) goes to infinity as \(\varepsilon \) goes to zero. Therefore, the presence of \(\varepsilon \), together with such a behavior of \(C_{\varepsilon }(\mu _0, b)\), defeats the hope of extending (13) to the solution of the SHBEMM of non-cutoff type through the truncation argument explained in Sect. 1.1. This is a strong motivation for pursuing a bound with \(\varepsilon = 0\) and a constant \(C(\mu _0)\), depending only on \(\mu _0\), in place of \(C_{\varepsilon }(\mu _0, b)\). Moreover, (13) has been deduced under rather strong conditions on \(\mu _0(\text {d}\mathbf{x}) = f_0(\mathbf{x}) \text {d}\mathbf{x}\), such as finiteness of all absolute moments, Sobolev regularity and finiteness of the Linnik functional. (2) Entropy methods aim at proving quantitative H-theorems, on the basis of the seminal ideas introduced in [11, 12]. An attempt to improve this strategy, towards the achievement of the McKean conjecture, was the Cercignani conjecture which, however, proved to be false in the case of Maxwellian molecules. See [9, 66]. Even so, quantitative H-theorems are still considered the most powerful strategy to study relaxation to equilibrium in non-homogeneous frameworks. See [26]. (3) The linearization strategy is outlined in [29, 42]. It gives general positive answers to the problem of quantifying the relaxation to equilibrium only when the solution enters a small neighborhood of the equilibrium itself, so that the spectral analysis of \(L_b\), as an operator on \({\mathcal {H}}\), becomes relevant to the study of the nonlinear problem. It is only recently that, in the case of the homogeneous Boltzmann equation with hard potentials, linearization has been used successfully to prove the conjecture. See [55].
However, the radical difference between the situation of hard potentials and that of Maxwellian molecules hampers a direct extension of the positive conclusion from the former to the latter. (4) Finally, the link with the central limit theorem discovered by McKean in [50, 51] has been taken into serious consideration only recently in [13, 14], two works which have strongly inspired and motivated our program.

1.4 The main result

A precise and complete formulation is encapsulated in the following theorem where \(\hat{\mu }\) stands for the Fourier transform of the p.d. \(\mu \) on \(({\mathbb {R}}^3, {\fancyscript{B}}({\mathbb {R}}^3))\), namely \(\hat{\mu }(\varvec{\xi }) := \int _{{\mathbb {R}}^3}e^{i \varvec{\xi }\cdot \mathbf{v}} \mu (\text {d}\mathbf{v})\) for \(\varvec{\xi }\) in \({\mathbb {R}}^3\).

Theorem 1

Assume that (2) and (4) are in force and that the initial datum \(\mu _0\) satisfies

$$\begin{aligned} {\mathfrak {m}}_4 := \int \limits _{{\mathbb {R}}^3}|\mathbf{v}|^4 \mu _0(\text {d}\mathbf{v}) < +\infty \end{aligned}$$
(14)

and

$$\begin{aligned} |\hat{\mu }_0(\varvec{\xi })| = o(|\varvec{\xi }|^{-p}) \quad (|\varvec{\xi }| \rightarrow +\infty ) \end{aligned}$$
(15)

for some strictly positive \(p\). Then, the solution \(\mu (\cdot , t)\) meets

$$\begin{aligned} {\mathrm {d}}_{\mathrm {TV}}(\mu (\cdot , t), \gamma _{\mathbf{v}_0, \sigma ^2}) \le C(\mu _0) e^{\varLambda _b t} \end{aligned}$$
(16)

for every \(t \ge 0\), where \(\varLambda _b\) is given by (10) and \(C(\mu _0)\) is a positive constant which depends only on \(\mu _0\).

Indications for the numerical evaluation of \(C(\mu _0)\) can be derived from specific passages of the proof, in Sect. 2.2. With reference to the SHBEMM with cutoff, this theorem represents the first direct validation of the McKean conjecture, free of unnecessary extra conditions. Moreover, as far as the non-cutoff case is concerned, the same theorem provides, to the best of our knowledge, the only existing sharp quantification of the speed of convergence to equilibrium. A detailed explanation of these points is given in the following

Remarks

  1.

    The proof of Theorem 1 will be developed, in Sect. 2.2, under the cutoff condition (3). Indeed, once (16) has been established under (3), one can resort to the truncation procedure described in Sect. 1.1 to write, for every \(n\) in \({\mathbb {N}}\),

    $$\begin{aligned} \text {d}_{\text {TV}}(\mu _n(\cdot , B_n t), \gamma _{\mathbf{v}_0, \sigma ^2}) \le C(\mu _0) \exp \left\{ -2 t \int \limits _{0}^{1} x^2(1 - x^2) [b(x) \wedge n] \text {d}x\right\} . \end{aligned}$$

    Now, the combination of this inequality with

    $$\begin{aligned} \text {d}_{\text {TV}}(\mu (\cdot , t), \gamma _{\mathbf{v}_0, \sigma ^2}) \le \liminf _{n \rightarrow \infty } \text {d}_{\text {TV}}(\mu _n(\cdot , B_n t), \gamma _{\mathbf{v}_0, \sigma ^2}) \end{aligned}$$

    leads to the desired conclusion.

  2.

    Let us now discuss assumption (14). It is interesting to recall that, under the cutoff condition, convergence in the total variation metric to the Maxwellian holds under (7). See [20]. The necessity of this condition, in a cutoff setting, is stated both in [16] and in Theorem 3 of the present paper. In [20] it is also shown that convergence to equilibrium, under the sole assumption of finiteness of the second moment of \(\mu _0\), can be arbitrarily slow, whereas the finiteness of the \((2+\delta )\)-th absolute moment, for some \(\delta > 0\), is enough to get exponentially decreasing bounds. Nevertheless, if \(\delta < 2\), these bounds can be worse than the one conjectured by McKean. Here is an example showing that, even when the tail condition (15) is fulfilled, the desired bound can fail because of arbitrarily small deviations from hypothesis (14). Consider the class of initial data \(\mu ^{(q)}_{0}(\text {d}\mathbf{v}) = f_{0, q}(\mathbf{v}) \text {d}\mathbf{v}\) with

    $$\begin{aligned} f_{0, q}(\mathbf{v}) = \frac{q}{4\pi \ |\mathbf{v}|^{3 + q}} 1\!\!1_{\{|\mathbf{v}| \ge 1\}} \end{aligned}$$

    for \(q\) in \((3, 4)\). The Fourier transform of this density at \(\varvec{\xi }\) is

    $$\begin{aligned} 1 - \frac{q}{6 (q - 2)} |\varvec{\xi }|^2 - \frac{\varGamma (1 - q) \cos (q \pi /2)}{1+q} |\varvec{\xi }|^q - q \sum _{m \ge 2} \frac{(-1)^m |\varvec{\xi }|^{2m}}{(2m + 1)! (2m - q)} \end{aligned}$$

    which meets \(\hat{\mu }^{(q)}_{0}(\varvec{\xi }) = O(|\varvec{\xi }|^{-1})\) when \(|\varvec{\xi }|\) goes to infinity. Then, \(\mu ^{(q)}_{0}\) satisfies (15) and has finite absolute moment of order \((3 + \delta )\) for every \(\delta \) in \((0, q - 3)\), but has infinite absolute fourth moment. Denoting by \(\mu ^{(q)}(\cdot , t)\) the solution of (1) relative to \(\mu ^{(q)}_{0}\), one can mimic the argument explained in [33] to prove that

    $$\begin{aligned} \text {d}_{\text {TV}}(\mu ^{(q)}(\cdot , t), \gamma _{\mathbf{0}, \sigma ^2}) \ge C_q \exp \{-(1 - 2l_q(b)) t\} \end{aligned}$$

    holds for every \(t \ge 0\), where \(3\sigma ^2 = q/(q - 2)\), \(C_q\) is a strictly positive constant independent of \(b\), \(l_q(b) := \int _{0}^{1} (1 - x^2)^{q/2} b(x) \text {d}x\) and \(\varLambda _b < -(1 - 2l_q(b)) < 0\).

  3.

    As far as the tail assumption (15) is concerned, it is worth noting that it is implied by the finiteness of the Linnik functional, according to Lemma 2.3 in [18]. Also worth noting is the relationship between (15) and certain regularity conditions adopted to guarantee the validity of classical local limit theorems of probability theory. See, for example, Theorem 19.1 in [7].
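As a numerical complement to Remark 2 (radial integrals only; the value q = 3.5 is our arbitrary choice in \((3,4)\)): \(f_{0,q}\) integrates to 1 and has second moment \(q/(q-2) = 3\sigma ^2\), while its fourth moment diverges:

```python
import math
from scipy.integrate import quad

q = 3.5                                   # any value in (3, 4)

# in spherical coordinates the surface factor 4*pi*r^2 cancels the 4*pi in
# f_{0,q}, leaving the radial density q * r^{-1-q} on [1, infinity)
mass   = quad(lambda r: q * r ** (-1.0 - q), 1.0, math.inf)[0]
second = quad(lambda r: q * r ** (1.0 - q), 1.0, math.inf)[0]

print(mass)                               # 1.0: f_{0,q} is a probability density
print(second)                             # q / (q - 2) = 7/3 = 3 * sigma^2
# the fourth radial moment has integrand q * r^{3-q}, with 3 - q > -1:
# it is not integrable at infinity, so (14) indeed fails
```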

1.5 A probabilistic representation of the solution

The proof of Theorem 1 relies on a representation of the solution \(\mu (\cdot , t)\), already proposed and studied in [27], which is valid under the cutoff condition (3). The motivation for this representation is twofold. On the one hand, it leads us to study the problem of convergence to equilibrium from the standpoint of the central limit problem of probability theory. On the other hand, it makes explicitly computable certain derivatives of the Fourier transform of \(\mu (\cdot , t)\) which are involved in the first steps of the proof of Theorem 1. See, for example, (58) below. It should be mentioned that the existing representations, essentially based on the Bobylev identity (see Section 3 of [8]), turn out to be unfit for the aforesaid computations.

In a nutshell, the probabilistic representation at issue states that

$$\begin{aligned} \mu (B, t) = \mathsf E _t \left[ {\mathcal {M}}(B)\right] \end{aligned}$$
(17)

for every \(t \ge 0\) and every \(B\) in \({\fancyscript{B}}({{\mathbb {R}}}^3)\), where \({\mathsf{E}}_t\) is an expectation and \({\mathcal {M}}\) is a random p.m. connected with a distinguished weighted sum of random vectors, to be defined below. Here and in the rest of the paper, we use the term random p.m. to designate any measurable function from some measurable space into the space \(\mathcal {P}({\mathbb {R}}^3)\) of all p.m.’s on \(({\mathbb {R}}^3, {\fancyscript{B}}({\mathbb {R}}^3))\), endowed with the Borel \(\sigma \)-algebra induced by the topology of weak convergence of p.m.’s. See, e.g., Chapters 11–12 of [48] for further details. Then, to carry out our programme, it remains to provide the reader with those definitions and preliminary results which are necessary to understand (17). In this way, we shall also present the core of the notation used in the rest of the paper.

The starting point is the introduction of the sample space

$$\begin{aligned} \Omega := {\mathbb {N}} \times {\mathbb {T}} \times [0, \pi ]^{\infty } \times (0, 2\pi )^{\infty } \times ({\mathbb {R}}^3)^{\infty } \end{aligned}$$

where: For any nonempty set \(X\), \(X^{\infty }\) stands for the set of all sequences \((x_1, x_2, \ldots )\) whose elements belong to \(X\); \({\mathbb {T}} := \mathsf{X}_{n \ge 1}\, {\mathbb {T}}(n)\) is the Cartesian product of the sets \({\mathbb {T}}(n)\), \({\mathbb {T}}(n)\) being the (finite) set of all McKean binary trees with \(n\) leaves. We write \({\mathfrak {t}}_n\) to denote an element of \({\mathbb {T}}(n)\). Then, \({\mathfrak {t}}_{n, k}\) indicates the germination of \({\mathfrak {t}}_n\) at its \(k\)-th leaf, obtained by appending a two-leaved tree to the \(k\)-th leaf of \({\mathfrak {t}}_n\). Finally, \({\mathfrak {t}}_n^l\) and \({\mathfrak {t}}_n^r\) symbolize the two trees, of \(n_l\) and \(n_r\) leaves respectively, obtained by a split-up of \({\mathfrak {t}}_n\). See [13, 14, 37, 50, 51] for a more detailed explanation of these concepts, and [34] for a recent and comprehensive treatment of random trees.

Then, associate with \(\Omega \) the \(\sigma \)-algebra

$$\begin{aligned} {\fancyscript{F}} := 2^{{\mathbb {N}}} \otimes 2^{{\mathbb {T}}} \otimes {\fancyscript{B}}([0, \pi ]^{\infty }) \otimes {\fancyscript{B}}((0, 2\pi )^{\infty }) \otimes {\fancyscript{B}}(({\mathbb {R}}^3)^{\infty }) \end{aligned}$$

where \(2^X\) stands for the power set of \(X\) and \({\fancyscript{B}}(X)\) for the Borel class on \(X\). Define

$$\begin{aligned} \nu ,\ \{\tau _n\}_{n \ge 1},\ \{\phi _n\}_{n \ge 1},\ \{\vartheta _n\}_{n \ge 1}, \{\mathbf{V}_n\}_{n \ge 1} \end{aligned}$$

to be the coordinate random variables of \(\Omega \) and, by them, generate the \(\sigma \)-algebras

$$\begin{aligned} {\fancyscript{G}}&{:=} \sigma \left( \nu ,\ \{\tau _n\}_{n \ge 1},\ \{\phi _n\}_{n \ge 1} \right) \\ {\fancyscript{H}}&{:=} \sigma \left( \nu ,\ \{\tau _n\}_{n \ge 1},\ \{\phi _n\}_{n \ge 1},\ \{\vartheta _n\}_{n \ge 1}\right) \!. \end{aligned}$$

Now, for every \(t \ge 0\), consider the unique p.d. \({\mathsf{P}}_t\) on \((\Omega , {\fancyscript{F}})\) which makes the random coordinates stochastically independent, consistently with the following marginal p.d.’s:

  (a)
    $$\begin{aligned} {\mathsf{P}}_t[ \nu = n ] = e^{-t}(1 - e^{-t})^{n-1} \quad \ (n = 1, 2, \ldots ) \end{aligned}$$
    (18)

    with the proviso that \(0^0 := 1\).

  (b)

    \(\{\tau _n\}_{n \ge 1}\) is a Markov sequence driven by

    $$\begin{aligned} \begin{array}{l} {\mathsf{P}}_t[\tau _1 = {\mathfrak {t}}_1] = 1 {} \\ {\mathsf{P}}_t[ \tau _{n+1} = {\mathfrak {t}}_{n, k}\ | \ \tau _n = {\mathfrak {t}}_n] = \frac{1}{n} \quad \ \text {for}\ k = 1, \ldots , n \\ {\mathsf{P}}_t[ \tau _{n+1} = {\mathfrak {t}}_{n+1} \ | \ \tau _n = {\mathfrak {t}}_n] = 0 \quad \text {if}\ {\mathfrak {t}}_{n+1} \not \in {\mathbb {G}}({\mathfrak {t}}_n) \end{array} \end{aligned}$$
    (19)

    for every \(n\) in \({\mathbb {N}}\) and \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\), where, for a given \({\mathfrak {t}}_n\), \({\mathbb {G}}({\mathfrak {t}}_n)\) is the subset of \({\mathbb {T}}(n+1)\) containing all the germinations of \({\mathfrak {t}}_n\).

  (c)

    The elements of \(\{\phi _n\}_{n \ge 1}\) are i.i.d. random numbers with p.d.

    $$\begin{aligned} \beta (\text {d}\varphi ) := \frac{1}{2} b(\cos \varphi ) \sin \varphi \text {d}\varphi , \quad (\varphi \in [0, \pi ]). \end{aligned}$$
    (20)
  4. (d)

    The elements of \(\{\vartheta _n\}_{n \ge 1}\) are i.i.d. with uniform p.d. on \((0, 2\pi )\), \(u_{(0, 2\pi )}\).

  5. (e)

    The elements of \(\{\mathbf{V}_n\}_{n \ge 1}\) are i.i.d. with p.d. \(\mu _0\), the initial datum of the Cauchy problem relative to (1).

According to the above notation, \({\mathsf{E}}_t\) denotes expectation w.r.t. \({\mathsf{P}}_t\).
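As a quick numerical sanity check (our own illustration, not part of the argument), the law (18) of \(\nu \) is geometric with success probability \(e^{-t}\), hence a proper p.d. with mean \(e^t\); a minimal sketch in Python:

```python
import math

t = 2.0
p = math.exp(-t)  # success probability in (18)

# partial sums of P_t[nu = n] = p (1 - p)^{n-1}, n = 1, 2, ...
total = sum(p * (1.0 - p) ** (n - 1) for n in range(1, 5000))
mean = sum(n * p * (1.0 - p) ** (n - 1) for n in range(1, 5000))

print(abs(total - 1.0) < 1e-10)        # (18) is a proper p.d.
print(abs(mean - math.exp(t)) < 1e-6)  # E_t[nu] = e^t
```

The truncation at 5000 terms is harmless here since the geometric tail decays exponentially.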

A constituent of the representation under study is \({\varvec{\pi }} := \{\pi _{j, n}\ | \ j = 1, \ldots , n; n \in {\mathbb {N}}\}\), an array of \([-1,1]\)-valued random numbers. They are obtained by setting

$$\begin{aligned} \pi _{j, n} := \pi _{j, n}^{*}(\tau _n, (\phi _1, \ldots , \phi _{n-1})) \end{aligned}$$
(21)

for \(j = 1, \dots , n\) and \(n\) in \({\mathbb {N}}\). The \(\pi _{j, n}^{*}\)’s are functions on \({\mathbb {T}}(n) \times [0, \pi ]^{n-1}\) defined by putting \(\pi _{1, 1}^{*} \equiv 1\) and, for \(n \ge 2\),

$$\begin{aligned} \pi _{j, n}^{*}({\mathfrak {t}}_n, {\varvec{\varphi }}) := \left\{ \begin{array}{l@{\quad }l} \pi _{j, n_l}^{*}({\mathfrak {t}}_{n}^{l}, {\varvec{\varphi }}^l) \cos \varphi _{n-1} &{} \text {for} \ j = 1, \dots , n_l \\ \pi _{j - n_l, n_r}^{*}({\mathfrak {t}}_{n}^{r}, {\varvec{\varphi }}^r) \sin \varphi _{n-1} &{} \text {for} \ j = n_l + 1, \dots , n \end{array} \right. \end{aligned}$$
(22)

for every \({\varvec{\varphi }} = ({\varvec{\varphi }}^l, {\varvec{\varphi }}^r, \varphi _{n-1})\) in \([0, \pi ]^{n-1}\), with

$$\begin{aligned} {\varvec{\varphi }}^l := (\varphi _1, \dots , \varphi _{n_l-1})\quad \text {and} \quad {\varvec{\varphi }}^r := (\varphi _{n_l}, \dots , \varphi _{n-2}). \end{aligned}$$

An induction argument shows that

$$\begin{aligned} \sum _{j=1}^{n} \pi _{j, n}^{2} = 1 \end{aligned}$$
(23)

for every \(n\) in \({\mathbb {N}}\). It is also worth recalling the identity

$$\begin{aligned} {\mathsf{E}}_t\left[ \sum _{j = 1}^{\nu } |\pi _{j, \nu }|^s \right] = e^{-(1 - 2 l_s(b))t} \end{aligned}$$
(24)

valid for every \(t, s > 0\), with \(l_s(b) := \int _{0}^{1} (1 - x^2)^{s/2} b(x) \text {d}x\). The original derivation is in [37] but, for the sake of completeness, we have included its proof in Appendix A.1. Throughout the paper, A.\(n\) designates the \(n\)-th subsection of the Appendix. With a view to the proof of Theorem 1, it is interesting to point out that

$$\begin{aligned} -(1 - 2l_4(b)) = \varLambda _b. \end{aligned}$$
(25)
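To see the recursion (22) in action, the following minimal Python sketch (our own illustration; the tree encoding and identifiers are ours, with a leaf written as `None` and an internal node as a `(left, right)` pair) builds the coefficients \(\pi _{j, n}^{*}\) for one tree with five leaves and checks the identity (23):

```python
import math

# A binary tree: a leaf is None, an internal node a (left, right) pair.
def n_leaves(t):
    return 1 if t is None else n_leaves(t[0]) + n_leaves(t[1])

def pi_coeffs(t, phis):
    """Coefficients pi*_{j,n}(t, phis) via recursion (22); len(phis) == n_leaves(t) - 1."""
    if t is None:
        return [1.0]
    nl = n_leaves(t[0])
    left = pi_coeffs(t[0], phis[:nl - 1])     # phi^l = (phi_1, ..., phi_{nl-1})
    right = pi_coeffs(t[1], phis[nl - 1:-1])  # phi^r = (phi_{nl}, ..., phi_{n-2})
    return ([c * math.cos(phis[-1]) for c in left] +
            [c * math.sin(phis[-1]) for c in right])

tree = ((None, None), (None, (None, None)))  # a tree with n = 5 leaves
phis = [0.3, 0.7, 1.1, 2.0]                  # n - 1 angles in [0, pi]
pis = pi_coeffs(tree, phis)
print(len(pis), abs(sum(p * p for p in pis) - 1.0) < 1e-12)  # -> 5 True, i.e. (23)
```

The identity (23) emerges because, at each internal node, the left coefficients are scaled by \(\cos \varphi \) and the right ones by \(\sin \varphi \), so the sum of squares is preserved.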

Another constituent of the desired representation is the array \({\mathbf{O }} := \{\text {O}_{j, n} | j = 1, \ldots , n; n \in {\mathbb {N}}\}\) of random matrices \(\text {O}_{j, n}\), taking values in the Lie group \({\mathbb {SO}}(3)\) of orthogonal matrices with positive determinant, defined by

$$\begin{aligned} \text {O}_{j, n} := \text {O}_{j, n}^{*}(\tau _n, (\phi _1, \dots , \phi _{n-1}), (\vartheta _1, \dots , \vartheta _{n-1})) \end{aligned}$$
(26)

for \(j = 1, \dots , n\) and \(n\) in \({\mathbb {N}}\). The \(\text {O}_{j, n}^{*}\)’s are \({\mathbb {SO}}(3)\)-valued functions obtained by putting \(\text {O}_{1, 1}^{*} \equiv \text {Id}_{3 \times 3}\) and, for \(n \ge 2\),

$$\begin{aligned} \text {O}_{j, n}^{*}({\mathfrak {t}}_n, {\varvec{\varphi }}, {\varvec{\theta }}) := \left\{ \begin{array}{l@{\quad }l} \text {M}^l(\varphi _{n-1}, \theta _{n-1}) \text {O}_{j, n_l}^{*}({\mathfrak {t}}_n^l, {\varvec{\varphi }}^l, {\varvec{\theta }}^l) &{} \text {for} \ j = 1, \dots , n_l \\ \text {M}^r(\varphi _{n-1}, \theta _{n-1}) \text {O}_{j - n_l, n_r}^{*}({\mathfrak {t}}_n^r, {\varvec{\varphi }}^r, {\varvec{\theta }}^r) &{} \text {for} \ j = n_l + 1, \dots , n \end{array} \right. \nonumber \\ \end{aligned}$$
(27)

for every \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\), \({\varvec{\varphi }}\) in \([0, \pi ]^{n-1}\) and \({\varvec{\theta }}\) in \((0, 2\pi )^{n-1}\). Here

$$\begin{aligned} {\varvec{\theta }}^l := (\theta _1, \dots , \theta _{n_l-1})\quad \text {and} \quad {\varvec{\theta }}^r := (\theta _{n_l}, \dots , \theta _{n-2}) \end{aligned}$$

and, finally,

$$\begin{aligned} \text {M}^l(\varphi , \theta ) := \left( \begin{array}{c@{\quad }c@{\quad }c} -\cos \theta \cos \varphi &{} \sin \theta &{} \cos \theta \sin \varphi \\ - \sin \theta \cos \varphi &{} -\cos \theta &{} \sin \theta \sin \varphi \\ \sin \varphi &{} 0 &{} \cos \varphi \\ \end{array} \right) \end{aligned}$$
$$\begin{aligned} \text {M}^r(\varphi , \theta ) := \left( \begin{array}{c@{\quad }c@{\quad }c} \sin \theta &{} \cos \theta \sin \varphi &{} -\cos \theta \cos \varphi \\ -\cos \theta &{} \sin \theta \sin \varphi &{} - \sin \theta \cos \varphi \\ 0 &{} \cos \varphi &{} \sin \varphi \\ \end{array} \right) . \end{aligned}$$
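Since the recursion (27) composes only the matrices \(\text {M}^l\) and \(\text {M}^r\), a direct numerical check that both lie in \({\mathbb {SO}}(3)\) for arbitrary angles confirms that every \(\text {O}_{j, n}^{*}\) is a rotation. A sketch of ours in plain Python:

```python
import math

def M(kind, phi, th):
    """The matrices M^l and M^r above; kind is 'l' or 'r'."""
    ct, st = math.cos(th), math.sin(th)
    cf, sf = math.cos(phi), math.sin(phi)
    if kind == 'l':
        return [[-ct * cf, st, ct * sf], [-st * cf, -ct, st * sf], [sf, 0.0, cf]]
    return [[st, ct * sf, -ct * cf], [-ct, st * sf, -st * cf], [0.0, cf, sf]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(A):
    return [list(r) for r in zip(*A)]

def det(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

phi, th = 0.8, 2.5
for kind in ('l', 'r'):
    A = M(kind, phi, th)
    P = matmul(A, transpose(A))
    ortho = all(abs(P[i][j] - (1.0 if i == j else 0.0)) < 1e-12
                for i in range(3) for j in range(3))
    print(kind, ortho, abs(det(A) - 1.0) < 1e-12)  # orthogonality and det = 1
```

Orthogonality and unit determinant can of course also be verified by hand from the trigonometric identities.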

Working out the recursion formula (27) gives

$$\begin{aligned} \text {O}_{j, n}^{*}({\mathfrak {t}}_n, {\varvec{\varphi }}, {\varvec{\theta }}) = \prod _{h = 1}^{\delta _j({\mathfrak {t}}_n)} \text {M}^{\epsilon _h({\mathfrak {t}}_n, j)}(\varphi _{m_h({\mathfrak {t}}_n, j)}, \theta _{m_h({\mathfrak {t}}_n, j)}) \end{aligned}$$
(28)

where \(\prod _{h=1}^{n} A_h := A_1 \times \dots \times A_n\) and \(\delta _j({\mathfrak {t}}_n)\) indicates the depth of the \(j\)-th leaf of \({\mathfrak {t}}_n\), that is the number of generations separating this leaf from the root (the top node of the tree). The \(\epsilon _h({\mathfrak {t}}_n, j)\)’s take values in \(\{l, r\}\) and, in particular, \(\epsilon _1({\mathfrak {t}}_n, j)\) equals \(l\) (\(r\), respectively) if \(j \le n_l\) (\(j > n_l\), respectively). Then,

$$\begin{aligned} \epsilon _h({\mathfrak {t}}_n, j) := \left\{ \begin{array}{l@{\quad }l} \epsilon _{h-1}({\mathfrak {t}}_n^l, j) &{} \text {for} \ j = 1, \dots , n_l \\ \epsilon _{h-1}({\mathfrak {t}}_n^r, j - n_l) &{} \text {for} \ j = n_l + 1, \dots , n \end{array} \right. \end{aligned}$$

when \(h \ge 2\). Each \(m_h\) belongs to \(\{1, \dots , n-1\}\), and the indices \(m_1, \dots , m_{\delta _j({\mathfrak {t}}_n)}\) are pairwise distinct. In fact, \(m_1({\mathfrak {t}}_n, j) := n-1\) for every \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\), \(j = 1, \dots , n\), and

$$\begin{aligned} m_h({\mathfrak {t}}_n, j) := \left\{ \begin{array}{l@{\quad }l} m_{h-1}({\mathfrak {t}}_n^l, j) &{} \text {for} \ j = 1, \dots , n_l \\ m_{h-1}({\mathfrak {t}}_n^r, j - n_l) &{} \text {for} \ j = n_l + 1, \dots , n \end{array} \right. \end{aligned}$$

when \(h \ge 2\).

Now, choose a non-random measurable function \(\text {B}\) from \(S^2\) into \({\mathbb {SO}}(3)\) such that \(\text {B}(\mathbf{u}) \mathbf{e}_3 = \mathbf{u}\) for every \(\mathbf{u}\) in \(S^2\), and define the random functions \(\varvec{\psi }_{j, n} : S^2 \rightarrow S^2\) through the relation

$$\begin{aligned} \varvec{\psi }_{j, n}(\mathbf{u}) := \text {B}(\mathbf{u}) \text {O}_{j, n} \mathbf{e}_3 \end{aligned}$$
(29)

for \(j = 1, \dots , n\) and \(n\) in \({\mathbb {N}}\), with \(\mathbf{e}_3 := (0, 0, 1)^t\). It should be noted that such a \(\text {B}\) actually exists, but that it cannot be continuous. See, e.g., Chapter 5 of [45].
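For concreteness, here is one admissible choice of \(\text {B}\) (a sketch of ours via a single Gram-Schmidt step, not the construction in [45]): it completes \(\mathbf{u}\) to a positively oriented orthonormal basis \(\{\mathbf{a}(\mathbf{u}), \mathbf{b}(\mathbf{u}), \mathbf{u}\}\). The branch on the first coordinate makes the map measurable; consistently with the remark above, no continuous choice exists.

```python
import math

def B(u):
    """One measurable (necessarily discontinuous) choice of B with B(u) e3 = u.
    Columns: a(u), b(u) = u x a(u), u.  (Our own illustration.)"""
    v = (1.0, 0.0, 0.0) if abs(u[0]) <= 0.9 else (0.0, 1.0, 0.0)
    d = sum(vi * ui for vi, ui in zip(v, u))       # v . u
    a = [vi - d * ui for vi, ui in zip(v, u)]      # Gram-Schmidt step
    na = math.sqrt(sum(x * x for x in a))
    a = [x / na for x in a]
    b = [u[1] * a[2] - u[2] * a[1],                # b = u x a
         u[2] * a[0] - u[0] * a[2],
         u[0] * a[1] - u[1] * a[0]]
    return [[a[i], b[i], u[i]] for i in range(3)]  # columns a, b, u

u = (1 / 3, 2 / 3, -2 / 3)
Bu = B(u)
e3_image = [row[2] for row in Bu]  # B(u) e3 is the third column
print(all(abs(e3_image[i] - u[i]) < 1e-12 for i in range(3)))
```

Since \(\mathbf{b} = \mathbf{u} \times \mathbf{a}\) with \(\mathbf{a} \perp \mathbf{u}\), the resulting matrix is orthogonal with determinant \(+1\).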

The central object of our construction is the random sum

$$\begin{aligned} S(\mathbf{u}) := \sum _{j = 1}^{\nu } \pi _{j, \nu } \varvec{\psi }_{j, \nu }(\mathbf{u}) \cdot \mathbf{V}_j \end{aligned}$$
(30)

whose characteristic function (c.f.) underlies the new representation, according to

Theorem 2

Assume that (2)–(3) are in force. Then, the function

$$\begin{aligned} \hat{{\mathcal {M}}}(\varvec{\xi })&:= {\mathrm {E}}_t \left[ e^{i \rho S(\mathbf{u})} \ | \ {\fancyscript{G}} \right] \nonumber \\&= \left\{ \begin{array}{l@{\quad }l} \hat{\mu }_0(\varvec{\xi }) &{} \text {if} \ \nu = 1 \\ \int _{(0, 2\pi )^{\nu - 1}} \left[ \prod _{j = 1}^{\nu } \hat{\mu }_0\left( \rho \pi _{j, \nu }{\mathrm {B}}(\mathbf{u})\text {O}_{j, \nu }^{*}(\tau _{\nu }, {\varvec{\phi }}, {\varvec{\theta }}) \mathbf{e}_3\right) \right] u_{(0, 2\pi )}^{\otimes _{\nu - 1}}({\mathrm {d}} {\varvec{\theta }}) &{} \text {if} \ \nu \ge 2, \end{array} \right. \nonumber \\ \end{aligned}$$
(31)

with \(\varvec{\xi }\) in \({\mathbb {R}}^3{\setminus }\{\mathbf{0}\}\), \(\rho := |\varvec{\xi }|\), \(\mathbf{u}:= \varvec{\xi }/|\varvec{\xi }|\) and \({\varvec{\phi }} := (\phi _1, \dots , \phi _{\nu - 1})\), is the Fourier transform of a random p.d. on \(({\mathbb {R}}^3, {\fancyscript{B}}({\mathbb {R}}^3))\), denoted by \({\mathcal {M}}\). This \({\mathcal {M}}\) turns out to be independent of the choice of \({\mathrm {B}}\), and satisfies (17) for every \(t \ge 0\) and \(B\) in \({\fancyscript{B}}({{\mathbb {R}}}^3)\).

The proof of the theorem is contained in Sect. 2.1. Many relevant properties of \({\mathcal {M}}\) rely on the analysis of the random function

$$\begin{aligned} \hat{{\mathcal {N}}}(\rho ; \mathbf{u}) := {\mathsf{E}}_t \left[ e^{i \rho S(\mathbf{u})} \ | \ {\fancyscript{H}} \right] = \prod _{j = 1}^{\nu } \hat{\mu }_0(\rho \pi _{j, \nu } \varvec{\psi }_{j, \nu }(\mathbf{u})) \end{aligned}$$
(32)

which, as a function of \(\varvec{\xi }\), is not a c.f. and depends on the choice of \(\text {B}\).

One of the merits of representation (17) is that it allows the formulation of a central limit-like theorem for the asymptotic behavior of the solution of the SHBEMM with cutoff, condensed in the following

Theorem 3

When (2)–(3) are in force, \(\mu (\cdot , t)\) converges weakly as \(t\) goes to infinity if and only if (7) holds true. Moreover, in case this condition is satisfied, the limiting distribution is given by (8).

As already mentioned at the beginning of Remark 2, this theorem is well-known. In fact, the “if” part was proved in [20, 61], while the “only if” part was proved, in a quite different way, in [16]. What is new is the proof we develop in Sect. 2.3 on the basis of (17).

2 Proofs

In this section, we present the skeleton of the proofs of Theorems 1, 2 and 3. Some technical issues are deferred to the Appendix and to [30, 31]. We start from the basic representation formulated in Theorem 2.

2.1 Proof of Theorem 2

When (2)–(3) are in force, recall that \(\mu (\cdot , t)\) can be expressed by means of the so-called Wild-McKean sum [51, 67], namely

$$\begin{aligned} \mu (B, t) = \sum _{n = 1}^{+\infty } e^{-t} (1 - e^{-t})^{n-1} \sum _{{\mathfrak {t}}_n\in {\mathbb {T}}(n)} p_n({\mathfrak {t}}_n) {\mathcal {Q}}_{{\mathfrak {t}}_n}[\mu _0](B) \end{aligned}$$

for every \(t \ge 0\) and \(B\) in \({\fancyscript{B}}({{\mathbb {R}}}^3)\). According to McKean, the weights \(p_n({\mathfrak {t}}_n)\) are defined inductively starting from \(p_1({\mathfrak {t}}_1) := 1\) and then putting

$$\begin{aligned} p_n({\mathfrak {t}}_n) := \frac{1}{n-1} p_{n_l}({\mathfrak {t}}_n^l) p_{n_r}({\mathfrak {t}}_n^r) \end{aligned}$$
(33)

for every \(n \ge 2\) and \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\). These \(p_n\)’s are connected with the p.d. of \(\{\tau _n\}_{n \ge 1}\) through the identity

$$\begin{aligned} p_n({\mathfrak {t}}_n) = {\mathsf{P}}_t[\tau _n = {\mathfrak {t}}_n] \end{aligned}$$
(34)

valid for every \(n\) in \({\mathbb {N}}\) and \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\). See Appendix A.2 for the proof. As far as the \({\mathcal {Q}}_{{\mathfrak {t}}_n}\)’s are concerned,

$$\begin{aligned} \begin{array}{lll} {\mathcal {Q}}_{{\mathfrak {t}}_1}[\mu _0] &{}:= \mu _0 &{} {} \\ {\mathcal {Q}}_{{\mathfrak {t}}_n}[\mu _0] &{}:= {\mathcal {Q}}\left[ {\mathcal {Q}}_{{\mathfrak {t}}_n^l}[\mu _0], {\mathcal {Q}}_{{\mathfrak {t}}_n^r}[\mu _0]\right] &{} \ \ \ \ \text {for} \ n \ge 2 \end{array} \end{aligned}$$

where \({\mathcal {Q}}\) is an operator which sends a pair \((\zeta , \eta )\) belonging to \(\mathcal {P}({\mathbb {R}}^3)\times \mathcal {P}({\mathbb {R}}^3)\) into a new element \({\mathcal {Q}}[\zeta , \eta ]\) of \(\mathcal {P}({\mathbb {R}}^3)\) according to the following rule. First, take two sequences \(\{\zeta _n\}_{n\ge 1}\) and \(\{\eta _n\}_{n\ge 1}\) of absolutely continuous p.m.’s such that \(\zeta _n \Rightarrow \zeta \) and \(\eta _n \Rightarrow \eta \), and denote by \(p_n\) (\(q_n\), respectively) the density of \(\zeta _n\) (\(\eta _n\), respectively). Then, denoting the limit w.r.t. weak convergence by \(\text {w-lim}\), put

$$\begin{aligned} {\mathcal {Q}}[\zeta , \eta ](\text {d}\mathbf{v}) := \mathop {\hbox {w-lim}}\limits _{n \rightarrow \infty } Q[p_n, q_n](\mathbf{v}) \text {d}\mathbf{v}\end{aligned}$$
(35)

where

$$\begin{aligned} Q[p, q](\mathbf{v}) := \int \limits _{{\mathbb {R}}^3}\int \limits _{S^2} p(\mathbf{v}_{*}) q(\mathbf{w}_{*}) b\left( \frac{\mathbf{w}- \mathbf{v}}{|\mathbf{w}- \mathbf{v}|} \cdot \varvec{\omega }\right) u_{S^2}(\text {d}\varvec{\omega })\, \text {d}\mathbf{w}. \end{aligned}$$

Note that \(Q[p, q] = Q[q, p]\), as a consequence of (2). As shown in [31], the limit in (35) exists and is independent of the choice of the approximating sequences \(\{\zeta _n\}_{n\ge 1}\) and \(\{\eta _n\}_{n\ge 1}\).
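As an independent sanity check of (33)–(34) (our own illustration, not taken from the paper), one can enumerate \({\mathbb {T}}(n)\) for small \(n\) and verify that the McKean weights \(p_n\) sum to one over all trees with \(n\) leaves, as they must, being the law of the \({\mathbb {T}}(n)\)-valued random variable \(\tau _n\):

```python
from fractions import Fraction

def trees(n):
    """All binary trees with n leaves; a leaf is None, an internal node (left, right)."""
    if n == 1:
        return [None]
    return [(l, r) for k in range(1, n)
            for l in trees(k) for r in trees(n - k)]

def leaves(t):
    return 1 if t is None else leaves(t[0]) + leaves(t[1])

def p(t):
    """McKean weight p_n(t) from recursion (33), computed exactly."""
    if t is None:
        return Fraction(1)
    n = leaves(t)
    return Fraction(1, n - 1) * p(t[0]) * p(t[1])

for n in (2, 3, 4, 5):
    print(n, sum(p(t) for t in trees(n)) == 1)  # weights sum to 1 over T(n)
```

Exact rational arithmetic via `Fraction` avoids any floating-point ambiguity; the cardinality of \({\mathbb {T}}(n)\) is the Catalan number \(C_{n-1}\), e.g. 14 trees for \(n = 5\).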

To carry on with the proof, consider the Fourier transform and apply the well-known Bobylev formula, as in [31], to get

$$\begin{aligned} \hat{\mathcal {Q}}[\zeta , \eta ](\varvec{\xi }) = \int \limits _{S^2} \hat{\zeta }((\varvec{\xi }\cdot \varvec{\omega })\varvec{\omega }) \hat{\eta }(\varvec{\xi }- (\varvec{\xi }\cdot \varvec{\omega })\varvec{\omega }) \ b\left( \frac{\varvec{\xi }}{|\varvec{\xi }|} \cdot \varvec{\omega }\right) u_{S^2}(\text {d}\varvec{\omega })\end{aligned}$$

for every \(\varvec{\xi }\) in \({\mathbb {R}}^3{\setminus }\{\mathbf{0}\}\). This, by the change of variable \(\varvec{\omega }= \varvec{\omega }(\varphi , \theta , \varvec{\xi }) = \sin \varphi \cos \theta \mathbf{a}(\mathbf{u}) + \sin \varphi \sin \theta \mathbf{b}(\mathbf{u}) + \cos \varphi \mathbf{u}\), becomes

$$\begin{aligned} \hat{\mathcal {Q}}[\zeta , \eta ](\varvec{\xi }) = \int \limits _{0}^{\pi }\int \limits _{0}^{2\pi } \hat{\zeta }(\rho \cos \varphi {\varvec{\psi }}^l) \hat{\eta }(\rho \sin \varphi {\varvec{\psi }}^r) u_{(0, 2\pi )}(\text {d}\theta ) \beta (\text {d}\varphi ) \end{aligned}$$
(36)

where \(\rho = |\varvec{\xi }|\), \(\mathbf{u}= \varvec{\xi }/|\varvec{\xi }|\) and \({\varvec{\psi }}^l\), \({\varvec{\psi }}^r\) are abbreviations for the quantities

$$\begin{aligned} \begin{array}{lll} \varvec{\psi }^l(\varphi , \theta , \mathbf{u}) &{}:= &{}{}\cos \theta \sin \varphi \mathbf{a}(\mathbf{u}) + \sin \theta \sin \varphi \mathbf{b}(\mathbf{u}) + \cos \varphi \mathbf{u}\\ \varvec{\psi }^r(\varphi , \theta , \mathbf{u}) &{}:= &{}-\cos \theta \cos \varphi \mathbf{a}(\mathbf{u}) - \sin \theta \cos \varphi \mathbf{b}(\mathbf{u}) + \sin \varphi \mathbf{u}\end{array} \end{aligned}$$
(37)

which depend on the choice of the orthonormal basis \(\{\mathbf{a}(\mathbf{u}), \mathbf{b}(\mathbf{u}), \mathbf{u}\}\) of \({\mathbb {R}}^3\). The components of this basis are exactly the columns of the matrix \(\text {B}\) introduced in (29). The inner integral in (36), that is

$$\begin{aligned} I(\varvec{\xi }, \varphi ) := \left\{ \begin{array}{l@{\quad }l} \int \nolimits _{0}^{2\pi } \hat{\zeta }(\rho \cos \varphi {\varvec{\psi }}^l) \hat{\eta }(\rho \sin \varphi {\varvec{\psi }}^r) u_{(0, 2\pi )}(\text {d}\theta ) &{} \text {if} \ \varvec{\xi }\ne \mathbf{0} \\ 1 &{} \text {if} \ \varvec{\xi }= \mathbf{0}, \end{array} \right. \end{aligned}$$
(38)

has interesting properties, which are at the basis of the new representation (17). In particular, \(I\) is a measurable function of \((\varvec{\xi }, \varphi )\) independent of the choice of \(\{\mathbf{a}(\mathbf{u}), \mathbf{b}(\mathbf{u}), \mathbf{u}\}\). Moreover, for every fixed \(\varphi \) in \([0, \pi ]\), \(I(\cdot , \varphi )\) is the Fourier transform of a p.m. on \(({\mathbb {R}}^3, {\fancyscript{B}}({\mathbb {R}}^3))\), say \({\mathcal {C}}[\zeta , \eta ; \varphi ]\), that is \(I(\varvec{\xi }, \varphi ) = \hat{\mathcal {C}}[\zeta , \eta ; \varphi ](\varvec{\xi })\) for every \(\varvec{\xi }\) in \({\mathbb {R}}^3\). The link with \({\mathcal {Q}}\) is given by

$$\begin{aligned} {\mathcal {Q}}[\zeta , \eta ](B) = \int \limits _{0}^{\pi } {\mathcal {C}}[\zeta , \eta ; \varphi ](B) \beta (\text {d}\varphi ) \end{aligned}$$
(39)

for every \(B\) in \({\fancyscript{B}}({{\mathbb {R}}}^3)\). The proof of these facts is contained in Appendix A.3. At this stage, mimicking the iteration procedure developed for \({\mathcal {Q}}\) leads to the following definition

$$\begin{aligned} \begin{array}{lll} {\mathcal {C}}_{{\mathfrak {t}}_1}[\mu _0; \emptyset ] &{}:= \mu _0 &{} {} \\ {\mathcal {C}}_{{\mathfrak {t}}_n}[\mu _0; {\varvec{\varphi }}] &{}:= {\mathcal {C}}\left[ {\mathcal {C}}_{{\mathfrak {t}}_n^l}[\mu _0; {\varvec{\varphi }}^l], {\mathcal {C}}_{{\mathfrak {t}}_n^r}[\mu _0; {\varvec{\varphi }}^r]; \varphi _{n-1}\right] &{} \ \ \ \ \text {for} \ n \ge 2 \end{array} \end{aligned}$$

for every \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\) and \({\varvec{\varphi }}\) in \([0, \pi ]^{n-1}\), with the proviso that \({\varvec{\varphi }}^l\) (\({\varvec{\varphi }}^r\), respectively) is void when \(n_l\) (\(n - n_l\), respectively) is equal to one. For every \(n \ge 2\) and \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\), the mapping \({\varvec{\varphi }} \mapsto {\mathcal {C}}_{{\mathfrak {t}}_n}[\mu _0; {\varvec{\varphi }}]\) is a random p.m. and

$$\begin{aligned} {\mathcal {Q}}_{{\mathfrak {t}}_n}[\mu _0](B) = \int \limits _{[0, \pi ]^{n-1}} {\mathcal {C}}_{{\mathfrak {t}}_n}[\mu _0; {\varvec{\varphi }}](B) \beta ^{\otimes _{n-1}}(\text {d}{\varvec{\varphi }}) \end{aligned}$$
(40)

holds true for every \(B\) in \({\fancyscript{B}}({{\mathbb {R}}}^3)\), as proved in Appendix A.4. In view of this link, the Wild-McKean sum can be re-written as

$$\begin{aligned} e^{-t}\mu _0(B) + \sum _{n = 2}^{+\infty } e^{-t} (1 - e^{-t})^{n-1} \sum _{{\mathfrak {t}}_n\in {\mathbb {T}}(n)} p_n({\mathfrak {t}}_n) \int \limits _{[0, \pi ]^{n-1}} {\mathcal {C}}_{{\mathfrak {t}}_n}[\mu _0; {\varvec{\varphi }}](B) \beta ^{\otimes _{n-1}}(\text {d}{\varvec{\varphi }}) \end{aligned}$$

which coincides with \({\mathsf{E}}_t\left[ {\mathcal {C}}_{\tau _{\nu }}[\mu _0; (\phi _1, \dots , \phi _{\nu - 1})](B)\right] \). Therefore, to show the validity of (17), it is enough to verify that \({\mathcal {M}}(B) = {\mathcal {C}}_{\tau _{\nu }}[\mu _0; (\phi _1, \dots , \phi _{\nu - 1})](B)\) for every \(B\) in \({\fancyscript{B}}({{\mathbb {R}}}^3)\) or, equivalently, that

$$\begin{aligned} \hat{\mathcal {C}}_{{\mathfrak {t}}_n}[\mu _0; {\varvec{\varphi }}](\varvec{\xi })&= \int \limits _{(0, 2\pi )^{n-1}} \left[ \prod _{j=1}^{n} \hat{\mu }_0\left( \rho \pi _{j, n}^{*}({\mathfrak {t}}_n, {\varvec{\varphi }}) \mathbf{q}_{j, n}({\mathfrak {t}}_n, {\varvec{\varphi }}, {\varvec{\theta }}, \mathbf{u}) \right) \right] u_{(0, 2\pi )}^{\otimes _{n-1}}(\text {d}{\varvec{\theta }}) \quad \quad \end{aligned}$$
(41)
$$\begin{aligned}&= \int \limits _{(0, 2\pi )^{n-1}}\! \left[ \prod _{j=1}^{n} \hat{\mu }_0\left( \rho \pi _{j, n}^{*}({\mathfrak {t}}_n, {\varvec{\varphi }}) \text {B}(\mathbf{u}) \text {O}_{j, n}^{*}({\mathfrak {t}}_n, {\varvec{\varphi }}, {\varvec{\theta }}) \mathbf{e}_3 \right) \!\right] \! u_{(0, 2\pi )}^{\otimes _{n-1}}(\text {d}{\varvec{\theta }}) \quad \quad \nonumber \\ \end{aligned}$$
(42)

hold true for every \(n \ge 2\), \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\), \({\varvec{\varphi }}\) in \([0, \pi ]^{n-1}\) and \(\varvec{\xi }\ne \mathbf{0}\). The \(\mathbf{q}_{j, n}\)’s are defined inductively starting from \(\mathbf{q}_{1, 1}({\mathfrak {t}}_1, \emptyset , \emptyset , \mathbf{u}) := \mathbf{u}\) and then putting

$$\begin{aligned}&\mathbf{q}_{j, n}({\mathfrak {t}}_n, {\varvec{\varphi }}, {\varvec{\theta }}, \mathbf{u})\nonumber \\&\qquad = \left\{ \begin{array}{l@{\quad }l} \mathbf{q}_{j, n_l}({\mathfrak {t}}_n^l, {\varvec{\varphi }}^l, {\varvec{\theta }}^l, \varvec{\psi }^l(\varphi _{n-1}, \theta _{n-1}, \mathbf{u})) &{} \text {for} \ j = 1, \dots , n_l \\ \mathbf{q}_{j - n_l, n_r}({\mathfrak {t}}_n^r, {\varvec{\varphi }}^r, {\varvec{\theta }}^r, \varvec{\psi }^r(\varphi _{n-1}, \theta _{n-1}, \mathbf{u})) &{} \text {for} \ j \!=\! n_l + 1, \dots , n \end{array} \right. \end{aligned}$$
(43)

for every \(n \ge 2\), \({\mathfrak {t}}_n\) in \({\mathbb {T}}(n)\), \({\varvec{\varphi }}\) in \([0, \pi ]^{n-1}\) and \({\varvec{\theta }}\) in \((0, 2\pi )^{n-1}\).

To prove (41), first consider the case when \(n = 2\) and observe that \(\pi _{1, 2}^{*} = \cos \varphi _1\), \(\pi _{2, 2}^{*} = \sin \varphi _1\), \(\mathbf{q}_{1, 2} = \varvec{\psi }^l\), \(\mathbf{q}_{2, 2} = \varvec{\psi }^r\). Then, (41) reduces to (38) with \(\zeta = \eta = \mu _0\). Next, by mathematical induction, assume \(n \ge 3\) and combine (38) with the definition of \({\mathcal {C}}_{{\mathfrak {t}}_n}\) to write

$$\begin{aligned} \hat{\mathcal {C}}_{{\mathfrak {t}}_n}[\mu _0; {\varvec{\varphi }}]({\varvec{\xi }})&= \int \limits _{0}^{2\pi } \hat{\mathcal {C}}_{{\mathfrak {t}}_n^l}[\mu _0; {\varvec{\varphi }}^l](\rho \cos \varphi _{n-1} {\varvec{\psi }}^l(\varphi _{n-1}, \theta _{n-1}, \mathbf{u})) \nonumber \\&\times \, \hat{\mathcal {C}}_{{\mathfrak {t}}_n^r}[\mu _0; {\varvec{\varphi }}^r](\rho \sin \varphi _{n-1} {\varvec{\psi }}^r(\varphi _{n-1}, \theta _{n-1}, \mathbf{u})) u_{(0, 2\pi )}(\text {d}\theta _{n-1}). \qquad \quad \end{aligned}$$
(44)

Thus, assuming that (41) holds true for every \(m\) in \(\{1, \dots , n-1\}\) and every tree \({\mathfrak {t}}_{m}\) in \({\mathbb {T}}(m)\), deduce

$$\begin{aligned} \hat{\mathcal {C}}_{{\mathfrak {t}}_n^s}[\mu _0; {\varvec{\varphi }}^s](x \varvec{\psi }^s(\varphi _{n-1}, \theta _{n-1}, \mathbf{u}))&= \int \limits _{(0, 2\pi )^{n_s - 1}} \left\{ \prod _{j=1}^{n_s} \hat{\mu }_0 \left[ x \pi _{j, n_s}^{*}({\mathfrak {t}}_n^s, {\varvec{\varphi }}^s) \right. \right. \\&\left. \left. \times \, \mathbf{q}_{j, n_s}({\mathfrak {t}}_n^s, {\varvec{\varphi }}^s, {\varvec{\theta }}^s, {\varvec{\psi }}^s(\varphi _{n-1}, \theta _{n-1}, \mathbf{u}))\right] \right\} u_{(0, 2\pi )}^{\otimes _{n_s - 1}}(\text {d}{\varvec{\theta }}^s) \end{aligned}$$

where \((s, x)\) is \((l, \rho \cos \varphi _{n-1})\) or \((r, \rho \sin \varphi _{n-1})\). To complete the argument, combine the last two equalities with (22) and (43).

As far as the proof of (42) is concerned, start by noting that \(\mathbf{q}_{j, 2}({\mathfrak {t}}_2, \varphi , \theta , \mathbf{u})\) equals \(\text {B}(\mathbf{u})\text {O}_{j, 2}^{*}({\mathfrak {t}}_2, \varphi , \theta ) \mathbf{e}_3\) for \(j = 1, 2\), for every \(\varphi \) in \([0, \pi ]\), \(\theta \) in \((0, 2\pi )\) and \(\mathbf{u}\) in \(S^2\), provided that the basis \(\{\mathbf{a}(\mathbf{u}), \mathbf{b}(\mathbf{u}), \mathbf{u}\}\) in (37) is formed by the three columns of \(\text {B}(\mathbf{u})\). Then, assume \(n \ge 3\) and argue by induction, starting from (41) and definitions (27) and (43). Whence,

$$\begin{aligned}&\int \limits _{(0, 2\pi )^{n-1}} \left[ \prod _{j=1}^{n} \hat{\mu }_0\left( \rho \pi _{j, n}^{*}({\mathfrak {t}}_n, {\varvec{\varphi }}) \mathbf{q}_{j, n}({\mathfrak {t}}_n, {\varvec{\varphi }}, {\varvec{\theta }}, \mathbf{u})\right) \right] u_{(0, 2\pi )}^{\otimes _{n-1}}(\text {d}\varvec{\theta })\nonumber \\&\quad = \int \limits _{0}^{2\pi } \! \int \limits _{(0, 2\pi )^{n_l-1}} \int \limits _{(0, 2\pi )^{n_r-1}} P_{j, n}^{l} P_{j, n}^{r} u_{(0, 2\pi )}^{\otimes _{n_r - 1}}(\text {d}\varvec{\theta }^r) u_{(0, 2\pi )}^{\otimes _{n_l - 1}}(\text {d}\varvec{\theta }^l) u_{(0, 2\pi )}(\text {d}\theta _{n-1})\nonumber \\ \end{aligned}$$
(45)

where

$$\begin{aligned} P_{j, n}^{l} := \prod _{j=1}^{n_l} \hat{\mu }_0\left( \rho \pi _{j, n}^{*}({\mathfrak {t}}_n, \varvec{\varphi }) \text {B}(\varvec{\psi }^l(\varphi _{n-1}, \theta _{n-1}, \mathbf{u})) \text {O}_{j, n_l}^{*}({\mathfrak {t}}_n^l, \varvec{\varphi }^l, \varvec{\theta }^l) \mathbf{e}_3 \right) \end{aligned}$$

and

$$\begin{aligned} P_{j, n}^{r} := \prod _{j=n_l + 1}^{n} \hat{\mu }_0\left( \rho \pi _{j, n}^{*}({\mathfrak {t}}_n, \varvec{\varphi }) \text {B}(\varvec{\psi }^r(\varphi _{n-1}, \theta _{n-1}, \mathbf{u})) \text {O}_{j-n_l, n_r}^{*}({\mathfrak {t}}_n^r, \varvec{\varphi }^r, \varvec{\theta }^r) \mathbf{e}_3 \right) . \end{aligned}$$

For the sake of clarity, the integral \(\int _{(0, 2\pi )^{n_l-1}}\) (\(\int _{(0, 2\pi )^{n_r-1}}\), respectively) in (45) should be omitted when \(n_l = 1\) (\(n_r = 1\), respectively), since \(\varvec{\theta }^l\) (\(\varvec{\theta }^r\), respectively) is then void. At this stage, it will be proved that

$$\begin{aligned} \int \limits _{(0, 2\pi )^{n_l-1}} P_{j, n}^{l} u_{(0, 2\pi )}^{\otimes _{n_l - 1}}(\text {d}\varvec{\theta }^l) \!&= \! \int \limits _{(0, 2\pi )^{n_l-1}} \left[ \prod _{j=1}^{n_l} \hat{\mu }_0\left( \rho \pi _{j, n}^{*}({\mathfrak {t}}_n, \varvec{\varphi }) \text {B}(\mathbf{u}) \text {O}_{j, n}^{*}({\mathfrak {t}}_n, \varvec{\varphi }, \varvec{\theta }) \mathbf{e}_3 \right) \right] \nonumber \\&u_{(0, 2\pi )}^{\otimes _{n_l - 1}}(\text {d}\varvec{\theta }^l) \!\! \end{aligned}$$
(46)

holds for every \(\rho \) in \({\mathbb {R}}\), \(\mathbf{u}\) in \(S^2\), \(\varvec{\varphi }\) in \([0, \pi ]^{n-1}\) and \(\theta _{n-1}\) in \((0, 2\pi )\). If \(n_l = 1\), the proof of (46) reduces to verifying that

$$\begin{aligned} \text {B}(\varvec{\psi }^l(\varphi _{n-1}, \theta _{n-1}, \mathbf{u})) \mathbf{e}_3 = \varvec{\psi }^l(\varphi _{n-1}, \theta _{n-1}, \mathbf{u}) = \text {B}(\mathbf{u}) \text {M}^l(\varphi _{n-1}, \theta _{n-1}) \mathbf{e}_3. \end{aligned}$$

To proceed, observe that the third column of \(\text {B}(\varvec{\psi }^l(\varphi _{n-1}, \theta _{n-1}, \mathbf{u}))\) is the same as that of \(\text {B}(\mathbf{u}) \text {M}^l(\varphi _{n-1}, \theta _{n-1})\); hence, there exists an orthogonal matrix

$$\begin{aligned} \text {R}(\alpha ) := \left( \begin{array}{c@{\quad }c@{\quad }c} \cos \alpha &{} -\sin \alpha &{} 0 \\ \sin \alpha &{} \cos \alpha &{} 0 \\ 0 &{} 0 &{} 1 \\ \end{array} \right) \end{aligned}$$

for which \(\text {B}(\varvec{\psi }^l(\varphi _{n-1}, \theta _{n-1}, \mathbf{u})) = \text {B}(\mathbf{u}) \text {M}^l(\varphi _{n-1}, \theta _{n-1}) \text {R}(\alpha )\), where \(\alpha \) depends only on \(\varphi _{n-1}\), \(\theta _{n-1}\) and \(\mathbf{u}\). Now, note that \(\text {R}(\alpha ) \text {M}^s(\varphi , \theta ) = \text {M}^s(\varphi , \theta + \alpha )\) holds for \(s = l, r\) and for every \(\varphi \) and \(\theta \). Then, when \(n_l \ge 2\), consider the definition of \(P_{j,n}^{l}\), recall (28) and take into account that the product \(\text {R}(\alpha )\text {M}^{\epsilon _{1}({\mathfrak {t}}_n^l, j)}(\varphi _{n_l-1}, \theta _{n_l-1})\) equals \(\text {M}^{\epsilon _{1}({\mathfrak {t}}_n^l, j)}(\varphi _{n_l-1}, \theta _{n_l-1} + \alpha )\). The change of variable \(\theta _{n_l-1}^{'} = \theta _{n_l-1} + \alpha \) transforms the LHS of (46) into

$$\begin{aligned} \int \limits _{(0, 2\pi )^{n_l-1}} \left[ \prod _{j=1}^{n_l} \hat{\mu }_0\left( \rho \pi _{j, n}^{*}({\mathfrak {t}}_n, \varvec{\varphi }) \text {B}(\mathbf{u}) \text {M}^l(\varphi _{n-1}, \theta _{n-1}) \text {O}_{j, n_l}^{*}({\mathfrak {t}}_n^l, \varvec{\varphi }^l, \varvec{\theta }^l) \mathbf{e}_3 \right) \right] u_{(0, 2\pi )}^{\otimes _{n_l - 1}}(\text {d}\varvec{\theta }^l) \end{aligned}$$

which, in view of (27), turns out to be the same as the RHS of (46). The proof of (42) is completed using (45), after noting that an equality similar to (46) can be stated by changing subscripts and superscripts from \(l\) to \(r\), and replacing \(\text {O}_{j, n}^{*}\) with \(\text {O}_{j + n_l, n}^{*}\).
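The matrix identity \(\text {R}(\alpha ) \text {M}^s(\varphi , \theta ) = \text {M}^s(\varphi , \theta + \alpha )\), on which the change of variable above rests, can also be checked numerically; a minimal sketch of ours in plain Python:

```python
import math

def M(kind, phi, th):
    """The matrices M^l and M^r of Sect. 1.5; kind is 'l' or 'r'."""
    ct, st = math.cos(th), math.sin(th)
    cf, sf = math.cos(phi), math.sin(phi)
    if kind == 'l':
        return [[-ct * cf, st, ct * sf], [-st * cf, -ct, st * sf], [sf, 0.0, cf]]
    return [[st, ct * sf, -ct * cf], [-ct, st * sf, -st * cf], [0.0, cf, sf]]

def R(a):
    """The planar rotation R(alpha) about e3."""
    ca, sa = math.cos(a), math.sin(a)
    return [[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

phi, th, alpha = 0.7, 1.9, 0.6
for kind in ('l', 'r'):
    lhs = matmul(R(alpha), M(kind, phi, th))
    rhs = M(kind, phi, th + alpha)
    print(kind, all(abs(lhs[i][j] - rhs[i][j]) < 1e-12
                    for i in range(3) for j in range(3)))
```

The identity holds because the third rows of \(\text {M}^l\) and \(\text {M}^r\) do not involve \(\theta \), while \(\text {R}(\alpha )\) acts on the first two rows exactly as the shift \(\theta \mapsto \theta + \alpha \).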

Finally, the invariance of \({\mathcal {M}}\) w.r.t. \(\text {B}\) is equivalent to the invariance of representation (42) when \(\text {B}(\mathbf{u})\) is replaced by any matrix \(\text {B}^{'}(\mathbf{u})\) having the same characteristics as \(\text {B}(\mathbf{u})\). This equivalence follows from the above reasoning.

2.2 Proof of Theorem 1

In the first place, we recall that the entire proof will be developed under hypotheses (2)–(3) on \(b\), in view of Remark 1 in Sect. 1.4. Then, we set a few conditions on \(\mu _0\) to simplify a number of arguments without loss of generality. In this sense, we make use of (6) to assume, from now on,

$$\begin{aligned} \int \limits _{{\mathbb {R}}^3}\mathbf{v}\mu _0(\text {d}\mathbf{v}) = \mathbf{0} \quad \text {and} \quad \int \limits _{{\mathbb {R}}^3}|\mathbf{v}|^2 \mu _0(\text {d}\mathbf{v}) = 3 \end{aligned}$$
(47)

implying that the limiting Maxwellian is \(\gamma := \gamma _{\mathbf{0}, 1}\). We also assume that the covariance matrix \(V = V[\mu _0]\) of \(\mu _0\) is diagonal. In fact, for any covariance matrix \(V\) there is an orthogonal matrix \(Q\) such that \(Q V Q^t\) is diagonal, so that \(\mu _0 \circ f_{Q}^{-1}\) has a diagonal covariance matrix, \(f_Q\) standing for the function \(\mathbf{x}\mapsto Q\mathbf{x}\). At this stage, since \(\text {d}_{\text {TV}}( \mu (\cdot , t) \circ f_{Q}^{-1}, \gamma )\) is equal to \(\text {d}_{\text {TV}}( \mu (\cdot , t), \gamma )\) for every \(t\), we can prove (16) by taking \(\mu _0 \circ f_{Q}^{-1}\) as initial distribution. Compare [31] for a more detailed explanation. Hence, we suppose that

$$\begin{aligned} \begin{array}{l} \displaystyle \int \limits _{{\mathbb {R}}^3}v_{i}^{2} \mu _0(\text {d}\mathbf{v}) = \sigma _{i}^{2} \ \ \ (i = 1, 2, 3) \\ \displaystyle \int \limits _{{\mathbb {R}}^3}v_i v_j \mu _0(\text {d}\mathbf{v}) = 0 \ \ \ (1 \le i < j \le 3) \\ \displaystyle \sigma _{1}^{2} + \sigma _{2}^{2} + \sigma _{3}^{2} = 3 {} \end{array} \end{aligned}$$
(48)

are in force. In fact, the extra conditions (47)–(48) yield the following

Proposition 4

Let \(\mu _0\) satisfy (47)–(48) in addition to the hypotheses of Theorem 1. Then, there exists a constant \(\lambda \) such that

$$\begin{aligned} |\hat{\mu }_0(\varvec{\xi })| \le \left( \frac{\lambda ^2}{\lambda ^2 + |\varvec{\xi }|^2} \right) ^q \end{aligned}$$
(49)

is valid for every \(\varvec{\xi }\) in \({\mathbb {R}}^3\), with \(q = 1/(2 \lceil 2/p \rceil )\).

Here, \(\lceil x \rceil \) indicates the least integer not less than \(x\), while \(p\) is the same as in (15). As to the numerical evaluation of \(\lambda \), the reader is referred to the proof of the proposition in Appendix A.5.

As a first step of the actual proof, an application of (17) yields

$$\begin{aligned} \text {d}_{\text {TV}}(\mu (\cdot , t), \gamma ) = \sup _{B \in {\fancyscript{B}}({{\mathbb {R}}}^3)} \left| {\mathsf{E}}_t[{\mathcal {M}}(B)] - \gamma (B) \right| \le {\mathsf{E}}_t[\text {d}_{\text {TV}}({\mathcal {M}}, \gamma )]. \end{aligned}$$

After introducing the random number

$$\begin{aligned} \text {W}:= \sum _{j = 1}^{\nu } \pi _{j, \nu }^4 \end{aligned}$$
(50)

we put

$$\begin{aligned} r := 11 \lceil 2/p \rceil \quad \text {and}\quad a_{*}:= (2^r r!)^{-1} \end{aligned}$$
(51)

to define the partition \(\{U, U^c\}\) of \(\Omega \) by

$$\begin{aligned} U := \{\nu \le r\} \cup \left\{ \prod _{j=1}^{\nu } \pi _{j, \nu } = 0\right\} \cup \{\text {W}\ge a_{*}\}. \end{aligned}$$

This can be used to write

$$\begin{aligned} {\mathsf{E}}_t[\text {d}_{\text {TV}}({\mathcal {M}}, \gamma )] = {\mathsf{E}}_t[\text {d}_{\text {TV}}({\mathcal {M}}, \gamma ); U] + {\mathsf{E}}_t[\text {d}_{\text {TV}}({\mathcal {M}}, \gamma ); U^c] \end{aligned}$$
(52)

where \({\mathsf{E}}_t[X; S]\) denotes \(\int _{S} X \text {d}{\mathsf{P}}_t\). The former summand on the right of (52) will be bounded by utilizing the fact that \(U\) has “asymptotically small” probability. As to the latter, it will be shown that \({\mathcal {M}}(\cdot ; \omega )\) has nice analytical properties for each \(\omega \) in \(U^c\), so that a proper bound will be derived from these very same properties. In fact, as \(\text {d}_{\text {TV}}({\mathcal {M}}, \gamma ) \le 1\) entails \({\mathsf{E}}_t[\text {d}_{\text {TV}}({\mathcal {M}}, \gamma ); U] \le {\mathsf{P}}_t(U)\), we get

$$\begin{aligned} {\mathsf{P}}_t(U)&\le {\mathsf{P}}_t\{\nu \le r\} + {\mathsf{P}}_t\left\{ \prod _{j=1}^{\nu } \pi _{j, \nu } = 0\right\} + {\mathsf{P}}_t\{\text {W}\ge a_{*}\} \\&\le r e^{-t} + {\mathsf{P}}_t\{\text {W}\ge a_{*}\}. \end{aligned}$$

The inequality \({\mathsf{P}}_t\{\nu \le r\} \le r e^{-t}\) follows from (18), while \({\mathsf{P}}_t\{\prod _{j=1}^{\nu } \pi _{j, \nu } = 0\}\) equals zero since \({\mathsf{P}}_t\{\prod _{j=1}^{\nu } \pi _{j, \nu } = 0 \ | \ \nu , \tau _{\nu }\} = 0\). This claim is obvious on \(\{\nu = 1\}\) while, on \(\{\nu \ge 2\}\),

$$\begin{aligned} {\mathsf{P}}_t\left\{ \prod _{j=1}^{\nu } \pi _{j, \nu } = 0 \ | \ \nu , \tau _{\nu }\right\} \le \sum _{j = 1}^{\nu - 1} {\mathsf{P}}_t\{\phi _j \in \{0, \pi /2, \pi \}\} \end{aligned}$$

and the RHS is equal to zero since each \(\phi _j\) has an absolutely continuous law. To complete the evaluation of \({\mathsf{P}}_t(U)\), it is enough to combine the Markov inequality with (24)–(25) to get \({\mathsf{P}}_t\{\text {W}\ge a_{*}\} \le (1/a_{*}) \ {\mathsf{E}}_t[\text {W}] = (2^r r!) \ e^{\varLambda _b t}\). Whence,

$$\begin{aligned} {\mathsf{E}}_t[\text {d}_{\text {TV}}({\mathcal {M}}, \gamma ); U] \le (r + 2^r r!) e^{\varLambda _b t}. \end{aligned}$$
(53)
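As an aside, the first bound used above can be checked numerically. The sketch below assumes that \(\nu \) has the geometric law \({\mathsf{P}}_t\{\nu = n\} = e^{-t}(1 - e^{-t})^{n - 1}\), \(n \ge 1\), as in Wild-sum-type representations; the precise law is the one stated in (18) and the geometric form is an assumption here. Under it, each summand of \(\sum _{n = 1}^{r} {\mathsf{P}}_t\{\nu = n\}\) is at most \(e^{-t}\), whence \({\mathsf{P}}_t\{\nu \le r\} \le r e^{-t}\).

```python
import math

def p_nu(n, t):
    # Assumed geometric law for the number nu of summands:
    # P_t{nu = n} = e^{-t} (1 - e^{-t})^{n-1}, n >= 1.
    return math.exp(-t) * (1.0 - math.exp(-t)) ** (n - 1)

def p_nu_le(r, t):
    # P_t{nu <= r}, summed exactly.
    return sum(p_nu(n, t) for n in range(1, r + 1))

# Each summand is at most e^{-t}, whence P_t{nu <= r} <= r e^{-t}.
for t in (0.5, 2.0, 10.0):
    for r in (1, 5, 22):
        assert p_nu_le(r, t) <= r * math.exp(-t) + 1e-15
```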

The argument to deduce a bound for the expectation over \(U^c\) occupies the rest of this subsection. It is based on the following multidimensional extension of a result by Beurling [6].

Proposition 5

Let \(\chi \) be a finite signed measure on \(({\mathbb {R}}^3, {\fancyscript{B}}({\mathbb {R}}^3))\) such that \(\int _{{\mathbb {R}}^3}|\mathbf{x}|^2 |\chi |({\mathrm {d}}\mathbf{x}) < +\infty \), \(|\chi |\) standing for the total variation of \(\chi \). Then,

$$\begin{aligned} \sup _{B \in {\fancyscript{B}}({{\mathbb {R}}}^3)} |\chi (B)| \le 2^{-5/4}\pi ^{-1/2} \left( \,\,\int \limits _{{\mathbb {R}}^3} [|\hat{\chi }(\varvec{\xi })|^2 + |\varDelta _{\varvec{\xi }} \hat{\chi }(\varvec{\xi })|^2]\, {\mathrm {d}} \varvec{\xi } \right) ^{1/2} \end{aligned}$$

where \(\varDelta _{\varvec{\xi }}\) denotes the Laplacian operator.

The proof is deferred to Appendix A.6. The applicability of this proposition to \(\chi = {\mathcal {M}}- \gamma \) is made possible by

Proposition 6

If (47) holds and

$$\begin{aligned} {\mathfrak {m}}_h := \int \limits _{{\mathbb {R}}^3}|\mathbf{v}|^h \mu _0({\mathrm {d}} \mathbf{v}) < +\infty \end{aligned}$$
(54)

for \(h = 1, \dots , 2k\) and some integer \(k \ge 2\), then there are positive constants \(g_h\) depending on \(\mu _0\) only through \({\mathfrak {m}}_h\), such that

$$\begin{aligned} \sup _{\mathbf{u}\in S^2}{\mathsf{E}}_t\left[ |S(\mathbf{u})|^h \ | \ {\fancyscript{H}} \right] \le g_h \end{aligned}$$
(55)

for \(h = 1, \dots , 2k\) and any choice of \({\mathrm {B}}\), \({\mathsf{P}}_t\)-almost surely. Moreover, \(\rho \mapsto \frac{\partial ^h}{\partial \rho ^h} \hat{{\mathcal {M}}}(\rho \mathbf{u})\) exists for every \(\mathbf{u}\) in \(S^2\) and

$$\begin{aligned} \sup _{(\rho , \mathbf{u}) \in [0, +\infty ) \times S^2}\left| \frac{\partial ^h}{\partial \rho ^h} \hat{{\mathcal {M}}}(\rho \mathbf{u})\right| \le g_h \end{aligned}$$
(56)

\({\mathsf{P}}_t\)-almost surely with \(h = 1, \dots , 2k\), which entails

$$\begin{aligned} \int \limits _{{\mathbb {R}}^3}|\mathbf{v}|^{2k} {\mathcal {M}}({\mathrm {d}}\mathbf{v}) < +\infty \end{aligned}$$
(57)

\({\mathsf{P}}_t\)-almost surely and \(\varvec{\xi }\mapsto \hat{{\mathcal {M}}}(\varvec{\xi }) \in \text {C}^{2k}({\mathbb {R}}^3)\).

See Appendix A.7 for the proof and the numerical evaluation of the constants \(g_h\). At this stage, Proposition 5 yields

$$\begin{aligned}&{\mathsf{E}}_t[\text {d}_{\text {TV}}({\mathcal {M}}, \gamma ); U^c] \le 2^{-5/4}\pi ^{-1/2} \nonumber \\&\quad \times \, {\mathsf{E}}_t\left[ \left( \,\,\int \limits _{{\mathbb {R}}^3}\left| \hat{\mathcal {M}}(\varvec{\xi }) - e^{-|\varvec{\xi }|^2/2}\right| ^2 \text {d} \varvec{\xi }+ \int \limits _{{\mathbb {R}}^3}\left| \varDelta _{\varvec{\xi }}[\hat{\mathcal {M}}(\varvec{\xi }) - e^{-|\varvec{\xi }|^2/2}]\right| ^2 \text {d} \varvec{\xi }\right) ^{1/2}; U^c\right] . \nonumber \\ \end{aligned}$$
(58)

To evaluate the integrals on the RHS, we change the variables according to the isometry \(i : {\mathbb {R}}^3{\setminus }\{\mathbf{0}\} \rightarrow (0, +\infty ) \times S^2\) defined by \(i : \varvec{\xi }\mapsto (|\varvec{\xi }|, \varvec{\xi }/|\varvec{\xi }|)\). In view of Theorem 3.11, Example 3.23 and Lemma 3.27 in [41], denoting the \(d\)-dimensional Lebesgue measure by \({\fancyscript{L}}^d\), integrals w.r.t. \({\fancyscript{L}}^3(\text {d}\varvec{\xi })\) become integrals w.r.t. \(4\pi \rho ^2 {\fancyscript{L}}^1 \otimes u_{S^2}(\text {d}\rho \text {d}\mathbf{u})\) and the standard Laplacian \(\varDelta _{\varvec{\xi }}\) changes into \(\varDelta _{(\rho , \mathbf{u})} := \frac{\partial ^2}{\partial \rho ^2} + \frac{2}{\rho }\frac{\partial }{\partial \rho } + \frac{1}{\rho ^2}\varDelta _{S^2}\), where \(\varDelta _{S^2}\) stands for the Laplace–Beltrami operator on \(S^2\). Now, from \(|z_1 + z_2 + z_3|^2 \le 3(|z_1|^2 + |z_2|^2 + |z_3|^2)\), we write

$$\begin{aligned} \left| \varDelta _{(\rho , \mathbf{u})}[\hat{{\mathcal {M}}}(\rho \mathbf{u}) - e^{-\rho ^2/2}] \right| ^2&\le 3 \ \left| \frac{\partial ^2}{\partial \rho ^2} [\hat{{\mathcal {M}}}(\rho \mathbf{u}) - e^{-\rho ^2/2}] \right| ^2\\&+ \frac{12}{\rho ^2} \ \left| \frac{\partial }{\partial \rho } [\hat{{\mathcal {M}}}(\rho \mathbf{u}) - e^{-\rho ^2/2}] \right| ^2 + \frac{3}{\rho ^4} \ \left| \varDelta _{S^2} \hat{{\mathcal {M}}}(\rho \mathbf{u}) \right| ^2 \end{aligned}$$

and then we define the random functions

$$\begin{aligned} \text {I}_1(\rho , \mathbf{u})&:= \left| \hat{{\mathcal {M}}}(\rho \mathbf{u}) - e^{-\rho ^2/2} \right| ^2 + 3 \ \left| \frac{\partial ^2}{\partial \rho ^2} [\hat{{\mathcal {M}}}(\rho \mathbf{u}) - e^{-\rho ^2/2}]\right| ^2 \\&+ \frac{12}{\rho ^2} \ \left| \frac{\partial }{\partial \rho } [\hat{{\mathcal {M}}}(\rho \mathbf{u}) - e^{-\rho ^2/2}]\right| ^2 \\ \text {I}_2(\rho , \mathbf{u})&:= \frac{3}{\rho ^4} \ \left| \varDelta _{S^2} \hat{{\mathcal {M}}}(\rho \mathbf{u})\right| ^2. \end{aligned}$$
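Before proceeding, the change of variables can be sanity-checked on the Gaussian term: for a radial function such as \(e^{-|\varvec{\xi }|^2/2}\) the Laplace–Beltrami contribution vanishes, so \(\varDelta _{(\rho , \mathbf{u})}\) reduces to \(\frac{\partial ^2}{\partial \rho ^2} + \frac{2}{\rho }\frac{\partial }{\partial \rho }\), which gives \((\rho ^2 - 3)e^{-\rho ^2/2}\). A quick finite-difference check in Python (the test point is arbitrary):

```python
import math

def f(x, y, z):
    # The Gaussian c.f. e^{-|xi|^2 / 2}, a radial function.
    return math.exp(-(x * x + y * y + z * z) / 2.0)

def laplacian_fd(x, y, z, h=1e-3):
    # Central finite-difference approximation of the Cartesian Laplacian.
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6.0 * f(x, y, z)) / (h * h)

x, y, z = 0.3, 0.5, 0.7        # arbitrary test point
rho2 = x * x + y * y + z * z
# Radial form: (d^2/drho^2 + (2/rho) d/drho) e^{-rho^2/2} = (rho^2 - 3) e^{-rho^2/2};
# the Laplace-Beltrami term drops out for radial functions.
radial = (rho2 - 3.0) * math.exp(-rho2 / 2.0)
assert abs(laplacian_fd(x, y, z) - radial) < 1e-5
```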

Hence, for the sum of the two integrals on the RHS of (58) we obtain

$$\begin{aligned}&\int \limits _{{\mathbb {R}}^3}\left| \hat{\mathcal {M}}(\varvec{\xi }) - e^{-|\varvec{\xi }|^2/2}\right| ^2 \text {d} \varvec{\xi }+ \int \limits _{{\mathbb {R}}^3}\left| \varDelta _{\varvec{\xi }}[\hat{\mathcal {M}}(\varvec{\xi }) - e^{-|\varvec{\xi }|^2/2}]\right| ^2 \text {d} \varvec{\xi }\nonumber \\&\quad \le 4\pi \int \limits _{(0, \text {R}] \times S^2} \left( \text {I}_1(\rho , \mathbf{u})+ \text {I}_2(\rho , \mathbf{u})\right) \rho ^2{\fancyscript{L}}^1 \otimes u_{S^2}(\text {d}\rho \text {d}\mathbf{u}) \nonumber \\&\qquad +\, 4\pi \int \limits _{(\text {R}, +\infty ) \times S^2} \left( \text {I}_1(\rho , \mathbf{u})+ \text {I}_2(\rho , \mathbf{u})\right) \rho ^2{\fancyscript{L}}^1 \otimes u_{S^2}(\text {d}\rho \text {d}\mathbf{u}) \end{aligned}$$
(59)

where

$$\begin{aligned} \text {R}:= \frac{1}{2} \left( \frac{1}{{\mathfrak {m}}_4\text {W}}\right) ^{1/4}. \end{aligned}$$
(60)
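For later use, observe that (60) converts negative powers of \(\text {R}\) into powers of \(\text {W}\): since \(\text {W}\le 1\),

$$\begin{aligned} \text {R}^{-\alpha } = 2^{\alpha } {\mathfrak {m}}_4^{\alpha /4} \text {W}^{\alpha /4} \le 2^{\alpha } {\mathfrak {m}}_4^{\alpha /4} \text {W}^{2} \quad \text {for every } \alpha \ge 8. \end{aligned}$$

This is the mechanism by which the bounds obtained below, expressed as sums of powers of \(\text {R}\) of order at most \(-8\), are turned into multiples of \(\text {W}^2\) in (73) and (79).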

In the following sub-subsections we analyze the integrals appearing in (59), calling inner (outer, respectively) any integral on \((0, \text {R}] \times S^2\) (\((\text {R}, +\infty ) \times S^2\), respectively).

2.2.1 Outer integral of  \(\mathrm {I}_1(\rho , \mathbf{u})\)

An application of the inequality \(|z_1 + z_2|^2 \le 2|z_1|^2 + 2|z_2|^2\) yields

$$\begin{aligned} \text {I}_1(\rho , \mathbf{u})&\le 2 |\hat{{\mathcal {M}}}(\rho \mathbf{u})|^2 + 2e^{-\rho ^2} + 6 \left| \frac{\partial ^2}{\partial \rho ^2} \hat{{\mathcal {M}}}(\rho \mathbf{u})\right| ^2 + 6 \left| \frac{\text {d}^2}{\text {d}\rho ^2} (e^{-\rho ^2/2})\right| ^2 \nonumber \\&+\, \frac{24}{\rho ^2} \left| \frac{\partial }{\partial \rho } \hat{{\mathcal {M}}}(\rho \mathbf{u})\right| ^2 + \frac{24}{\rho ^2} \left| \frac{\text {d}}{\text {d}\rho } (e^{-\rho ^2/2})\right| ^2 \end{aligned}$$
(61)

and a first proposition is stated to analyze the summands which contain the Gaussian c.f.

Proposition 7

Let \(m\) and \(s\) be real numbers with \(m \ge 0\) and \(s \ge 1\), and let \(k\) belong to \({\mathbb {N}}_0\). Then, there exists a positive constant \(c(m, s, k)\) such that

$$\begin{aligned} \int \limits _{x}^{+\infty } \left( \frac{{\mathrm {d}}^k}{{\mathrm {d}} \rho ^k} (e^{-\rho ^2/2})\right) ^2 \rho ^m {\mathrm {d}} \rho \le c(m, s, k) x^{-s} \end{aligned}$$

holds for every \(x > 0\).

See Appendix A.8 for the proof and an evaluation of \(c(m, s, k)\). At this stage, applying successively the above statement with \((x, m, s, k) = (\text {R}, 2, 8, 0), (\text {R}, 0, 8, 1)\) and \((\text {R}, 2, 8, 2)\) gives

$$\begin{aligned} \int \limits _{\text {R}}^{+\infty } \left\{ 2e^{-\rho ^2} + 6 \left| \frac{\text {d}^2}{\text {d}\rho ^2} (e^{-\rho ^2/2})\right| ^2 + \frac{24}{\rho ^2} \left| \frac{\text {d}}{\text {d}\rho } (e^{-\rho ^2/2})\right| ^2 \right\} \rho ^2 \text {d}\rho \le \ \overline{C}_1 \text {R}^{-8} \end{aligned}$$
(62)

with \(\overline{C}_1 := [2c(2, 8, 0) + 6c(2, 8, 2) + 24c(0, 8, 1)]\).
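The content of Proposition 7 is the super-polynomial decay of Gaussian tails: \(x^{s}\int _{x}^{+\infty }(\cdot )\, \mathrm {d}\rho \) stays bounded and eventually decreases. A numerical illustration for the case \((m, s, k) = (2, 8, 0)\), with a plain midpoint rule (the truncation at \(\rho = 12\) is harmless):

```python
import math

def tail_integral(x, upper=12.0, n=100000):
    # Midpoint rule for int_x^upper (e^{-rho^2/2})^2 rho^2 drho,
    # i.e. the case k = 0, m = 2 of Proposition 7; the integrand
    # is negligible beyond rho = 12.
    h = (upper - x) / n
    total = 0.0
    for i in range(n):
        rho = x + (i + 0.5) * h
        total += math.exp(-rho * rho) * rho * rho * h
    return total

# x^8 times the tail integral stays bounded and eventually decreases:
# the Gaussian tail beats the power x^{-8} (and indeed any x^{-s}).
vals = [tail_integral(x) * x ** 8 for x in (2.0, 3.0, 4.0)]
assert vals[0] > vals[1] > vals[2]
assert max(vals) < 10.0
```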

Then we study the terms on the RHS of (61) which depend on \({\mathcal {M}}\), making use of the next proposition, whose statement involves the random function

$$\begin{aligned} \varPsi (\rho ) := \prod _{j = 1}^{\nu } \left( \frac{\lambda ^2}{\lambda ^2 + \rho ^2 \pi _{j, \nu }^2}\right) ^q \end{aligned}$$
(63)

with \(\lambda \) and \(q\) as in Proposition 4.

Proposition 8

If (14)–(15) and (47)–(48) are in force, then

$$\begin{aligned} \sup _{\mathbf{u}\in S^2} \left| \hat{{\mathcal {N}}}(\rho ; \mathbf{u})\right| \ \le \ \varPsi (\rho ) \end{aligned}$$
(64)

and

$$\begin{aligned} \sup _{\mathbf{u}\in S^2} \left| \hat{{\mathcal {M}}}(\rho \mathbf{u})\right| \ \le \ \varPsi (\rho ) \end{aligned}$$
(65)

hold for every \(\rho \) in \([0, {\mathrm {R}}]\), with the exception of a set of \({\mathsf{P}}_t\)-probability zero. Moreover, there are two non-random polynomials \(\wp _1\) and \(\wp _2\) of degree 2 and 4 respectively, with positive coefficients depending only on \(\mu _0\), such that

$$\begin{aligned} \sup _{\mathbf{u}\in S^2} \left| \frac{\partial ^k}{\partial \rho ^k}\hat{{\mathcal {N}}}(\rho ; \mathbf{u})\right| \ \le \ \wp _k(\rho ) \varPsi (\rho ) \end{aligned}$$
(66)

and

$$\begin{aligned} \sup _{\mathbf{u}\in S^2} \left| \frac{\partial ^k}{\partial \rho ^k}\hat{{\mathcal {M}}}(\rho \mathbf{u})\right| \ \le \ \wp _k(\rho ) \varPsi (\rho ) \end{aligned}$$
(67)

hold for \({k}=1,2\) and every \(\rho \) in \([0, {\mathrm {R}}]\), with the exception of a set of \({\mathsf{P}}_t\)-probability zero.

A complete characterization of \(\wp _1\) and \(\wp _2\) is given in the course of the proof of this proposition, in Appendix A.9. For the sake of completeness, we observe that (64) and (66) hold true for any choice of \(\text {B}\) in (29).

One of the advantages of the splitting (52) is that all the realizations of \(\varPsi \) on \(U^c\) share a uniform integrability property, as shown in the following

Proposition 9

Over \(U^c\), the inequality

$$\begin{aligned} \prod _{j=1}^{\nu } \left( 1 + \pi _{j, \nu }^{2} x^2\right) \ge \ \epsilon x^{2r} \end{aligned}$$
(68)

is valid for every \(x > 0\), with \(\epsilon := (2 r!)^{-1}\) and \(r\) given by (51). Therefore,

$$\begin{aligned} \sup _{\omega \in U^c} \int \limits _{x}^{+\infty } \varPsi ^s(\rho ) \rho ^m {\mathrm {d}} \rho \le \ \frac{1}{2rqs - m - 1}\left( \frac{\lambda ^{2r}}{\epsilon }\right) ^{qs} x^{-2rqs + m + 1} \end{aligned}$$
(69)

holds true for every \(x > 0\), \(s > 0\) and \(m < (2rqs - 1)\).
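The mechanism behind (68) can be illustrated numerically: expanding the product in elementary symmetric polynomials, \(\prod _{j}(1 + \pi _{j, \nu }^2 x^2) = \sum _{k} e_k(\pi _{1, \nu }^2, \dots , \pi _{\nu , \nu }^2)\, x^{2k} \ge e_r x^{2r}\), every summand being nonnegative; the role of the condition \(\text {W}< a_{*}\) on \(U^c\) is then to keep \(e_r\) bounded away from zero. The following Python sketch checks the displayed inequality with an illustrative value \(r = 3\) (not the \(r\) of (51)) and weights normalized so that \(\sum _j \pi _{j, \nu }^2 = 1\); the normalization is an assumption made here for illustration.

```python
import random

def elem_sym_coeffs(w):
    # Coefficients e_0, e_1, ..., e_n of prod_j (1 + w_j t),
    # built by repeated convolution.
    coeffs = [1.0]
    for wj in w:
        new = [0.0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i] += c           # term without the factor w_j t
            new[i + 1] += wj * c  # term picking up the factor w_j t
        coeffs = new
    return coeffs

random.seed(0)
n, r = 10, 3  # r = 3 is illustrative only, NOT the r fixed in (51)
raw = [random.random() for _ in range(n)]
total = sum(raw)
w = [v / total for v in raw]  # w_j stands for pi_{j,nu}^2; sum_j w_j = 1 (assumed)
e = elem_sym_coeffs(w)
for x in (0.5, 1.0, 3.0, 10.0):
    prod = 1.0
    for wj in w:
        prod *= 1.0 + wj * x * x
    # prod = sum_k e_k x^{2k} >= e_r x^{2r}, every summand being nonnegative
    assert prod >= e[r] * x ** (2 * r)
```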

See Appendix A.10 for the proof. We are now in a position to complete the study of the outer integral of \(\text {I}_1(\rho , \mathbf{u})\). First, taking into account that \(2rq = 11\), the combination of (65) with (69) yields

$$\begin{aligned} \int \limits _{\text {R}}^{+\infty } |\hat{{\mathcal {M}}}(\rho \mathbf{u})|^2 \rho ^2 \text {d}\rho \le \int \limits _{\text {R}}^{+\infty } \varPsi ^2(\rho ) \rho ^2 \text {d}\rho \le \ \frac{1}{19}\left( \frac{\lambda ^{2r}}{\epsilon }\right) ^{2q} \text {R}^{-19}. \end{aligned}$$
(70)

The applicability of (69) is guaranteed by the fact that, when \(s = 2\) and \(m = 2\), one has \(m < 4rq - 1 = 21\). Second, since (56), (65) and (68) entail

$$\begin{aligned} \lim _{y \rightarrow +\infty } \hat{{\mathcal {M}}}(-y \mathbf{u}) \left[ \frac{\partial }{\partial \rho } \hat{{\mathcal {M}}}(\rho \mathbf{u})\right] _{\rho = y} = 0 \end{aligned}$$

on \(U^c\), after integrating by parts we get

$$\begin{aligned} \int \limits _{\text {R}}^{+\infty } \left| \frac{\partial }{\partial \rho } \hat{{\mathcal {M}}}(\rho \mathbf{u})\right| ^2 \text {d}\rho \le g_1 \varPsi (\text {R}) + g_2 \int \limits _{\text {R}}^{+\infty } \varPsi (\rho ) \text {d}\rho . \end{aligned}$$

Thus, (68)–(69) with \(s=1\) and \(m=0\) lead to

$$\begin{aligned} \int \limits _{\text {R}}^{+\infty } \left| \frac{\partial }{\partial \rho } \hat{{\mathcal {M}}}(\rho \mathbf{u})\right| ^2 \text {d}\rho \le \left( \frac{\lambda ^{2r}}{\epsilon }\right) ^q \cdot \left( g_1 \text {R}^{-11} + \frac{g_2}{10} \text {R}^{-10}\right) . \end{aligned}$$
(71)

To study the last integral, we recall that \(\prod _{j=1}^{\nu } \pi _{j, \nu } \ne 0\) on \(U^c\) and then combine (65) with (67)–(68) to prove that

$$\begin{aligned} \lim _{y \rightarrow +\infty } y^2 \left[ \frac{\partial ^2}{\partial \rho ^2} \hat{{\mathcal {M}}}(\rho \mathbf{u}) \cdot \left( \frac{\partial }{\partial \rho } \hat{{\mathcal {M}}}(-\rho \mathbf{u})\right) \right] _{\rho = y}&= 0\\ \lim _{y \rightarrow +\infty } y^2 \hat{{\mathcal {M}}}(-y \mathbf{u}) \left[ \frac{\partial ^3}{\partial \rho ^3} \hat{{\mathcal {M}}}(\rho \mathbf{u})\right] _{\rho = y}&= 0 \\ \lim _{y \rightarrow +\infty } y \hat{{\mathcal {M}}}(-y \mathbf{u}) \left[ \frac{\partial ^2}{\partial \rho ^2} \hat{{\mathcal {M}}}(\rho \mathbf{u})\right] _{\rho = y}&= 0. \end{aligned}$$

At this stage, after two integrations by parts, we have

$$\begin{aligned}&\int \limits _{\text {R}}^{+\infty } \left| \frac{\partial ^2}{\partial \rho ^2} \hat{{\mathcal {M}}}(\rho \mathbf{u})\right| ^2 \rho ^2\text {d}\rho \le \text {R}^2 \wp _1(\text {R})\wp _2(\text {R})\varPsi ^2(\text {R}) + (g_3\text {R}^2 + 2g_2\text {R}) \varPsi (\text {R})\\&\quad + \int \limits _{\text {R}}^{+\infty } (g_4 \rho ^2 + 4g_3 \rho + 2g_2) \varPsi (\rho ) \text {d}\rho \end{aligned}$$

and, in view of Proposition 9, the above RHS is bounded by

$$\begin{aligned}&\left( \frac{\lambda ^{2r}}{\epsilon }\right) ^{2q} \text {R}^{-20}\wp _1(\text {R})\wp _2(\text {R}) + \left( g_3\text {R}^2 + 2g_2\text {R}\right) \cdot \left( \frac{\lambda ^{2r}}{\epsilon }\right) ^q \text {R}^{-11} \nonumber \\&\quad + \left( \frac{\lambda ^{2r}}{\epsilon }\right) ^q \cdot \left( \frac{g_4}{8} \text {R}^{-8} + \frac{4g_3}{9} \text {R}^{-9} + \frac{g_2}{5} \text {R}^{-10}\right) . \end{aligned}$$
(72)

The final bound can be obtained, via the Tonelli theorem, by starting from (61) and collecting the upper bounds in (62) and (70)–(72). Indeed, these last upper bounds are independent of \(\mathbf{u}\) and are expressed as sums of powers of \(\text {R}\), of order less than or equal to \(-8\). Therefore, recalling (60) and the inequality \(\text {W}\le 1\), we obtain

$$\begin{aligned} 4\pi \int \limits _{(\text {R}, +\infty ) \times S^2} \text {I}_1(\rho , \mathbf{u})\rho ^2{\fancyscript{L}}^1 \otimes u_{S^2}(\text {d}\rho \text {d}\mathbf{u}) \le C_{*, r} \text {W}^2 \end{aligned}$$
(73)

with

$$\begin{aligned} C_{*, r}&:= 4\pi \left\{ 2^8 \overline{C}_1 {\mathfrak {m}}_4^2 + \left( \frac{\lambda ^{2r}}{\epsilon }\right) ^{2q} \cdot \Bigg ( \frac{1}{19} 2^{20} {\mathfrak {m}}_4^{19/4} \right. \\&+\, 6 \cdot 2^{20} {\mathfrak {m}}_4^5 \wp _1(2^{-1}{\mathfrak {m}}_4^{-1/4}) \wp _2(2^{-1}{\mathfrak {m}}_4^{-1/4}) \Bigg ) + \left( \frac{\lambda ^{2r}}{\epsilon }\right) ^q \cdot \Bigg (6 \cdot 2^5 {\mathfrak {m}}_4^2g_4 \\&\left. + \frac{26}{3} 2^9 {\mathfrak {m}}_4^{9/4}g_3 + \frac{78}{5} 2^{10} {\mathfrak {m}}_4^{5/2}g_2 + 24 \cdot 2^{11} {\mathfrak {m}}_4^{11/4}g_1 \Bigg )\,\,\right\} . \end{aligned}$$

2.2.2 Outer integral of \({\mathrm {I}}_2(\rho , \mathbf{u})\)

As a first step, we use the Tonelli theorem to write the outer integral of \(\text {I}_2(\rho , \mathbf{u})\) as

$$\begin{aligned} \lim _{y \rightarrow +\infty } \int \limits _{\text {R}}^{y} \frac{3}{\rho ^2} \left( \,\,\int \limits _{S^2} \left| \varDelta _{S^2} \hat{{\mathcal {M}}}(\rho \mathbf{u})\right| ^2 u_{S^2}(\text {d}\mathbf{u})\right) \text {d}\rho . \end{aligned}$$
(74)

Then, we apply Theorem 3.16 in [41] to obtain

$$\begin{aligned} \int \limits _{S^2} \left| \varDelta _{S^2} \hat{{\mathcal {M}}}(\rho \mathbf{u})\right| ^2 u_{S^2}(\text {d}\mathbf{u})= \int \limits _{S^2} \hat{{\mathcal {M}}}(-\rho \mathbf{u}) \varDelta _{S^2}^2 \hat{{\mathcal {M}}}(\rho \mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\end{aligned}$$

which, by virtue of (65), yields

$$\begin{aligned} \int \limits _{S^2} \left| \varDelta _{S^2} \hat{{\mathcal {M}}}(\rho \mathbf{u})\right| ^2 u_{S^2}(\text {d}\mathbf{u})\le \ \varPsi (\rho ) \sup _{\mathbf{u}\in S^2}\left| \varDelta _{S^2}^2 \hat{{\mathcal {M}}}(\rho \mathbf{u})\right| . \end{aligned}$$
(75)

At this stage, to handle the computations involving the Laplace–Beltrami operator, we define the following plane domains

$$\begin{aligned} D_1 = D_3&:= \{(u, v) \in {\mathbb {R}}^2 \ | \ (u - \pi /2)^2/(5\pi )^2 + (v - \pi )^2/(11\pi )^2 < (1/12)^2\} \\ D_2 = D_4&:= \{(u, v) \in {\mathbb {R}}^2 \ | \ (u - \pi /2)^2/(5\pi )^2 + v^2/(11\pi )^2 < (1/12)^2\} \end{aligned}$$

along with the parametrizations

$$\begin{aligned} \begin{array}{lll} \mathbf{h}_k : D_k \ni (u, v) &{}\mapsto (\cos v \sin u, \sin v \sin u, \cos u) \in {\mathbb {R}}^3&{} k = 1, 2 \\ \mathbf{h}_k : D_k \ni (u, v) &{}\mapsto (\cos u, \cos v \sin u, \sin v \sin u) \in {\mathbb {R}}^3&{} k = 3, 4 \end{array} \end{aligned}$$
(76)

to form the atlas \({\mathcal {A}}\) on \(S^2\) composed of the charts \(\Omega _k := \mathbf{h}_k(D_k) \subset S^2\) for \(k = 1, \dots , 4\). Then, \(\varDelta _{S^2}^2\) can be expressed in local coordinates as

$$\begin{aligned} \varDelta _{(u, v)}^{2}&= \partial _{u u u u} + 2\cot u \partial _{u u u} - \sin ^{-2}u \partial _{u u} + \sin ^{-2}u\cot u \partial _u + \sin ^{-4}u \partial _{v v v v} \\&- 2\sin ^{-4}u (2 - \sin ^{2}u) \partial _{v v} + 6 \sin ^{-2}u\cot u \partial _{u v v} + 2\sin ^{-2}u \partial _{u u v v} \end{aligned}$$

by virtue of (3.84) in [41], and hence

$$\begin{aligned} \sup _{\mathbf{u}\in S^2} \left| \varDelta _{S^2}^{2}\hat{\mathcal {M}}(\rho \mathbf{u}) \right|&= \sup _{k \in \{1, \dots , 4\}} \sup _{(u, v) \in D_k} \left| \varDelta _{(u, v)}^{2} \hat{{\mathcal {M}}}(\rho \mathbf{h}_k(u, v)) \right| \nonumber \\&\le \overline{\varDelta } \sum _{1 \le |\varvec{\alpha }| \le 4} \sup _{k \in \{1, \dots , 4\}} \sup _{(u, v) \in D_k} \left| \partial _{\varvec{\alpha }} \hat{{\mathcal {M}}}(\rho \mathbf{h}_k(u, v)) \right| \end{aligned}$$
(77)

where \(\varvec{\alpha }\) indicates the multi-index \((\alpha _1, \alpha _2)\), \(\partial _{\varvec{\alpha }}\) stands for the partial derivative \(\frac{\partial ^{\alpha _1 + \alpha _2}}{\partial u^{\alpha _1} \partial v^{\alpha _2}}\), and \(\overline{\varDelta } = 4 (2 + \sqrt{3})^2 (6 + \sqrt{3})\) is the maximum absolute value of the coefficients of \(\varDelta _{(u, v)}^{2}\). To study \(\partial _{\varvec{\alpha }} \ \hat{{\mathcal {M}}}(\rho \mathbf{h}_k(u, v))\) we resort to the multi-dimensional Faà di Bruno formula stated and proved in [25]. Therefore, taking into account that \(| \partial _{\varvec{\alpha }} \ \mathbf{h}_k(u, v) | \le 1\) for every multi-index \(\varvec{\alpha }\), we have

$$\begin{aligned} \left| \partial _{\varvec{\alpha }} \ \hat{{\mathcal {M}}}(\rho \mathbf{h}_k(u, v)) \right| \le \sum _{h = 1}^{|\varvec{\alpha }|}\sum _{l=1}^{|\varvec{\alpha }|} a_{h,l}(\varvec{\alpha }) {\mathfrak {M}}_l \rho ^h \end{aligned}$$
(78)

where the \(a_{h,l}\)’s are constants specified in [25], and \({\mathfrak {M}}_l := \int _{{\mathbb {R}}^3}|\mathbf{v}|^l {\mathcal {M}}(\text {d}\mathbf{v})\). At this stage, (74)–(75) and (77)–(78) yield

$$\begin{aligned}&\int \limits _{(\text {R}, +\infty ) \times S^2} \text {I}_2(\rho , \mathbf{u})\rho ^2{\fancyscript{L}}^1 \otimes u_{S^2}(\text {d}\rho \text {d}\mathbf{u})\\&\quad \le 3\overline{\varDelta } \sum _{1 \le |\varvec{\alpha }| \le 4}\sum _{h = 1}^{|\varvec{\alpha }|} \sum _{l=1}^{|\varvec{\alpha }|} a_{h,l}(\varvec{\alpha }) {\mathfrak {M}}_l \int \limits _{\text {R}}^{+\infty } \varPsi (\rho ) \rho ^{h-2} \text {d}\rho . \end{aligned}$$

Moreover, the Lyapunov inequality gives \({\mathfrak {M}}_l \le {\mathfrak {M}}_{4}^{l/4}\) for \(l\) in \([0, 4]\) and then, from (56), we get \({\mathfrak {M}}_4 \le 3\sum _{i = 1}^{3} \left( \lim _{\rho \rightarrow 0}\frac{\partial ^4}{\partial \rho ^4} \hat{{\mathcal {M}}}(\rho \mathbf{e}_i)\right) \le \ 9g_4\). Now, an application of (69) with \(s = 1\) and \(m = h - 2\), combined with (60), leads to

$$\begin{aligned} 4\pi \int \limits _{(\text {R}, +\infty ) \times S^2} \text {I}_2(\rho , \mathbf{u})\rho ^2{\fancyscript{L}}^1 \otimes u_{S^2}(\text {d}\rho \text {d}\mathbf{u}) \le C_{*, s} \text {W}^2 \end{aligned}$$
(79)

with

$$\begin{aligned} C_{*, s} := 12\pi \overline{\varDelta } \left( \frac{\lambda ^{2r}}{\epsilon }\right) ^{q} \sum _{1 \le |\varvec{\alpha }| \le 4}\sum _{h = 1}^{|\varvec{\alpha }|} \sum _{l=1}^{|\varvec{\alpha }|} a_{h,l}(\varvec{\alpha }) (9g_4)^{l/4} \frac{1}{12 - h}(2{\mathfrak {m}}_4^{1/4})^{12 - h}. \end{aligned}$$
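The Lyapunov inequality invoked above, \({\mathfrak {M}}_l \le {\mathfrak {M}}_{4}^{l/4}\) for \(l\) in \([0, 4]\), is simply Jensen's inequality, valid because \({\mathcal {M}}\) is a (random) probability measure. A quick numerical illustration on a discrete probability measure on \({\mathbb {R}}^3\) (the empirical measure below is only a stand-in for \({\mathcal {M}}\)):

```python
import random

random.seed(1)
# A discrete probability measure on R^3: n atoms, each with mass 1/n.
n = 1000
atoms = [(random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1))
         for _ in range(n)]
norms = [(x * x + y * y + z * z) ** 0.5 for (x, y, z) in atoms]

def moment(l):
    # M_l = int |v|^l M(dv) for this empirical measure.
    return sum(s ** l for s in norms) / n

m4 = moment(4)
for l in (1, 2, 3):
    # Lyapunov (Jensen): M_l <= M_4^{l/4}, valid because the total mass is 1.
    assert moment(l) <= m4 ** (l / 4.0) + 1e-12
```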

2.2.3 Inner integral of \({\mathrm {I}}_1(\rho , \mathbf{u})\)

The analysis is essentially based on certain new Berry–Esseen-type inequalities presented in [30], after observing the analogy between \(\rho \mapsto \hat{{\mathcal {M}}}(\rho \mathbf{u})\) and the c.f. \(\varphi _n(t)\) therein. Indeed, for any \(\mathbf{u}\) in \(S^2\) and for every choice of \(\text {B}\) in (29), each realization of \(\hat{{\mathcal {N}}}(\rho ; \mathbf{u})\), as a function of \(\rho \), coincides with the c.f. of a weighted sum of independent random numbers, according to (32). Moreover, the definition of \(\text {R}\) in (60) corresponds to the upper bound \(\tau \) appearing in the Berry–Esseen-type inequalities proved in [30]. To implement the aforesaid inequalities within the present framework, it is worth introducing the following entities

$$\begin{aligned} T(\mathbf{u})&:= \left\{ \sum _{j = 1}^{\nu } \pi _{j, \nu }^2 \varvec{\psi }_{j, \nu }^{t}(\mathbf{u}) V[\mu _0] \varvec{\psi }_{j, \nu }(\mathbf{u}) \le 1/3\right\} \end{aligned}$$
(80)
$$\begin{aligned} \text {M}_{j, n}^{(m)}(\mathbf{u})&:= {\mathsf{E}}_t\left[ \left( \mathbf{V}_j \cdot \varvec{\psi }_{j, n}(\mathbf{u})\right) ^m\ \big |\ {\fancyscript{G}} \right] \end{aligned}$$
(81)
$$\begin{aligned} \text {X}(\mathbf{u})&:= \sum _{j = 1}^{\nu } \pi _{j, \nu }^2 |\text {M}_{j, \nu }^{(2)}(\mathbf{u}) - 1| \end{aligned}$$
(82)
$$\begin{aligned} \text {Y}(\mathbf{u})&:= \sum _{j = 1}^{\nu } \big |\pi _{j, \nu }^3 \text {M}_{j, \nu }^{(3)}(\mathbf{u})\big | \end{aligned}$$
(83)
$$\begin{aligned} \text {Z}(\mathbf{u})&:= {\mathsf{E}}_t\left[ \left[ \sum _{j = 1}^{\nu } \pi _{j, \nu }^2 \varvec{\psi }_{j, \nu }^{t}(\mathbf{u}) V[\mu _0] \varvec{\psi }_{j, \nu }(\mathbf{u}) - 1\right] ^2\ \big |\ {\fancyscript{G}} \right] \end{aligned}$$
(84)

where \(T(\mathbf{u})\) belongs to \({\fancyscript{H}}\). With this new notation at hand, the Berry–Esseen-type inequality can be re-written as

$$\begin{aligned} \left| \frac{\partial ^l}{\partial \rho ^l} \left[ \hat{\mathcal {M}}(\rho \mathbf{u}) - e^{-\rho ^2/2}\right] \right|&\le {\mathsf{E}}_t\left[ \left| \frac{\partial ^l}{\partial \rho ^l} \hat{{\mathcal {N}}}(\rho ; \mathbf{u}) \right| \ 1\!\!1_{T(\mathbf{u})} \ | \ {\fancyscript{G}}\right] + u_{2,l}(\rho ) \text {X}(\mathbf{u}) \\&+\, u_{3,l}(\rho ) \text {Y}(\mathbf{u}) + u_{4,l}(\rho ) {\mathfrak {m}}_4\text {W}+ v_{l}(\rho ) \text {Z}(\mathbf{u}) \end{aligned}$$

for \(l = 0, 1, 2\), \(\rho \) in \([0, \text {R}]\) and \(\mathbf{u}\) in \(S^2\), \(u_{2,l}\), \(u_{3,l}\), \(u_{4,l}\), \(v_l\) being non-random rapidly decreasing continuous functions depending only on \(\mu _0\). See [30] for their definition. The above inequality yields

$$\begin{aligned}&\int \limits _{0}^{\text {R}} \Big | \frac{\partial ^l}{\partial \rho ^l} \left[ \hat{\mathcal {M}}(\rho \mathbf{u}) - e^{-\rho ^2/2}\right] \Big |^2 \rho ^m \text {d}\rho \nonumber \\&\quad \le 5 \int \limits _{0}^{\text {R}} \left( {\mathsf{E}}_t\left[ \left| \frac{\partial ^l}{\partial \rho ^l} \hat{{\mathcal {N}}}(\rho ; \mathbf{u}) \right| \ 1\!\!1_{T(\mathbf{u})} \ | \ {\fancyscript{G}}\right] \right) ^2 \rho ^m \text {d}\rho + 5 \text {X}^2(\mathbf{u}) \int \limits _{0}^{+\infty } u_{2,l}^{2}(\rho ) \rho ^m \text {d}\rho \nonumber \\&\qquad + 5 \text {Y}^2(\mathbf{u}) \int \limits _{0}^{+\infty } u_{3,l}^{2}(\rho ) \rho ^m \text {d}\rho +\, 5 {\mathfrak {m}}_4^2\text {W}^2 \int \limits _{0}^{+\infty } u_{4,l}^2(\rho ) \rho ^m \text {d}\rho \nonumber \\&\qquad +\, 5 \text {Z}^2(\mathbf{u}) \int \limits _{0}^{+\infty } v_{l}^{2}(\rho ) \rho ^m \text {d}\rho \end{aligned}$$
(85)

for \(l = 0, 1, 2\), \(m \ge 0\) and \(\mathbf{u}\) in \(S^2\). The integrals \(\overline{u}_{h, l, m} := \int _{0}^{+\infty } u_{h,l}^{2}(\rho ) \rho ^m \text {d}\rho \) and \(\overline{v}_{l, m} := \int _{0}^{+\infty } v_{l}^{2}(\rho ) \rho ^m \text {d}\rho \) are finite and depend only on \(\mu _0\) for \(h = 2, 3, 4\), \(l = 0, 1, 2\) and \(m \ge 0\). As to the above conditional expectation, we have

$$\begin{aligned} {\mathsf{E}}_t[1\!\!1_{T(\mathbf{u})} \ | \ {\fancyscript{G}}]&= {\mathsf{P}}_t[T(\mathbf{u}) \ | \ {\fancyscript{G}}] \nonumber \\&\le {\mathsf{P}}_t\left[ \left\{ \left| \sum _{j = 1}^{\nu } \pi _{j, \nu }^2 \varvec{\psi }_{j, \nu }^{t}(\mathbf{u}) V[\mu _0] \varvec{\psi }_{j, \nu }(\mathbf{u}) - 1\right| \ge 1/3\right\} \ | \ {\fancyscript{G}}\right] \le 9 \text {Z}(\mathbf{u}) \nonumber \\ \end{aligned}$$
(86)

the latter inequality following from the set inclusion \(\{x \le 1/3\} \subset \{|x - 1| \ge 1/3\}\) and the conditional Markov inequality applied to \((\cdot \, - 1)^2\), which yields the factor \(9 = (1/3)^{-2}\). Now, we apply (64) and (66) and, after observing that the upper bounds provided therein are \({\fancyscript{G}}\)-measurable, we obtain

$$\begin{aligned}&\int \limits _{0}^{\text {R}} \left( {\mathsf{E}}_t\left[ \left| \hat{{\mathcal {N}}}(\rho ; \mathbf{u})\right| \ 1\!\!1_{T(\mathbf{u})} \ | \ {\fancyscript{G}}\right] \right) ^2 \rho ^m \text {d}\rho \le 81\text {Z}^2(\mathbf{u})\int \limits _{0}^{+\infty } \varPsi ^2(\rho ) \rho ^m \text {d}\rho \end{aligned}$$
(87)
$$\begin{aligned}&\int \limits _{0}^{\text {R}} \left( {\mathsf{E}}_t\left[ \left| \frac{\partial ^l}{\partial \rho ^l} \hat{{\mathcal {N}}}(\rho ; \mathbf{u}) \right| \ 1\!\!1_{T(\mathbf{u})} \ | \ {\fancyscript{G}}\right] \right) ^2 \rho ^m \text {d}\rho \le 81\text {Z}^2(\mathbf{u})\int \limits _{0}^{+\infty } \wp _{l}^{2}(\rho ) \varPsi ^2(\rho ) \rho ^m \text {d}\rho \nonumber \\ \end{aligned}$$
(88)

for \(l = 1, 2\) and any \(m\) in \([0, 13)\). In addition, by virtue of Proposition 9, the integrals \(\overline{z}_m := \int _{0}^{+\infty } \varPsi ^2(\rho ) \rho ^m \text {d}\rho \) and \(\overline{w}_{l, m} := \int _{0}^{+\infty } \wp _{l}^{2}(\rho ) \varPsi ^2(\rho ) \rho ^m \text {d}\rho \) are finite and depend only on \(\mu _0\) when \(\omega \) varies in \(U^c\). Coming back to the integral of interest, the Tonelli theorem can be applied to write

$$\begin{aligned} \int \limits _{(0, \text {R}]\times S^2} \text {I}_1(\rho , \mathbf{u})\rho ^2{\fancyscript{L}}^1 \otimes u_{S^2}(\text {d}\rho \text {d}\mathbf{u}) = \int \limits _{S^2} \left( \int \limits _{0}^{\text {R}} \text {I}_1(\rho , \mathbf{u})\rho ^2 \text {d}\rho \right) u_{S^2}(\text {d}\mathbf{u}). \end{aligned}$$

Since the inner integral on the RHS has already been studied, it remains to explain how it depends on \(\mathbf{u}\). For this, a fundamental role is played by \(\text {B}\), which appears in the RHS of (85) through the random variables \(\text {X}\), \(\text {Y}\) and \(\text {Z}\). Apropos of this, it should be recalled that the so-called hairy ball theorem—see, e.g., Chapter 5 of [45]—asserts that a function \(\text {B}\), meeting the properties specified to write (29), cannot be continuous everywhere. Nevertheless, we know that the definition of \({\mathcal {M}}\) is independent of the choice of \(\text {B}\). We take advantage of this fact to overcome the aforesaid drawback by splitting \(S^2\) into the charts \(\Omega _k\) introduced in Sect. 2.2.2 and by choosing for each \(\Omega _k\) a specific \(\text {B}\), say \(\text {B}_k\), smooth on \(\overline{\Omega }_k\). This possibility is guaranteed by the fact that \(S^2{\setminus }\overline{\Omega }_k\) contains at least two antipodal points. We now have, by (85) and (87)–(88),

$$\begin{aligned} 4\pi \int \limits _{(0, \text {R}] \times S^2} \text {I}_1(\rho , \mathbf{u})\rho ^2{\fancyscript{L}}^1 \otimes u_{S^2}(\text {d}\rho \text {d}\mathbf{u})&\le \overline{B}_2\sum _{k = 1}^{4} \int \limits _{\Omega _k} \text {X}^{2}_k(\mathbf{u})u_{S^2}(\text {d}\mathbf{u})\nonumber \\&+ \overline{B}_3\sum _{k = 1}^{4} \int \limits _{\Omega _k} \text {Y}^{2}_{k}(\mathbf{u})u_{S^2}(\text {d}\mathbf{u})+ \overline{B}_4{\mathfrak {m}}_4^2 \text {W}^2 \nonumber \\&+ \overline{B}_5 \sum _{k = 1}^{4} \int \limits _{\Omega _k} \text {Z}^{2}_{k}(\mathbf{u})u_{S^2}(\text {d}\mathbf{u}) \end{aligned}$$
(89)

where \(\text {X}_k\), \(\text {Y}_k\), \(\text {Z}_k\) are the same as in (82)–(84) respectively, with \(\text {B} = \text {B}_k\) and

$$\begin{aligned} \overline{B}_2&:= 20\pi [\overline{u}_{2, 0, 2} + 12\overline{u}_{2, 1, 0} + 3\overline{u}_{2, 2, 2}] \\ \overline{B}_3&:= 20\pi [\overline{u}_{3, 0, 2} + 12\overline{u}_{3, 1, 0} + 3\overline{u}_{3, 2, 2}] \\ \overline{B}_4&:= 80\pi [\overline{u}_{4, 0, 2} + 12\overline{u}_{4, 1, 0} + 3\overline{u}_{4, 2, 2}] \\ \overline{B}_5&:= 1620\pi [\overline{z}_2 + 12\overline{w}_{1, 0} + 3\overline{w}_{2, 2}] + 20\pi [\overline{v}_{0, 2} + 12\overline{v}_{1, 0} + 3\overline{v}_{2, 2}]. \end{aligned}$$

2.2.4 Inner integral of \({\mathrm {I}}_2(\rho , \mathbf{u})\)

With reference to (59), the integral at issue is analyzed by splitting \(S^2\) into the charts \(\Omega _k\) defined in Sect. 2.2.2. On the basis of considerations made apropos of \(\text {B}\) at the end of the previous sub-subsection, here we choose the \(\text {B}_k\)’s as follows:

$$\begin{aligned} \text {B}_k(\mathbf{h}_k(u, v)) := \left( \begin{array}{c@{\quad }c@{\quad }c} \sin v &{} \cos v \cos u &{} \cos v \sin u \\ -\cos v &{} \sin v \cos u &{} \sin v \sin u \\ 0 &{} -\sin u &{} \cos u \\ \end{array} \right) \end{aligned}$$
(90)

for \(k = 1, 2\) and

$$\begin{aligned} \text {B}_k(\mathbf{h}_k(u, v)) := \left( \begin{array}{c@{\quad }c@{\quad }c} 0 &{} -\sin u &{} \cos u \\ \sin v &{} \cos v \cos u &{} \cos v \sin u \\ -\cos v &{} \sin v \cos u &{} \sin v \sin u \\ \end{array} \right) \end{aligned}$$
(91)

for \(k = 3, 4\). Then, the equality \(\hat{{\mathcal {M}}}(\rho \mathbf{u}) = {\mathsf{E}}_t[\hat{{\mathcal {N}}}(\rho ; \mathbf{u})\ | \ {\fancyscript{G}}]\), in combination with the definition of \(T(\mathbf{u})\) in (80), produces the following upper bound for \(| \varDelta _{S^2} \hat{{\mathcal {M}}}(\rho \mathbf{u}) |^2\):

$$\begin{aligned} 2\left( {\mathsf{E}}_t\left[ \left| \varDelta _{S^2}\hat{{\mathcal {N}}}_k(\rho ; \mathbf{u})\right| 1\!\!1_{T_k(\mathbf{u})} \ | \ {\fancyscript{G}} \right] \right) ^2 + 2 \left| {\mathsf{E}}_t\left[ \varDelta _{S^2}\hat{{\mathcal {N}}}_k(\rho ; \mathbf{u})1\!\!1_{T_k(\mathbf{u})^{c}} \ | \ {\fancyscript{G}} \right] \right| ^2 \end{aligned}$$
(92)

for every \(\mathbf{u}\) in \(\Omega _k\), where \({\mathcal {N}}_k\) and \(T_k(\mathbf{u})\) are the same as \({\mathcal {N}}\) and \(T(\mathbf{u})\), respectively, with \(\text {B} = \text {B}_k\). To bound the former summand we make use of the following

Proposition 10

Assume that the tail condition (15) is in force together with the moment assumptions (14) and (47)–(48). Then, there exists a non-random polynomial \(\wp _L\) of degree 6, with positive coefficients which depend only on \(\mu _0\), such that

$$\begin{aligned} \sup _{k \in \{1, \dots , 4\}}\sup _{\mathbf{u}\in \Omega _k} \left| \varDelta _{S^2}\hat{{\mathcal {N}}}_k(\rho ; \mathbf{u})\right| \le \rho ^2\wp _L(\rho )\varPsi (\rho ) \end{aligned}$$
(93)

holds for every \(\rho \) in \([0, {\mathrm {R}}]\), with the exception of a set of \({\mathsf{P}}_t\)-probability zero.

The proof is deferred to Appendix A.9, where \(\wp _L\) is given explicitly. At this stage, we note that the upper bound in (93) is \({\fancyscript{G}}\)-measurable, and then apply (86) to obtain

$$\begin{aligned}&\int \limits _{(0, \text {R}] \times \Omega _k} \frac{1}{\rho ^2} \left( {\mathsf{E}}_t\left[ \left| \varDelta _{S^2}\hat{{\mathcal {N}}}_k(\rho ; \mathbf{u})\right| 1\!\!1_{T_k(\mathbf{u})} \ | \ {\fancyscript{G}} \right] \right) ^2{\fancyscript{L}}^1 \otimes u_{S^2}(\text {d}\rho \text {d}\mathbf{u}) \nonumber \\&\quad \le \int \limits _{(0, \text {R}] \times \Omega _k} \frac{1}{\rho ^2} \rho ^4\wp _{L}^{2}(\rho )\varPsi ^2(\rho ) {\mathsf{P}}_t[T(\mathbf{u}) \ | \ {\fancyscript{G}}]^2 {\fancyscript{L}}^1 \otimes u_{S^2}(\text {d}\rho \text {d}\mathbf{u}) \nonumber \\&\quad \le 81 \int \limits _{0}^{+\infty } \rho ^2\wp _{L}^{2}(\rho )\varPsi ^2(\rho ) \text {d}\rho \cdot \int \limits _{\Omega _k}\text {Z}^{2}_{k}(\mathbf{u})u_{S^2}(\text {d}\mathbf{u}). \end{aligned}$$
(94)

On \(U^c\), Proposition 9 can be used to conclude that the random variable \(\int _{0}^{+\infty } \rho ^2\wp _{L}^{2}(\rho )\varPsi ^2(\rho ) \text {d}\rho \) is bounded by the constant \(J_L := \int _{0}^{1} \rho ^2\wp _{L}^{2}(\rho ) \text {d}\rho + \left( \frac{\lambda ^{2r}}{\epsilon }\right) ^{2q} \int _{1}^{+\infty } \rho ^{-20} \wp _{L}^{2}(\rho )\text {d}\rho \).
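
Incidentally, the frames chosen in (90)–(91) can be checked numerically: each \(\text {B}_k(\mathbf{h}_k(u, v))\) should be a rotation matrix (orthogonal, with unit determinant), which is what makes it an admissible choice of \(\text {B}\) on the corresponding chart. A minimal sketch of such a sanity check, where the grid of angles is an arbitrary choice:

```python
import numpy as np

def B_12(u, v):
    # Frame (90), used on the charts Omega_1 and Omega_2.
    return np.array([
        [np.sin(v),  np.cos(v) * np.cos(u), np.cos(v) * np.sin(u)],
        [-np.cos(v), np.sin(v) * np.cos(u), np.sin(v) * np.sin(u)],
        [0.0,        -np.sin(u),            np.cos(u)],
    ])

def B_34(u, v):
    # Frame (91), used on Omega_3 and Omega_4: a cyclic row permutation of (90).
    return np.roll(B_12(u, v), 1, axis=0)

for B in (B_12, B_34):
    for u in np.linspace(0.0, 2 * np.pi, 7):
        for v in np.linspace(0.0, np.pi, 5):
            M = B(u, v)
            assert np.allclose(M @ M.T, np.eye(3))    # rows are orthonormal
            assert np.isclose(np.linalg.det(M), 1.0)  # unit determinant
# All sampled frames pass both checks.
```

Since (91) is an even (cyclic) permutation of the rows of (90), the two frames share the same determinant.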

In the final part of this sub-subsection we provide an upper bound for the latter summand in the RHS of (92), by means of the following statement, which involves the new random quantities

$$\begin{aligned} \text {X}_L(\mathbf{u})&:= \sum _{j = 1}^{\nu } \pi _{j, \nu }^2 |\varDelta _{S^2}\text {M}_{j, \nu }^{(2)}(\mathbf{u})| \end{aligned}$$
(95)
$$\begin{aligned} \text {Y}_L(\mathbf{u})&:= \sum _{j = 1}^{\nu } \left| \pi _{j, \nu }^3 \varDelta _{S^2}\text {M}_{j, \nu }^{(3)}(\mathbf{u})\right| \end{aligned}$$
(96)
$$\begin{aligned} \text {Z}_G(\mathbf{u})&:= {\mathsf{E}}_t\left[ \left| \left| \sum _{j = 1}^{\nu }\pi _{j, \nu }^{2}\nabla _{S^2}\left( \varvec{\psi }_{j, \nu }^{t}(\mathbf{u}) V[\mu _0] \varvec{\psi }_{j, \nu }(\mathbf{u})\right) \right| \right| ^2_{S^2} \ | \ {\fancyscript{G}} \right] \end{aligned}$$
(97)
$$\begin{aligned} \text {Z}_L(\mathbf{u})&:= {\mathsf{E}}_t\left[ \left[ \sum _{j = 1}^{\nu }\pi _{j, \nu }^{2}\varDelta _{S^2}\left( \varvec{\psi }_{j, \nu }^{t}(\mathbf{u}) V[\mu _0] \varvec{\psi }_{j, \nu }(\mathbf{u})\right) \right] ^2 \ | \ {\fancyscript{G}} \right] \end{aligned}$$
(98)

where the \(\text {M}_{j, n}^{(m)}(\mathbf{u})\)’s are the same as in (81), \(\nabla _{S^2}\) is the Riemannian gradient on \(S^2\) and \(\mid \mid \!\cdot \! \mid \mid _{S^2}\) denotes the Riemannian length.

Proposition 11

Let the moment assumptions (14) and (47)–(48) be in force. Then, for every \(k = 1, \dots , 4\), there exist (non-random) rapidly decreasing continuous functions \(z_1, \dots , z_6\), depending only on \(\mu _0\), such that

$$\begin{aligned}&\left| {\mathsf{E}}_t\left[ \left( \varDelta _{S^2}\hat{{\mathcal {N}}}_k(\rho ; \mathbf{u})\right) 1\!\!1_{T_k(\mathbf{u})^{c}} \ | \ {\fancyscript{G}} \right] \right| \nonumber \\&\quad \le \rho ^2 \Bigg [ z_1(\rho ) \text {W}+ z_2(\rho ) \text {X}_{L, k}(\mathbf{u}) + z_3(\rho ) \text {Y}_{L, k}(\mathbf{u}) \nonumber \\&\qquad +\, z_4(\rho ) \text {Z}_k(\mathbf{u}) + z_5(\rho ) \text {Z}_{G, k}(\mathbf{u}) + z_6(\rho ) \text {Z}_{L, k}(\mathbf{u})\Bigg ] \end{aligned}$$
(99)

holds for every \(\mathbf{u}\) in \(\Omega _k\) and \(\rho \) in \([0, {\mathrm {R}}]\), with the exception of a set of \({\mathsf{P}}_t\)-probability zero. \(\text {X}_{L, k}\), \(\text {Y}_{L, k}\), \(\text {Z}_k\), \(\text {Z}_{G, k}\) and \(\text {Z}_{L, k}\) are defined as in (95)–(98) and (84) with \({\mathrm {B}}_k\) in place of \({\mathrm {B}}\).

For the proof and the definition of the \(z_i\)’s see Appendix A.11. Now, a straightforward application of the above proposition yields

$$\begin{aligned}&\int \limits _{(0, \text {R}] \times \Omega _k} \frac{1}{\rho ^2} \left| {\mathsf{E}}_t\left[ \left( \varDelta _{S^2}\hat{{\mathcal {N}}}_k(\rho ; \mathbf{u}) \right) 1\!\!1_{T_k(\mathbf{u})^c} \ | \ {\fancyscript{G}} \right] \right| ^2{\fancyscript{L}}^1 \otimes u_{S^2}(\text {d}\rho \text {d}\mathbf{u}) \nonumber \\&\quad \le \overline{B}_{1, L} \text {W}^2 + \overline{B}_{2, L} \int \limits _{\Omega _k} \text {X}_{L, k}^2(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})+ \overline{B}_{3, L}\int \limits _{\Omega _k} \text {Y}_{L, k}^2(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\nonumber \\&\qquad +\, \overline{B}_{4, L} \int \limits _{\Omega _k}\text {Z}_{k}^{2}(\mathbf{u})u_{S^2}(\text {d}\mathbf{u})+ \overline{B}_{5, L} \int \limits _{\Omega _k} \text {Z}_{G, k}^2(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\nonumber \\&\qquad +\, \overline{B}_{6, L}\int \limits _{\Omega _k} \text {Z}_{L, k}^2(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u}) \end{aligned}$$
(100)

where \(\overline{B}_{i, L} := 6 \int _{0}^{+\infty } z_{i}^{2}(\rho ) \rho ^2 \text {d}\rho \) for \(i = 1, \dots , 6\).

The final bound is achieved by collecting inequalities (92), (94) and (100), according to

$$\begin{aligned}&4\pi \int \limits _{(0, \text {R}] \times S^2} \text {I}_2(\rho , \mathbf{u})\rho ^2{\fancyscript{L}}^1 \otimes u_{S^2}(\text {d}\rho \text {d}\mathbf{u}) \nonumber \\&\quad \le 96\pi \overline{B}_{1, L} \text {W}^2\ +\ 24\pi \sum _{k = 1}^{4}\left\{ \overline{B}_{2, L} \int \limits _{\Omega _k} \text {X}_{L, k}^2(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right. \nonumber \\&\qquad +\, \overline{B}_{3, L} \int \limits _{\Omega _k} \text {Y}_{L, k}^2(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})+ (\overline{B}_{4, L} + 81J_L) \int _{\Omega _k}\text {Z}_{k}^{2}(\mathbf{u})u_{S^2}(\text {d}\mathbf{u})\nonumber \\&\qquad \left. +\, \overline{B}_{5, L} \int \limits _{\Omega _k} \text {Z}_{G, k}^2(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})+ \overline{B}_{6, L} \int \limits _{\Omega _k} \text {Z}_{L, k}^2(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right\} . \end{aligned}$$
(101)

2.2.5 The final step

With a view to bounding the RHS of (58), we use the final results of Sects. 2.2.1–2.2.4, encapsulated in (73), (79), (89) and (101), respectively, to write

$$\begin{aligned}&\left( \,\,\int \limits _{{\mathbb {R}}^3}\big |\hat{{\mathcal {M}}}(\varvec{\xi }) - e^{-|\varvec{\xi }|^2/2}\big |^2 \text {d} \varvec{\xi }+ \int \limits _{{\mathbb {R}}^3}\big |\varDelta _{\varvec{\xi }}[\hat{{\mathcal {M}}}(\varvec{\xi }) - e^{-|\varvec{\xi }|^2/2}]\big |^2 \text {d} \varvec{\xi }\right) ^{1/2} 1\!\!1_{U^c}\nonumber \\&\quad \le (C_{*, r} + C_{*, s} + \overline{B}_4{\mathfrak {m}}_4^2 + 96\pi \overline{B}_{1, L})^{1/2} \text {W}\ \nonumber \\&\qquad +\ \sum _{k = 1}^{4} \left\{ \overline{B}_{2}^{1/2} \left( \,\,\int \limits _{\Omega _k} \text {X}^{2}_k(\mathbf{u})u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2} + \sqrt{24\pi } \overline{B}_{2, L}^{1/2} \left( \,\,\int \limits _{\Omega _k} \text {X}_{L, k}^2(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2}\right. \nonumber \\&\qquad +\, \overline{B}_{3}^{1/2} \left( \,\,\int \limits _{\Omega _k} \text {Y}^{2}_{k}(\mathbf{u})u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2} + \sqrt{24\pi } \overline{B}_{3, L}^{1/2} \left( \,\,\int \limits _{\Omega _k} \text {Y}_{L, k}^2(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2} \nonumber \\&\qquad +\, [\overline{B}_5 + 24\pi (\overline{B}_{4, L} + 81J_L)]^{1/2} \left( \,\,\int \limits _{\Omega _k} \text {Z}^{2}_k(\mathbf{u})u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2} \nonumber \\&\qquad +\, \sqrt{24\pi }\overline{B}_{5, L}^{1/2} \left( \,\,\int \limits _{\Omega _k} \text {Z}_{G, k}^2(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2} \nonumber \\&\qquad \left. +\, \sqrt{24\pi } \overline{B}_{6, L}^{1/2} \left( \,\,\int \limits _{\Omega _k} \text {Z}_{L, k}^2(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2} \right\} . \end{aligned}$$
(102)

Then, we take the expectation of both sides of (102). Regarding this computation, it is worth noting that, if \(\mu _0\) meets the additional conditions

$$\begin{aligned} \begin{array}{lll} &{} \sigma _1 = \sigma _2 = \sigma _3 = 1 &{} {} \\ &{}\displaystyle \int \limits _{{\mathbb {R}}^3}\mathbf{x}^{\varvec{\alpha }} \mu _0(\text {d}\mathbf{x}) = 0 &{} \text {for every multi-index}\ \varvec{\alpha }\ \text {with}\ |\varvec{\alpha }| = 3, \end{array} \end{aligned}$$

then \(\text {M}_{j, n}^{(2)} \equiv 1\) and \(\text {M}_{j,n}^{(3)} \equiv 0\), implying that all random variables in the RHS of (102) vanish, except for \(\text {W}\). Since \({\mathsf{E}}_t[\text {W}] = e^{\varLambda _b t}\) in view of (24)–(25), the proof of Theorem 1 would be complete in this case. We therefore carry on with the computation of the aforesaid expectations, to show that they all admit an upper bound of the form \(C e^{\varLambda _b t}\) even under the original, more general conditions.
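
As a concrete instance, a standard Gaussian initial datum satisfies both of the above conditions: its covariance matrix is the identity (so \(\sigma _1 = \sigma _2 = \sigma _3 = 1\)) and every moment of multi-index order \(|\varvec{\alpha }| = 3\) vanishes by symmetry. A quick Monte-Carlo sanity check, where the sample size, seed and tolerances are arbitrary choices:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
sample = rng.standard_normal((200_000, 3))   # draws from mu_0 = N(0, I_3)

# sigma_i = 1 for i = 1, 2, 3 (unit variances).
assert np.allclose(sample.var(axis=0), 1.0, atol=0.02)

# All moments of multi-index order |alpha| = 3 vanish (up to sampling error).
for alpha in itertools.product(range(4), repeat=3):
    if sum(alpha) == 3:
        moment = np.mean(np.prod(sample ** np.array(alpha), axis=1))
        assert abs(moment) < 0.05, (alpha, moment)
# mu_0 = N(0, I_3) meets the additional conditions, up to Monte-Carlo error.
```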

As for the random variables \(\text {X}_k\) and \(\text {X}_{L, k}\), a key role is played by the identity

$$\begin{aligned} \text {M}_{j, n}^{(2)}(\mathbf{u}) - 1 = \left( \sum _{s = 1}^{3} \sigma _{s}^{2} (u_{s}^{2} - 1/3)\right) \cdot \zeta _{j, n} \end{aligned}$$
(103)

valid for \(j = 1, \dots , n\), \(n\) in \({\mathbb {N}}\) and \(\mathbf{u}\) in \(S^2\), independently of the choice of \(\text {B}\) in (29). The \(\zeta _{j, n}\)’s are given by

$$\begin{aligned} \zeta _{j, n} := \zeta _{j, n}^{*}(\tau _n, (\phi _1, \dots , \phi _{n-1})) \end{aligned}$$
(104)

and the \(\zeta _{j, n}^{*}\)’s are defined on \({\mathbb {T}}(n) \times [0, \pi ]^{n-1}\) as follows. Put \(\zeta _{1, 1}^{*} \equiv 1\) and, for \(n \ge 2\),

$$\begin{aligned} \zeta _{j, n}^{*}({\mathfrak {t}}_n, \varvec{\varphi }) := \left\{ \begin{array}{l@{\quad }l} \zeta _{j, n_l}^{*}({\mathfrak {t}}_{n}^{l}, \varvec{\varphi }^l) \cdot (\frac{3}{2}\cos ^2\varphi _{n-1} - \frac{1}{2})&{} \text {for} \ j = 1, \dots , n_l \\ \zeta _{j - n_l, n_r}^{*}({\mathfrak {t}}_{n}^{r}, \varvec{\varphi }^r) \cdot (\frac{3}{2}\sin ^2\varphi _{n-1} - \frac{1}{2}) &{} \text {for} \ j = n_l + 1, \dots , n \end{array} \right. \end{aligned}$$
(105)

for every \(\varvec{\varphi }\) in \([0, \pi ]^{n-1}\). The reader is referred to Appendix A.12 for the proof of (103). Combination of (103) with (82) and (95) yields \(\text {X}_{k}(\mathbf{u}) = \big |\sum _{s = 1}^{3} \sigma _{s}^{2} (u_{s}^{2} - 1/3)\big | \cdot \sum _{j = 1}^{\nu } \pi _{j, \nu }^{2} |\zeta _{j, \nu }|\) and \(\text {X}_{L, k}(\mathbf{u}) = \big |\sum _{s = 1}^{3} \sigma _{s}^{2} \varDelta _{S^2}(u_{s}^{2})\big | \cdot \sum _{j = 1}^{\nu } \pi _{j, \nu }^{2} |\zeta _{j, \nu }|\) for \(k = 1, \dots , 4\). Whence,

$$\begin{aligned} \left( \,\,\int \limits _{\Omega _k} \text {X}_{k}^{2}(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2}&= \overline{\text {X}}_k \sum _{j = 1}^{\nu } \pi _{j, \nu }^{2} |\zeta _{j, \nu }| \\ \left( \,\,\int \limits _{\Omega _k} \text {X}_{L, k}^{2}(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2}&= \overline{\text {X}}_{L, k} \sum _{j = 1}^{\nu } \pi _{j, \nu }^{2} |\zeta _{j, \nu }| \end{aligned}$$

where

$$\begin{aligned} \overline{\text {X}}_k&:= \left( \,\,\int \limits _{\Omega _k}\left[ \sum _{s = 1}^{3} \sigma _{s}^{2} (u_{s}^{2} - 1/3)\right] ^2u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2}\\ \overline{\text {X}}_{L, k}&:= \left( \,\,\int \limits _{\Omega _k}\left[ \sum _{s = 1}^{3} \sigma _{s}^{2}\varDelta _{S^2}(u_{s}^{2}) \right] ^2u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2} \end{aligned}$$

are constants. At this stage, it is worth noticing that

$$\begin{aligned} {\mathsf{E}}_t\left( \sum _{j = 1}^{\nu } \pi _{j, \nu }^{2} |\zeta _{j, \nu }|\right) = e^{-(1 - f(b))t} \end{aligned}$$
(106)

holds for every \(t \ge 0\), with \(f(b) := \int _{0}^{\pi } \sin ^2\varphi \ \big | \frac{3}{2}\sin ^2\varphi - \frac{1}{2} \big | \ \beta (\text {d}\varphi )\). See Appendix A.1. Then, we combine the inequality \(\sin ^2\varphi \ \big | \frac{3}{2}\sin ^2\varphi - \frac{1}{2} \big | + \cos ^2\varphi \ \big | \frac{3}{2}\cos ^2\varphi - \frac{1}{2} \big | \ \le \sin ^4\varphi + \cos ^4\varphi \) with (2) to show that \(\varLambda _b \ge -(1 - f(b))\), i.e. the RHS in (106) approaches zero at least as fast as \(e^{\varLambda _b t}\) as \(t\) goes to infinity. Therefore, we can conclude that

$$\begin{aligned} {\mathsf{E}}_t\left[ \left( \,\,\int \limits _{\Omega _k} \text {X}_{k}^{2}(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2}\right]&= \overline{\text {X}}_k e^{-(1 - f(b))t} \end{aligned}$$
(107)
$$\begin{aligned} {\mathsf{E}}_t\left[ \left( \,\,\int \limits _{\Omega _k} \text {X}_{L, k}^{2}(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2}\right]&= \overline{\text {X}}_{L, k} e^{-(1 - f(b))t} . \end{aligned}$$
(108)
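
The recursion (105) has a transparent tree interpretation: \(\zeta _{j, n}^{*}\) is the product, along the path from the root of the tree to its \(j\)-th leaf, of the factors \(\frac{3}{2}\cos ^2\varphi - \frac{1}{2}\) or \(\frac{3}{2}\sin ^2\varphi - \frac{1}{2}\), according to whether the path turns left or right at the internal node carrying the angle \(\varphi \). A minimal sketch, under the purely illustrative assumptions that a binary tree is encoded as a nested pair (a leaf being the integer 1) and that the angle vector splits as \(\varvec{\varphi } = (\varvec{\varphi }^l, \varvec{\varphi }^r, \varphi _{n-1})\):

```python
import math

def leaves(t):
    # Number of leaves of a tree encoded as nested pairs; a leaf is 1.
    return 1 if t == 1 else leaves(t[0]) + leaves(t[1])

def zeta(t, phi):
    """List (zeta*_{1,n}, ..., zeta*_{n,n}) for the tree t, as in (105)."""
    if t == 1:
        return [1.0]                       # base case: zeta*_{1,1} = 1
    left, right = t
    n_l = leaves(left)
    # Illustrative splitting convention for the angle vector.
    phi_l, phi_r, phi_root = phi[:n_l - 1], phi[n_l - 1:-1], phi[-1]
    f_left = 1.5 * math.cos(phi_root) ** 2 - 0.5
    f_right = 1.5 * math.sin(phi_root) ** 2 - 0.5
    return ([z * f_left for z in zeta(left, phi_l)]
            + [z * f_right for z in zeta(right, phi_r)])

# Base cases: n = 1 and the unique 2-leaf tree.
assert zeta(1, []) == [1.0]
z1, z2 = zeta((1, 1), [0.7])
assert math.isclose(z1, 1.5 * math.cos(0.7) ** 2 - 0.5)
assert math.isclose(z2, 1.5 * math.sin(0.7) ** 2 - 0.5)
```

The \(\eta _{j, n}^{*}\)’s of (113) below obey the same tree recursion, with the factors replaced by \((\frac{5}{2}\cos ^2\varphi - \frac{3}{2})\cos \varphi \) and \((\frac{5}{2}\sin ^2\varphi - \frac{3}{2})\sin \varphi \).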

As far as the random variables \(\text {Y}_k\) and \(\text {Y}_{L, k}\) are concerned, we write

$$\begin{aligned} \text {M}_{j, n}^{(3)}(\mathbf{u}) = {\mathsf{E}}_t\left[ \left( \mathbf{V}_j \cdot \varvec{\psi }_{j, n}(\mathbf{u})\right) ^3 - \frac{3}{5} {\mathfrak {L}}_{3}\cdot \varvec{\psi }_{j, n}(\mathbf{u})\ \big |\ {\fancyscript{G}} \right] + \frac{3}{5} {\mathfrak {L}}_{3}\cdot {\mathsf{E}}_t\left[ \varvec{\psi }_{j, n}(\mathbf{u})\ \big |\ {\fancyscript{G}} \right] \end{aligned}$$
(109)

with \({\mathfrak {L}}_{3}:= \int _{{\mathbb {R}}^3}|\mathbf{v}|^2 \mathbf{v}\mu _0(\text {d}\mathbf{v})\). Now, the analog of (103) is given by the couple of identities

$$\begin{aligned} {\mathsf{E}}_t\left[ \left( \mathbf{V}_j \cdot \varvec{\psi }_{j, n}(\mathbf{u})\right) ^3 - \frac{3}{5} {\mathfrak {L}}_{3}\cdot \varvec{\psi }_{j, n}(\mathbf{u})\ \big |\ {\fancyscript{G}} \right]&= l_3(\mathbf{u}) \eta _{j, n} \end{aligned}$$
(110)
$$\begin{aligned} {\mathsf{E}}_t\left[ \varvec{\psi }_{j, n}(\mathbf{u})\ \big |\ {\fancyscript{G}} \right]&= \mathbf{u}\pi _{j,n} \end{aligned}$$
(111)

valid for \(j = 1, \dots , n\), \(n\) in \({\mathbb {N}}\) and \(\mathbf{u}\) in \(S^2\), independently of the choice of \(\text {B}\) in (29), and for \(l_3(\mathbf{u}) := \int _{{\mathbb {R}}^3}[(\mathbf{u}\cdot \mathbf{v})^3 - \frac{3}{5}|\mathbf{v}|^2 (\mathbf{u}\cdot \mathbf{v})]\mu _0(\text {d}\mathbf{v})\). The \(\eta _{j, n}\)’s are given by

$$\begin{aligned} \eta _{j, n} := \eta _{j, n}^{*}(\tau _n, (\phi _1, \dots , \phi _{n-1})) \end{aligned}$$
(112)

while the \(\eta _{j, n}^{*}\)’s are defined on \({\mathbb {T}}(n) \times [0, \pi ]^{n-1}\) as follows. Put \(\eta _{1, 1}^{*} \equiv 1\) and, for \(n \ge 2\),

$$\begin{aligned} \eta _{j, n}^{*}({\mathfrak {t}}_n, \varvec{\varphi })\!:=\! \left\{ \begin{array}{l@{\quad }l} \eta _{j, n_l}^{*}({\mathfrak {t}}_{n}^{l}, \varvec{\varphi }^l) \cdot (\frac{5}{2}\cos ^2\varphi _{n-1} - \frac{3}{2})\cos \varphi _{n-1} &{} \text {for} \ j = 1, \dots , n_l \\ \eta _{j - n_l, n_r}^{*}({\mathfrak {t}}_{n}^{r}, \varvec{\varphi }^r) \cdot (\frac{5}{2}\sin ^2\varphi _{n-1} - \frac{3}{2} )\sin \varphi _{n-1} &{} \text {for} \ j = n_l + 1, \dots , n \end{array} \right. \nonumber \\ \end{aligned}$$
(113)

for every \(\varvec{\varphi }\) in \([0, \pi ]^{n-1}\). The reader is referred to Appendix A.12 for the proof of (110)–(111). Combination of (109)–(111) with (83) and (96) entails \(\text {Y}_{k}(\mathbf{u}) \le |l_3(\mathbf{u})| \cdot \sum _{j = 1}^{\nu } |\pi _{j, \nu }^{3} \eta _{j, \nu }| + \frac{3}{5}|{\mathfrak {L}}_{3}\cdot \mathbf{u}| \text {W}\) and \(\text {Y}_{L, k}(\mathbf{u}) \le |\varDelta _{S^2}l_3(\mathbf{u})| \cdot \sum _{j = 1}^{\nu }|\pi _{j, \nu }^{3} \eta _{j, \nu }| + \frac{3}{5}|\varDelta _{S^2}({\mathfrak {L}}_{3}\cdot \mathbf{u})| \text {W}\) for \(k = 1, \dots , 4\). By elementary inequalities we obtain

$$\begin{aligned} \left( \,\,\int \limits _{\Omega _k} \text {Y}_{k}^{2}(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2}&\le \overline{\text {Y}}_{k}^{(1)}\sum _{j = 1}^{\nu } |\pi _{j, \nu }^{3} \eta _{j, \nu }| + \overline{\text {Y}}_{k}^{(2)} \text {W}\\ \left( \,\,\int \limits _{\Omega _k} \text {Y}_{L, k}^{2}(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2}&\le \overline{\text {Y}}_{L, k}^{(1)}\sum _{j = 1}^{\nu } |\pi _{j, \nu }^{3} \eta _{j, \nu }| + \overline{\text {Y}}_{L, k}^{(2)} \text {W}\end{aligned}$$

where

$$\begin{aligned} \overline{\text {Y}}_{k}^{(1)}&:= \left( 2\int \limits _{\Omega _k} |l_3(\mathbf{u})|^2 u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2} \\ \overline{\text {Y}}_{k}^{(2)}&:= \left( \frac{18}{25} \int \limits _{\Omega _k} |{\mathfrak {L}}_{3}\cdot \mathbf{u}|^2 u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2} \\ \overline{\text {Y}}_{L, k}^{(1)}&:= \left( 2\int \limits _{\Omega _k} |\varDelta _{S^2}l_3(\mathbf{u})|^2 u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2} \\ \overline{\text {Y}}_{L, k}^{(2)}&:= \left( \frac{18}{25}\int \limits _{\Omega _k} |\varDelta _{S^2}({\mathfrak {L}}_{3}\cdot \mathbf{u})|^2 u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2} \end{aligned}$$

are constants. At this stage, to compute the expectation in the above inequalities, it is worth highlighting that the identity

$$\begin{aligned} {\mathsf{E}}_t\left( \sum _{j = 1}^{\nu }|\pi _{j, \nu }^{3} \eta _{j, \nu }| \right) = e^{-(1 - g(b))t} \end{aligned}$$
(114)

holds for every \(t \ge 0\), with \(g(b) := \int _{0}^{\pi } \sin ^4\varphi \ \big | \frac{5}{2}\sin ^2\varphi - \frac{3}{2} \big | \ \beta (\text {d}\varphi )\). See Appendix A.1. Now, we combine the inequality \(\sin ^4\varphi \ \big | \frac{5}{2}\sin ^2\varphi - \frac{3}{2} \big | + \cos ^4\varphi \ \big | \frac{5}{2}\cos ^2\varphi - \frac{3}{2} \big | \ \le \sin ^4\varphi + \cos ^4\varphi \) with (2) to show that \(\varLambda _b \ge -(1 - g(b))\), which says that the RHS in (114) approaches zero at least as fast as \(e^{\varLambda _b t}\) as \(t\) goes to infinity. Relations (109)–(114) lead to

$$\begin{aligned} {\mathsf{E}}_t\left[ \left( \,\,\int \limits _{\Omega _k} \text {Y}_{k}^{2}(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2}\right]&\le \overline{\text {Y}}_{k}^{(1)}e^{-(1 - g(b))t} + \overline{\text {Y}}_{k}^{(2)}e^{\varLambda _b t} \end{aligned}$$
(115)
$$\begin{aligned} {\mathsf{E}}_t\left[ \left( \,\,\int \limits _{\Omega _k} \text {Y}_{L, k}^{2}(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2}\right]&\le \overline{\text {Y}}_{L, k}^{(1)}e^{-(1 - g(b))t} + \overline{\text {Y}}_{L, k}^{(2)}e^{\varLambda _b t} . \end{aligned}$$
(116)
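
Both exponential rates above rest on elementary trigonometric inequalities: the one combined with (2) to obtain (107)–(108), and the one just used for (114). Since both hold with equality at \(\varphi \in \{0, \pi /2, \pi \}\), it may be reassuring to confirm them numerically on a fine grid; a minimal sketch, where the grid resolution is an arbitrary choice:

```python
import math

for i in range(100_001):
    phi = math.pi * i / 100_000
    s2, c2 = math.sin(phi) ** 2, math.cos(phi) ** 2
    rhs = s2 ** 2 + c2 ** 2                        # sin^4 + cos^4
    # Inequality used for (106)-(108): weights sin^2, cos^2.
    lhs_f = s2 * abs(1.5 * s2 - 0.5) + c2 * abs(1.5 * c2 - 0.5)
    # Inequality used for (114): weights sin^4, cos^4.
    lhs_g = s2 ** 2 * abs(2.5 * s2 - 1.5) + c2 ** 2 * abs(2.5 * c2 - 1.5)
    assert lhs_f <= rhs + 1e-12   # slack absorbs floating-point rounding
    assert lhs_g <= rhs + 1e-12
# Both inequalities hold at every grid point.
```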

It remains only to deal with the expectations involving \(\text {Z}\), \(\text {Z}_G\) and \(\text {Z}_L\). Unfortunately, unlike the \(\text {X}\)’s and the \(\text {Y}\)’s, it is not possible to write the random variables \(\text {Z}\), \(\text {Z}_G\) and \(\text {Z}_L\) as the product of a fixed function of \(\mathbf{u}\) and some other random variable which is independent of \(\mathbf{u}\) and “contracting” in some sense. Nevertheless, such a contraction property can be established for the integrals of the \(\text {Z}\)’s over \(\Omega _k\). Accordingly, we show that the expectations of the last three random variables in (102) admit bounds of the form \(C e^{\varLambda _b t}\), with \(C\) depending only on \(\mu _0\). To prove this, we apply the Jensen inequality and exploit (48) to get

$$\begin{aligned} \left| \sum _{j = 1}^{\nu } \pi _{j, \nu }^2 \varvec{\psi }_{j, \nu ; k}^{t}(\mathbf{u}) V[\mu _0] \varvec{\psi }_{j, \nu ; k}(\mathbf{u}) - 1\right| ^2&\le \sum _{s = 1}^{3} \frac{\sigma _{s}^{2}}{3}\text {S}_{k, s}^2 \end{aligned}$$
(117)
$$\begin{aligned} \left| \left| \sum _{j = 1}^{\nu }\pi _{j, \nu }^{2}\nabla _{S^2}\left( \varvec{\psi }_{j, \nu ; k}^{t}(\mathbf{u}) V[\mu _0] \varvec{\psi }_{j, \nu ; k}(\mathbf{u})\right) \right| \right| ^2_{S^2}&\le \sum _{s = 1}^{3} \frac{\sigma _{s}^{2}}{3}\left| \left| \nabla _{S^2}\text {S}_{k, s} \right| \right| ^2_{S^2} \end{aligned}$$
(118)
$$\begin{aligned} \left| \sum _{j = 1}^{\nu }\pi _{j, \nu }^{2}\varDelta _{S^2}\left( \varvec{\psi }_{j, \nu ; k}^{t}(\mathbf{u}) V[\mu _0] \varvec{\psi }_{j, \nu ; k}(\mathbf{u})\right) \right| ^2&\le \sum _{s = 1}^{3} \frac{\sigma _{s}^{2}}{3}\left| \varDelta _{S^2} \text {S}_{k, s}\right| ^2 \end{aligned}$$
(119)

where \(\varvec{\psi }_{j, n; k}\) is the analog of (29) when \(\text {B}\) is replaced by \(\text {B}_k\), \(\psi _{j, n; k, s}\) denotes its \(s\)-th component and \(\text {S}_{k, s} := \sum _{j = 1}^{\nu } \pi _{j, \nu }^{2}\big ( 3 \psi _{j, \nu ; k, s}^{2} - 1 \big )\). Whence, by a further application of Jensen’s inequality, together with the elementary inequality \(\sqrt{x + y} \le \sqrt{x} + \sqrt{y}\),

$$\begin{aligned} \left( \,\,\int \limits _{\Omega _k} \text {Z}^{2}_k(\mathbf{u})\text {d}u_{S^2}\right) ^{1/2}&\le \frac{\sqrt{3}}{3} \sum _{s = 1}^{3} \sigma _s \left( \,\,\int \limits _{\Omega _k} \left\{ {\mathsf{E}}_t\left[ \text {S}_{k, s}^2 \ | \ {\fancyscript{G}}\right] \right\} ^2 \text {d}u_{S^2}\right) ^{1/2} \\ \left( \,\,\int \limits _{\Omega _k} \text {Z}_{G, k}^2(\mathbf{u}) \text {d}u_{S^2}\right) ^{1/2}&\le \frac{\sqrt{3}}{3} \sum _{s = 1}^{3} \sigma _s \left( \,\, \int \limits _{\Omega _k}\left\{ {\mathsf{E}}_t\left[ \Big |\Big | \!\nabla _{S^2}\text {S}_{k, s}\! \Big |\Big |_{S^2}^2\ | \ {\fancyscript{G}}\right] \right\} ^2 \text {d}u_{S^2}\right) ^{1/2}\\ \left( \,\,\int \limits _{\Omega _k} \text {Z}_{L, k}^2(\mathbf{u}) \text {d}u_{S^2} \right) ^{1/2}&\le \frac{\sqrt{3}}{3} \sum _{s = 1}^{3} \sigma _s \left( \,\,\int \limits _{\Omega _k}\left\{ {\mathsf{E}}_t\left[ \Big |\varDelta _{S^2} \text {S}_{k, s}\Big |^2 \ | \ {\fancyscript{G}}\right] \right\} ^2 \text {d}u_{S^2} \right) ^{1/2}. \end{aligned}$$

Both the square roots and the squares after the brackets constitute an obstacle to interchanging the integral with the expectation \({\mathsf{E}}_t\), and hence to applying useful properties of conditional expectation. To overcome this difficulty, we resort to the imbedding of the Sobolev space \(\text {W}^{1, 1}(\Omega _k)\) into \(\text {L}^2(\Omega _k)\). See, e.g., Chapter 2 of [2]. Taking the same constants \(A_1(0)\) and \(K(2, 1)\) as in Theorem 2.28 therein, we write

$$\begin{aligned}&{\mathsf{E}}_t\left[ \left( \,\,\int \limits _{\Omega _k}\!\left\{ {\mathsf{E}}_t\left[ \left| {\fancyscript{D}}\text {S}_{k, s}\right| ^2 \ | \ {\fancyscript{G}}\right] \right\} ^2 u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2}\right] \!\le \! A_1(0) {\mathsf{E}}_t\left[ \,\,\int \limits _{\Omega _k}\! \left| {\fancyscript{D}}\text {S}_{k, s}\right| ^2u_{S^2}(\text {d}\mathbf{u})\right] \nonumber \\&\quad + K(2, 1){\mathsf{E}}_t\left[ \,\,\int \limits _{\Omega _k} \Big |\Big | \!\nabla _{S^2}{\mathsf{E}}_t\left[ \left| {\fancyscript{D}}\text {S}_{k, s}\right| ^2 \ | \ {\fancyscript{G}}\right] \! \Big |\Big |_{S^2}u_{S^2}(\text {d}\mathbf{u})\right] \end{aligned}$$
(120)

where \({\fancyscript{D}}\) can be \(\text {Id}\), \(\nabla _{S^2}\) or \(\varDelta _{S^2}\), and \(\big |{\fancyscript{D}}\text {S}_{k, s}\big |\) is to be interpreted in accordance with the meaning of \({\fancyscript{D}}\). To work out the last term in the above inequality, we use (5.1.25) in [60] to deduce that

$$\begin{aligned} \Big |\Big | \!\nabla _{S^2}{\mathsf{E}}_t\Big [\big |{\fancyscript{D}}\text {S}_{k, s}\big |^2 \ | \ {\fancyscript{G}}\Big ]\! \Big |\Big |_{S^2}\ \le {\mathsf{E}}_t\Big [\Big |\Big | \!\nabla _{S^2}\big (\big |{\fancyscript{D}}\text {S}_{k, s}\big |^2\big )\! \Big |\Big |_{S^2}\ | \ {\fancyscript{G}}\Big ] \end{aligned}$$
(121)

holds true \({\mathsf{P}}_t\)-almost surely. Moreover, when \({\fancyscript{D}}\) is \(\text {Id}\) or \(\varDelta _{S^2}\), the Leibniz rule for the gradient, namely \(\nabla _{S^2}(f^2) = 2f\nabla _{S^2}f\), combined with the elementary bound \(2ab \le a^2 + b^2\), entails

$$\begin{aligned} \Big |\Big | \!\nabla _{S^2}\big (\big |{\fancyscript{D}}\text {S}_{k, s}\big |^2\big )\! \Big |\Big |_{S^2}\ \le \ \big |{\fancyscript{D}}\text {S}_{k, s}\big |^2 + \Big |\Big | \!\nabla _{S^2}\big ({\fancyscript{D}}\text {S}_{k, s}\big )\! \Big |\Big |_{S^2}^2. \end{aligned}$$
(122)

When \({\fancyscript{D}}\) is \(\nabla _{S^2}\), the definition of the Hessian as symmetric bilinear form leads to

$$\begin{aligned} \langle \nabla _{S^2}(\mid \mid \!\nabla _{S^2}\text {S}_{k, s}\! \mid \mid _{S^2}^2), V\rangle&= 2 \langle D_V \nabla _{S^2}\text {S}_{k, s}, \nabla _{S^2}\text {S}_{k, s}\rangle \\&= 2 \text {Hess}_{S^2}[\text {S}_{k, s}](\nabla _{S^2}\text {S}_{k, s}, V) \end{aligned}$$

for every vector field \(V\), \(D\) standing for the Levi-Civita connection. See Exercise 11 in Chapter 6 of [21]. Whence,

$$\begin{aligned} \mid \mid \!\nabla _{S^2}(\mid \mid \!\nabla _{S^2}\text {S}_{k, s}\! \mid \mid _{S^2}^2) \! \mid \mid _{S^2}&\le 2\mid \mid \!\text {Hess}_{S^2}[\text {S}_{k, s}]\! \mid \mid _{*} \mid \mid \!\nabla _{S^2}\text {S}_{k, s}\! \mid \mid _{S^2} \nonumber \\&\le \mid \mid \!\text {Hess}_{S^2}[\text {S}_{k, s}]\! \mid \mid _{*}^2 + \mid \mid \!\nabla _{S^2}\text {S}_{k, s}\! \mid \mid _{S^2}^2 \end{aligned}$$
(123)

where \(\mid \mid \!\cdot \! \mid \mid _{*}\) denotes the \(\text {L}^2\)-norm of the Hessian given by \(\mid \mid \!\text {Hess}_{S^2}[\text {S}_{k, s}]\! \mid \mid _{*}^2 \,\,:= \sum _{ij} [\text {Hess}_{S^2}[\text {S}_{k, s}](V_i, V_j)]^2\) for some orthonormal basis \(\{V_1, V_2\}\) of vector fields. At this stage, it is worth emphasizing that, in view of (121)–(123), the latter summand in (120) can be bounded by a sum of terms sharing the same structure as the former summand. Then, to provide an effective bound for the RHS of (120), it is enough to prove that

$$\begin{aligned} {\mathsf{E}}_t\left[ \,\,\int \limits _{\Omega _k} \left| {\fancyscript{D}}^{'} \text {S}_{k, s}\right| ^2u_{S^2}(\text {d}\mathbf{u})\right] \le c_k({\fancyscript{D}}^{'}) e^{\varLambda _b t} \end{aligned}$$
(124)

holds for some suitable constant \(c_k({\fancyscript{D}}^{'})\), \({\fancyscript{D}}^{'}\) being one of the following operators: \(\text {Id}\), \(\nabla _{S^2}\), \(\varDelta _{S^2}\), \(\nabla _{S^2}\varDelta _{S^2}\), \(\text {Hess}_{S^2}\). For the proof of (124), cf. Appendix A.13. Now, we are in a position to write explicit bounds for the last three terms in (102), which read

$$\begin{aligned} {\mathsf{E}}_t\left[ \left( \,\,\int \limits _{\Omega _k} \text {Z}_{k}^{2}(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2}\right]&\le \overline{Z}_k e^{\varLambda _b t} \end{aligned}$$
(125)
$$\begin{aligned} {\mathsf{E}}_t\left[ \left( \,\,\int \limits _{\Omega _k} \text {Z}_{G, k}^{2}(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2}\right]&\le \overline{Z}_{G, k} e^{\varLambda _b t} \end{aligned}$$
(126)
$$\begin{aligned} {\mathsf{E}}_t\left[ \left( \,\,\int \limits _{\Omega _k} \text {Z}_{L, k}^2(\mathbf{u}) u_{S^2}(\text {d}\mathbf{u})\right) ^{1/2}\right]&\le \overline{Z}_{L, k} e^{\varLambda _b t} \end{aligned}$$
(127)

with

$$\begin{aligned} \overline{Z}_k&= \frac{\sqrt{3}}{3} \left( \,\sum _{s = 1}^{3} \sigma _s\right) [A_1(0)c_k(\text {Id}) + K(2,1) (c_k(\text {Id}) + c_k(\nabla _{S^2}))] \\ \overline{Z}_{G, k}&= \frac{\sqrt{3}}{3} \left( \,\sum _{s = 1}^{3} \sigma _s\right) [A_1(0)c_k(\nabla _{S^2}) + K(2,1) (c_k(\text {Hess}_{S^2}) + c_k(\nabla _{S^2}))] \\ \overline{Z}_{L, k}&= \frac{\sqrt{3}}{3} \left( \,\sum _{s = 1}^{3} \sigma _s\right) [A_1(0)c_k(\varDelta _{S^2}) + K(2,1) (c_k(\varDelta _{S^2}) + c_k(\nabla _{S^2}\varDelta _{S^2}))]. \end{aligned}$$

To conclude, we gather (24)–(25), (107)–(108), (115)–(116), (125)–(127) and we resort to (58) and (102) to obtain

$$\begin{aligned} {\mathsf{E}}_t[\text {d}_{\text {TV}}({\mathcal {M}}, \gamma ); U^c] \le 2^{-5/4} \pi ^{-1/2} C(U^c) e^{\varLambda _b t} \end{aligned}$$

with

$$\begin{aligned} C(U^c)&{:=} (C_{*, r} + C_{*, s} + \overline{B}_4{\mathfrak {m}}_4^2 + 96\pi \overline{B}_{1, L})^{1/2} \\&+ \sum _{k=1}^{4} \Bigg \{ \overline{B}_{2}^{1/2} \overline{\text {X}}_k + \sqrt{24\pi } \overline{B}_{2, L}^{1/2}\overline{\text {X}}_{L, k} + \overline{B}_{3}^{1/2}(\overline{\text {Y}}_k^{(1)} + \overline{\text {Y}}_k^{(2)}) \\&+ \sqrt{24\pi }\,\, \overline{B}_{3, L}^{1/2}(\overline{\text {Y}}_{L, k}^{(1)} + \overline{\text {Y}}_{L, k}^{(2)}) + [\overline{B}_5 + 24\pi (\overline{B}_{4, L} + 81J_L)]^{1/2} \overline{Z}_k \\&+ \sqrt{24\pi }\,\,\overline{B}_{5, L}^{1/2} \overline{Z}_{G, k} + \sqrt{24\pi }\,\, \overline{B}_{6, L}^{1/2}\overline{Z}_{L, k} \Bigg \}. \end{aligned}$$

Finally, we recall (52) and combine the last inequality with (53).

2.3 Proof of Theorem 3

Without any loss of generality, we prove the sufficiency of (7) for the weak convergence to the Maxwellian distribution, under the extra conditions (47)–(48) and

$$\begin{aligned} \max _{i = 1, 2, 3} |\sigma _{i}^{2} - 1| \le \frac{\sqrt{42 + \delta ^2}}{21 + \delta ^2} \delta \end{aligned}$$
(128)

with \(\delta := -\varLambda _b/16\). This last assumption is not restrictive since the Cauchy problem associated with (1) is autonomous and \(\max _{i = 1, 2, 3} \big |\int _{{\mathbb {R}}^3}v_{i}^{2}\mu (\text {d}\mathbf{v}, t) - 1\big |\) approaches zero as \(t\) goes to infinity. See [29, 31, 46]. The argument proceeds, as in Section 9.1 of [24], on the basis of the Lévy continuity theorem. Therefore, fix \(\varvec{\xi }\ne \mathbf{0}\) and write

$$\begin{aligned} \left| \hat{\mu }(\varvec{\xi }, t) - \hat{\gamma }(\varvec{\xi })\right| \ \le \ {\mathsf{E}}_t\left| \hat{{\mathcal {N}}}(\rho ; \mathbf{u}) - e^{-T^2 \rho ^2/2}\right| \ +\ \left| {\mathsf{E}}_t\left[ e^{-T^2 \rho ^2/2}\right] - e^{-\rho ^2/2}\right| \qquad \quad \end{aligned}$$
(129)

where \(\rho = |\varvec{\xi }|\), \(\mathbf{u}= \varvec{\xi }/|\varvec{\xi }|\) and \(T^2 := \sum _{j = 1}^{\nu } \pi _{j, \nu }^2 \varvec{\psi }_{j, \nu }^{t}(\mathbf{u}) V[\mu _0] \varvec{\psi }_{j, \nu }(\mathbf{u})\). As to the first summand in (129), use (6)–(7) in Section 9.1 of [24] to obtain

$$\begin{aligned} \left| \hat{{\mathcal {N}}}(\rho ; \mathbf{u}) - e^{-T^2 \rho ^2/2}\right|&\le \rho ^2 \sum _{j = 1}^{\nu } \pi _{j, \nu }^2 \int \limits _{A_j(\varepsilon )} (\varvec{\psi }_{j, \nu }(\mathbf{u}) \cdot \mathbf{v})^2 \mu _0(\text {d}\mathbf{v}) + \varepsilon |T|^3 \rho ^3 \nonumber \\&+ \frac{1}{8} T^4 \rho ^4 \frac{\max _{1 \le j \le \nu } \pi _{j, \nu }^2 \varvec{\psi }_{j, \nu }^{t}(\mathbf{u}) V[\mu _0] \varvec{\psi }_{j, \nu }(\mathbf{u})}{T^2} \qquad \qquad \end{aligned}$$
(130)

with \(\varepsilon > 0\) and \(A_j(\varepsilon ) := \{\mathbf{v}\in {\mathbb {R}}^3\ \big | \ |\pi _{j, \nu } (\varvec{\psi }_{j, \nu }(\mathbf{u}) \cdot \mathbf{v})| \ge \varepsilon |T|\}\) for \(j = 1, \dots , \nu \). Then, one has \(\sigma _{*}^2 := \min \{\sigma _{1}^2, \sigma _{2}^2, \sigma _{3}^2\} \le T^2 \le 3\) and

$$\begin{aligned} \frac{1}{8} T^2 \rho ^4 \max _{1 \le j \le \nu } \pi _{j, \nu }^2 \varvec{\psi }_{j, \nu }^{t}(\mathbf{u}) V[\mu _0] \varvec{\psi }_{j, \nu }(\mathbf{u}) \le \frac{9}{8} \rho ^4 \pi _{o}^2 \end{aligned}$$
(131)

with \(\pi _o := \max _{1 \le j \le \nu } |\pi _{j, \nu }|\). Put \(M(y) := \int _{\{|\mathbf{v}| \ge 1/y\}} |\mathbf{v}|^2 \mu _0(\text {d}\mathbf{v})\) for \(y > 0\) and note that \(M\) is a monotonically increasing bounded function satisfying \(\lim _{y \downarrow 0} M(y) = 0\). Moreover, from

$$\begin{aligned} A_j(\varepsilon ) \subset \left\{ \mathbf{v}\in {\mathbb {R}}^3\ \big | \ \pi _o |\varvec{\psi }_{j, \nu }(\mathbf{u}) \cdot \mathbf{v}| \ge \varepsilon \sigma _{*}\right\} \subset \left\{ \mathbf{v}\in {\mathbb {R}}^3\ \big | \ \pi _o \cdot |\mathbf{v}| \ge \varepsilon \sigma _{*}\right\} \end{aligned}$$

one can conclude that

$$\begin{aligned} \sum _{j = 1}^{\nu } \pi _{j, \nu }^2 \int \limits _{A_j(\varepsilon )} (\varvec{\psi }_{j, \nu }(\mathbf{u}) \cdot \mathbf{v})^2 \mu _0(\text {d}\mathbf{v}) \le M\left( \frac{\pi _o}{\varepsilon \sigma _{*}}\right) \end{aligned}$$
(132)

holds true for every strictly positive \(\varepsilon \). At this stage, take \(\varepsilon = \sqrt{\pi _o}\) and combine (130)–(132) to get

$$\begin{aligned} \left| \hat{{\mathcal {N}}}(\rho ; \mathbf{u}) - e^{-T^2 \rho ^2/2}\right| \le M\left( \frac{\sqrt{\pi _o}}{\sigma _{*}}\right) \rho ^2 + \sqrt{27 \pi _o} \rho ^3 + \frac{9}{8} \pi _{o}^2 \rho ^4. \end{aligned}$$
(133)

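As a side check, the behavior of the truncation function \(M\) entering (132) and (133) is easy to illustrate numerically. The sketch below is purely illustrative: it replaces \(\mu _0\) with a standard Gaussian on \({\mathbb {R}}^3\) (an assumption, not the datum of the proof) and estimates \(M\) by Monte Carlo, confirming that \(M\) is nondecreasing and vanishes as \(y \downarrow 0\).

```python
import random, math

random.seed(0)

# Monte Carlo stand-in for mu_0: a standard Gaussian on R^3 (an illustrative assumption)
sample = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(20000)]

def M(y):
    """Truncated second moment M(y) = int_{|v| >= 1/y} |v|^2 mu_0(dv), estimated empirically."""
    thr2 = (1.0 / y) ** 2
    return sum(s for s in (sum(c * c for c in v) for v in sample) if s >= thr2) / len(sample)

# M is nondecreasing in y, and M(y) -> 0 as y -> 0 (the integration region empties out);
# as y -> infinity, M(y) approaches the full second moment E|V|^2 = 3.
print(M(0.2), M(1.0), M(5.0))
```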
To complete the analysis of the first summand in the RHS of (129), one shows that the expectation of the RHS of (133) approaches zero as \(t\) goes to infinity, for every \(\rho \) in \([0, +\infty )\). Indeed, for any monotonically increasing bounded function \(g : (0, \infty ) \rightarrow (0, \infty )\) satisfying \(\lim _{x \downarrow 0} g(x) = 0\), one has

$$\begin{aligned} {\mathsf{E}}_t[g(\pi _{o})]&= {\mathsf{E}}_t[g(\pi _{o})1\!\!1\{\pi _{o} \le e^{-z t}\}] + {\mathsf{E}}_t[g(\pi _{o})1\!\!1\{\pi _{o} > e^{-z t}\}] \\&\le g(e^{-z t}) + \sup _{x \in (0, \infty )} g(x) \cdot {\mathsf{E}}_t[\pi _{o}^4] e^{4 z t} \end{aligned}$$

for every \(z\) in \((0, \infty )\). By virtue of (24)–(25), \({\mathsf{E}}_t[\pi _{o}^4] \le e^{\varLambda _b t}\) and, after choosing \(z = -\varLambda _b/8\), one obtains \(\lim _{t \rightarrow +\infty } {\mathsf{E}}_t[g(\pi _{o})] = 0\). This argument, applied with \(g(x) = M\left( \frac{\sqrt{x}}{\sigma _{*}}\right) \rho ^2 + \sqrt{27 x} \rho ^3 + \frac{9}{8} x^2 \rho ^4\), leads to the desired result. As far as the latter summand in (129) is concerned, a plain application of (17) implies that \({\mathsf{E}}_t[e^{- T^2 \rho ^2/2}]\) can be thought of as the Fourier transform of the solution of (1) when the initial datum coincides with \(\prod _{i = 1}^{3} \frac{1}{\sigma _i \sqrt{2\pi }} \exp \{-\frac{v_{i}^{2}}{2 \sigma _{i}^{2}}\} \text {d}v_i\), where the \(\sigma _i\)’s have been fixed initially. Now, in view of (128), this initial datum belongs to a convenient neighborhood of the equilibrium \(\gamma \)—according to Theorem 1.1 in [29]—so that

$$\begin{aligned} \sup _{\rho \in {\mathbb {R}}} \big |{\mathsf{E}}_t\exp \{-T^2 \rho ^2/2\} - \exp \{-\rho ^2/2\}\big | \le C_{*} e^{\frac{1}{2}\varLambda _b t} \end{aligned}$$

holds true for every \(t \ge 0\) with the same \(C_{*}\) as in the above-quoted theorem.

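The estimate used above for \({\mathsf{E}}_t[g(\pi _o)]\), namely truncation at the threshold \(e^{-zt}\) followed by a Markov bound on the tail term through the fourth moment, holds pathwise for any sample. A minimal numerical sketch, with an illustrative choice of \(g\) and of the law of the weight (both are assumptions for the sketch, not the objects of the proof):

```python
import random

random.seed(1)

# illustrative stand-in for the random weight pi_o, taking values in (0, 1]
pis = [random.random() ** 2 for _ in range(10000)]

def g(x):
    """A bounded, nondecreasing function with g(0+) = 0 (any such g works in the estimate)."""
    return min(2.0 * x, 1.0)

sup_g = 1.0
mean_g = sum(g(p) for p in pis) / len(pis)   # empirical E[g(pi)]
m4 = sum(p ** 4 for p in pis) / len(pis)     # empirical E[pi^4]

def upper_bound(delta):
    # split at delta:  E[g(pi) 1{pi <= delta}] <= g(delta)      (g nondecreasing)
    #                  E[g(pi) 1{pi > delta}]  <= sup g * E[pi^4] / delta^4   (Markov)
    return g(delta) + sup_g * m4 / delta ** 4

print(mean_g, upper_bound(0.5))
```

In the proof, \(\delta = e^{-zt}\) with \(z = -\varLambda _b/8\), so that both terms of the bound vanish as \(t \rightarrow +\infty \).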
As to the necessity of (7), suppose that \(\mu (\cdot , t)\) converges weakly to some limit as \(t\) goes to infinity. Following a technique developed in [35], the argument starts with the introduction of the random vector

$$\begin{aligned} W = \big (\nu , \{\tau _n\}_{n \ge 1}, \{\phi _n\}_{n \ge 1}, \{\vartheta _n\}_{n \ge 1}, \varvec{\lambda }, \varLambda , \mathbf{U}\big ) \end{aligned}$$

defined on \((\Omega , {\fancyscript{F}})\). To explain the three right-most symbols above, one fixes an arbitrary point \(\mathbf{u}_0\) in \(S^2\) and defines:

  1. (i) \(\varvec{\lambda } := \{\lambda _1(\cdot ), \dots , \lambda _{\nu }(\cdot ), \delta _0(\cdot ), \delta _0(\cdot ), \dots \}\) to be the sequence of random p.d.’s on \(({\mathbb {R}}, {\fancyscript{B}}({\mathbb {R}}))\) such that \(\hat{\lambda }_j(\xi ) := \hat{\mu }_0(\xi \pi _{j, \nu } \varvec{\psi }_{j, \nu }(\mathbf{u}_0))\), for \(j = 1, \dots , \nu \) and \(\xi \) in \({\mathbb {R}}\).

  2. (ii) \(\varLambda \) to be the random p.d. on \(({\mathbb {R}}, {\fancyscript{B}}({\mathbb {R}}))\) obtained as the convolution of all elements of \(\varvec{\lambda }\), i.e. \(\varLambda = \lambda _1 *\dots *\lambda _{\nu }\).

  3. (iii) \(\mathbf{U} := \{U_1, U_2, \dots \}\) to be the sequence of random numbers defined by \(U_k := \max _{1 \le j \le \nu } \lambda _j\left( \left[ -\frac{1}{k}, \frac{1}{k}\right] ^c\right) \) for every \(k\) in \({\mathbb {N}}\).

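The random numbers \(U_k\) in (iii) admit a direct Monte Carlo illustration of the negligibility phenomenon: when the largest weight is small, every \(\lambda _j\) concentrates near zero and \(U_k\) is small. The weights, the directions and the Gaussian stand-in for \(\mu _0\) below are hypothetical choices made only for the sketch.

```python
import random, math

random.seed(3)

# Monte Carlo stand-in for mu_0: a standard Gaussian on R^3 (an illustrative assumption)
vs = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(20000)]

def unit(v):
    """Normalize a vector to the unit sphere S^2."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

# hypothetical directions psi_j on S^2, fixed once for the sketch
dirs = [unit([random.gauss(0.0, 1.0) for _ in range(3)]) for _ in range(8)]

def U(k, weights):
    """U_k = max_j lambda_j([-1/k, 1/k]^c), lambda_j being the law of pi_j (psi_j . V), V ~ mu_0."""
    return max(
        sum(1 for v in vs
            if abs(pi * sum(p * c for p, c in zip(psi, v))) > 1.0 / k) / len(vs)
        for pi, psi in zip(weights, dirs))

balanced = [1.0 / math.sqrt(8)] * 8   # largest weight about 0.35
tiny = [0.05] * 8                     # largest weight 0.05: uniformly negligible summands
print(U(2, balanced), U(2, tiny))
```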
To grasp the usefulness of \(W\), one can note that its components are the essential ingredients of the central limit problem for independent uniformly asymptotically negligible summands. See Sections 16.6-9 of [36]. Apropos of the negligibility condition, it is easy to prove that \(\lim _{t \rightarrow +\infty } {\mathsf{P}}_t[U_k > \alpha ] = 0\) holds for every \(k\) in \({\mathbb {N}}\) and for every \(\alpha \) in \((0, +\infty )\). In fact, the inclusion \(\{\mathbf{v}\in {\mathbb {R}}^3\ \big | \ |\pi _{j, \nu } \varvec{\psi }_{j, \nu } \cdot \mathbf{v}| \ge 1/k\} \subset \{\mathbf{v}\in {\mathbb {R}}^3\ \big | \ |\pi _{j, \nu } \mathbf{v}| \ge 1/k\}\) entails

$$\begin{aligned} \{U_k > \alpha \} \subset \left\{ \left[ \max _{1 \le j \le \nu } \mu _0\{\mathbf{v}\in {\mathbb {R}}^3\ \big | \ |\pi _{j, \nu } \mathbf{v}| \ge 1/k\}\right] \ge \alpha \right\} . \end{aligned}$$

To conclude, apply the argument used to prove Lemma 2 in [38]. Now, think of the range of \(W\) as a subset of

$$\begin{aligned} {\mathbb {S}} := \overline{\mathbb {N}} \times \overline{\mathbb {T}} \times [0, \pi ]^{\infty } \times [0, 2\pi ]^{\infty } \times ({\mathcal {P}}(\overline{{\mathbb {R}}}))^{\infty } \times {\mathcal {P}}(\overline{{\mathbb {R}}}) \times [0, 1]^{\infty } \end{aligned}$$

where: \(\overline{{\mathbb {N}}} := \{1, 2, \dots , +\infty \}\) and \(\overline{{\mathbb {T}}}\) are the one-point compactifications of \({\mathbb {N}}\) and \({\mathbb {T}}\), respectively; \(\overline{\mathbb {R}} := [-\infty , +\infty ]\); \({\mathcal {P}}(X)\) is the space of all p.d.’s on \(X\). Here, \({\mathcal {P}}(\overline{\mathbb {R}})\) is metrized, consistently with the topology of weak convergence, in such a way that it turns out to be a separable, compact and complete metric space. Cf. Section 6.II of [57]. Then, \({\mathbb {S}}\) is a separable, compact and complete metric space w.r.t. the product topology and so the family of probability distributions \(\{{\mathsf{P}}_t\circ W^{-1}\}_{t \ge 0}\) is tight. This implies that any sequence \(\{\mathsf{P}_{t_m} \circ W^{-1}\}_{m \ge 1}\), when \(t_m\) strictly increases to infinity, contains a subsequence \(\{\mathsf{Q}_l\}_{l \ge 1}\), with \(\mathsf{Q}_l := \mathsf{P}_{t_{m_l}} \circ W^{-1}\), which converges weakly to a p.d. \(\mathsf{Q}\). It is worth noting that, thanks to the weak convergence of \(\mu (\cdot , t)\), \(\mathsf{Q}\) is supported by

$$\begin{aligned} \{+\infty \} \times \overline{{\mathbb {T}}} \times [0, \pi ]^{\infty } \times [0, 2\pi ]^{\infty } \times \{\delta _0\}^{\infty } \times {\mathcal {P}}({\mathbb {R}}) \times \{0\}^{\infty }. \end{aligned}$$

This claim can be verified by recalling Lemma 3 in [38]. Since \({\mathbb {S}}\) is Polish, one can now invoke the Skorokhod representation theorem (see Theorem 4.30 in [48]). Therefore, there are a probability space \((\tilde{\Omega }, \tilde{\fancyscript{F}}, \tilde{{\mathsf{P}}})\) and \({\mathbb {S}}\)-valued random elements on it, say \(\tilde{W}_l = \big (\tilde{\nu }_l, \{\tilde{\tau }_{n, l}\}_{n \ge 1}, \{\tilde{\phi }_{n, l}\}_{n \ge 1}, \{\tilde{\vartheta }_{n, l}\}_{n \ge 1}, \tilde{\varvec{\lambda }}_l, \tilde{\varLambda }_l, \tilde{\mathbf{U}}_l\big )\) and \(\tilde{W}_{\infty }\), such that \(\tilde{W}_l\) has p.d. \(\mathsf{Q}_l\) for every \(l\) in \({\mathbb {N}}\) and \(\tilde{W}_{\infty }\) has p.d. \(\mathsf{Q}\). Moreover, for every \(\tilde{\omega }\) in \(\tilde{\Omega }\), one has \(\tilde{W}_l(\tilde{\omega }) \rightarrow \tilde{W}_{\infty }(\tilde{\omega })\) (in the metric of \({\mathbb {S}}\)) as \(l\) goes to infinity, which entails

$$\begin{aligned} \begin{array}{lll} \tilde{\nu }_l \rightarrow +\infty , &{}\quad \tilde{\mathbf{U}}_l \rightarrow \{0, 0, \dots \} \\ \tilde{\varvec{\lambda }}_l \Rightarrow \{\delta _0, \delta _0, \dots \}, &{}\quad \tilde{\varLambda }_l \Rightarrow \tilde{\varLambda }_{\infty } \end{array} \end{aligned}$$
(134)

\(\tilde{\varLambda }_{\infty }\) being an element of \({\mathcal {P}}({\mathbb {R}})\). The distributional properties of \(\tilde{W}_l\) imply that \(\tilde{\varLambda }_l\) is the convolution of the elements of \(\tilde{\varvec{\lambda }}_l\), and that \(\tilde{U}_{k, l}\) coincides with \(\max _{1 \le j \le \tilde{\nu }_l} \tilde{\lambda }_{j, l}\left( \left[ -\frac{1}{k}, \frac{1}{k}\right] ^c\right) \) for every \(k\) in \({\mathbb {N}}\), \(\tilde{{\mathsf{P}}}\)-almost surely. For convenience, denote by \(q^{(s)}\) the symmetrized form of the p.d. \(q\), i.e. \(\widehat{q^{(s)}}(\cdot ) := |\hat{q}(\cdot )|^2\). Now, (134) entails \(\tilde{\varLambda }_{l}^{(s)} \Rightarrow \tilde{\varLambda }_{\infty }^{(s)}\) for every \(\tilde{\omega }\) in \(\tilde{\Omega }\) and the combination of this fact with Theorem 24 in Chapter 16 of [36] yields

$$\begin{aligned} +\!\infty > \sigma ^2(\tilde{\omega }) := \lim \limits _{\varepsilon \downarrow 0} \mathop {\overline{\underline{\lim }}}\limits _{l \rightarrow \infty } \sum _{j = 1}^{\tilde{\nu }_l(\tilde{\omega })} \,\,\int \limits _{[-\varepsilon , \varepsilon ]} x^2 \tilde{\lambda }_{j, l}^{(s)}(\text {d}x; \tilde{\omega }) \end{aligned}$$
(135)

with the exception of a set of points \(\tilde{\omega }\) of \(\tilde{{\mathsf{P}}}\)-probability 0. The final argument is split into the following steps. First,

$$\begin{aligned} \sum _{j = 1}^{\tilde{\nu }_l} \int \limits _{[-\varepsilon , \varepsilon ]} x^2 \tilde{\lambda }_{j, l}^{(s)}(\text {d}x)&= \sum _{j = 1}^{\tilde{\nu }_l} \tilde{\pi }_{j, \tilde{\nu }_l}^{2} \int \limits _{{\mathbb {R}}^3}(\tilde{\varvec{\psi }}_{j, \tilde{\nu }_l} \cdot \mathbf{v})^2 1\!\!1\{|\tilde{\pi }_{j, \tilde{\nu }_l} \tilde{\varvec{\psi }}_{j, \tilde{\nu }_l} \cdot \mathbf{v}| \le \varepsilon \} \mu _{0}^{(s)}(\text {d}\mathbf{v}) \nonumber \\&\ge \sum _{j = 1}^{\tilde{\nu }_l} \tilde{\pi }_{j, \tilde{\nu }_l}^{2} \sum _{i = 1}^{3} \tilde{\psi }_{j, \tilde{\nu }_l; i}^{2} \int \limits _{\{\tilde{\pi }_{l, o} |\mathbf{v}| \le \varepsilon \}} v_{i}^{2} \mu _{0}^{(s)}(\text {d}\mathbf{v}) \end{aligned}$$
(136)

where the \(\tilde{\pi }\)’s and \(\tilde{\varvec{\psi }}\)’s denote the counterparts, in the Skorokhod representation, of the \(\pi \)’s and \(\varvec{\psi }(\mathbf{u}_0)\)’s, \(\tilde{\pi }_{l, o} := \max _{1 \le j \le \tilde{\nu }_l} |\tilde{\pi }_{j, \tilde{\nu }_l}|\) and the inequality is a consequence of the inclusion \(\{\mathbf{v}\in {\mathbb {R}}^3\ \big | \ |\tilde{\pi }_{j, \tilde{\nu }_l} \tilde{\varvec{\psi }}_{j, \tilde{\nu }_l} \cdot \mathbf{v}| \le \varepsilon \} \supset \{\mathbf{v}\in {\mathbb {R}}^3\ \big | \ \tilde{\pi }_{l, o} |\mathbf{v}| \le \varepsilon \} \). Second, define \(d = d(\tilde{\omega }; j, l)\) to be an element of \(\{1, 2, 3\}\) for which \(\tilde{\psi }_{j, \tilde{\nu }_l; d}^2 = \max _{1 \le i \le 3} \tilde{\psi }_{j, \tilde{\nu }_l; i}^2\). Note that \(\tilde{\psi }_{j, \tilde{\nu }_l; d}^2\) must be at least \(1/3\) since \(\tilde{\varvec{\psi }}_{j, \tilde{\nu }_l}\) belongs to \(S^2\), for every \(\tilde{\omega }\) in \(\tilde{\Omega }\), \(l\) in \({\mathbb {N}}\) and \(j = 1, \dots , \tilde{\nu }_l\). Then,

$$\begin{aligned} \sum _{j = 1}^{\tilde{\nu }_l} \tilde{\pi }_{j, \tilde{\nu }_l}^{2} \sum _{i = 1}^{3} \tilde{\psi }_{j, \tilde{\nu }_l; i}^2 \int \limits _{\{\tilde{\pi }_{l, o} |\mathbf{v}| \le \varepsilon \}} v_{i}^{2} \mu _{0}^{(s)}(\text {d}\mathbf{v})&\ge \sum _{j = 1}^{\tilde{\nu }_l} \tilde{\pi }_{j, \tilde{\nu }_l}^{2} \tilde{\psi }_{j, \tilde{\nu }_l; d}^2 \int \limits _{\{\tilde{\pi }_{l, o} |\mathbf{v}| \le \varepsilon \}} v_{d}^{2} \mu _{0}^{(s)}(\text {d}\mathbf{v})\\&\ge \frac{1}{3} \sum _{j = 1}^{\tilde{\nu }_l} \tilde{\pi }_{j, \tilde{\nu }_l}^{2} \int \limits _{\{\tilde{\pi }_{l, o} |\mathbf{v}| \le \varepsilon \}} v_{d}^{2} \mu _{0}^{(s)}(\text {d}\mathbf{v}) \\&= \frac{1}{3} \sum _{h = 1}^{3} \tilde{s}_{h, l} \int \limits _{\{\tilde{\pi }_{l, o} |\mathbf{v}| \le \varepsilon \}} v_{h}^{2} \mu _{0}^{(s)}(\text {d}\mathbf{v}) \end{aligned}$$

where \(\tilde{s}_{h, l}\) denotes the sum of those \(\tilde{\pi }_{j, \tilde{\nu }_l}^{2}\) for which \(d(\tilde{\omega }; j, l) = h\). At this stage, observe that \(\tilde{\pi }_{l, o}\) goes to zero with probability one as \(l\) goes to infinity, in view of Lemma 1 in [38]. Since \(\sum _{h = 1}^{3} \tilde{s}_{h, l} = 1\) with probability one, there are some \(\tilde{\omega }\) and \(h\), say \(\tilde{\omega }_{*}\) and \(h_{*}\), such that \(\tilde{\pi }_{l, o}(\tilde{\omega }_{*}) \rightarrow 0\) and \(\overline{\lim }_l \tilde{s}_{h_{*}, l}(\tilde{\omega }_{*})\) is strictly positive. Then,

$$\begin{aligned} \sigma ^2(\tilde{\omega }_{*})&\ge \lim _{\varepsilon \downarrow 0} \mathop {\overline{\lim }}\limits _{l \rightarrow \infty } \frac{1}{3} \sum _{h = 1}^{3} \tilde{s}_{h, l}(\tilde{\omega }_{*}) \int \limits _{\{\tilde{\pi }_{l, o}(\tilde{\omega }_{*}) |\mathbf{v}| \le \varepsilon \}} v_{h}^{2} \mu _{0}^{(s)}(\text {d}\mathbf{v}) \\&\ge \frac{1}{3} \mathop {\overline{\lim }}\limits _{r \rightarrow \infty } \int \limits _{\{|\mathbf{v}| \le r\}} v_{h_{*}}^{2} \mu _{0}^{(s)}(\text {d}\mathbf{v}) \cdot \mathop {\overline{\lim }}\limits _{l \rightarrow \infty } \tilde{s}_{h_{*}, l}(\tilde{\omega }_{*}) \end{aligned}$$

which shows that the \(h_{*}\)-th marginal of \(\mu _{0}^{(s)}\)—and hence also the \(h_{*}\)-th marginal of \(\mu _0\)—has finite second moment. To conclude, observe that \(h_{*}\) can be determined independently of \(\mu _0\) and that weak convergence of \(\mu (\cdot , t)\) entails weak convergence of \(\mu (\cdot , t) \circ f_{Q}^{-1}\), \(f_{Q}\) being the map \(\mathbf{v}\mapsto Q\mathbf{v}\) for an orthogonal matrix \(Q\). Hence, since \(\mu (\cdot , t) \circ f_{Q}^{-1}\) turns out to be the solution of (1) with initial datum \(\mu _0 \circ f_{Q}^{-1}\) (cf. [31]), the above argument shows that \(\int _{{\mathbb {R}}^3}v_{h_{*}}^{2}\, \mu _0 \circ f_{Q}^{-1}(\text {d}\mathbf{v})\) is finite, where \(h_{*}\) is invariant w.r.t. \(Q\) and \(\mu _0\). Finally, choosing \(f_Q\) first equal to \((v_1, v_2, v_3) \mapsto (v_2, v_3, v_1)\) and then equal to \((v_1, v_2, v_3) \mapsto (v_3, v_1, v_2)\) shows that all three marginals of \(\mu _0\) have finite second moment, which completes the proof.
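The symmetrization \(q \mapsto q^{(s)}\) used in the final argument can also be checked numerically: if \(X\) and \(X'\) are i.i.d. with p.d. \(q\), then \(X - X'\) has p.d. \(q^{(s)}\), whose characteristic function is \(|\hat{q}|^2\). A minimal sketch with an exponential stand-in for \(q\) (an illustrative assumption), comparing the empirical characteristic function of the symmetrized sample with \(|\hat{q}|^2\) and with the known closed form:

```python
import random, cmath

random.seed(4)
n = 50000
xs = [random.expovariate(1.0) for _ in range(n)]   # i.i.d. sample from q = Exp(1)
ys = [random.expovariate(1.0) for _ in range(n)]   # an independent copy
sym = [x - y for x, y in zip(xs, ys)]              # a sample from the symmetrization q^(s)

def emp_cf(data, xi):
    """Empirical characteristic function at the point xi."""
    return sum(cmath.exp(1j * xi * x) for x in data) / len(data)

xi = 1.3
cf_sym = emp_cf(sym, xi)             # estimate of the c.f. of q^(s)
cf_sq = abs(emp_cf(xs, xi)) ** 2     # estimate of |q-hat(xi)|^2
exact = 1.0 / (1.0 + xi * xi)        # closed form for Exp(1): |1/(1 - i*xi)|^2
print(cf_sym, cf_sq, exact)
```

Up to Monte Carlo error, the three quantities agree, and the characteristic function of the symmetrized law is real, as it must be for a symmetric p.d.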