Journal of Statistical Physics, Volume 155, Issue 2, pp 323–391

A Brownian Particle in a Microscopic Periodic Potential

  • Jeremy Thane Clark
  • Loïc Dubois

Abstract

We study a model for a massive test particle in a microscopic periodic potential and interacting with a reservoir of light particles. In the regime considered, the fluctuations in the test particle’s momentum resulting from collisions typically outweigh the shifts in momentum generated by the periodic force, so the force is effectively a perturbative contribution. The mathematical starting point is an idealized reduced dynamics for the test particle given by a linear Boltzmann equation. In the limit that the mass ratio of a single reservoir particle to the test particle tends to zero, we show that there is convergence to the Ornstein–Uhlenbeck process under the standard normalizations for the test particle variables. Our analysis is primarily directed towards bounding the perturbative effect of the periodic potential on the particle’s momentum.

Keywords

Brownian limit · Linear Boltzmann equation · Ornstein–Uhlenbeck process · Nummelin splitting

1 Introduction

The Ornstein–Uhlenbeck process offers a homogenized picture for the motion of a massive particle interacting with a gas of lightweight particles at fixed temperature [33]. In this description, the spatial degrees of freedom are driven ballistically by momentum variables which are themselves governed by a diffusion equation that includes a drift term corresponding to the drag felt by the massive particle as it accumulates speed and has more frequent collisions with the gas. Under diffusive rescaling, the spatial variables converge in law to a Brownian motion. This result follows by an elementary analysis of the closed formulas available for the Ornstein–Uhlenbeck process [26]. The Brownian motion description for the test particle transport is effectively “more macroscopic” than the Ornstein–Uhlenbeck model since the fluctuations in the particle’s momentum are integrated into infinitesimal spatial “jumps” for the Brownian particle.

In the other direction, we may consider derivations of the Ornstein–Uhlenbeck process from models that are “more microscopic”. These relatively microscopic descriptions may merely be more complicated stochastic models for the test particle such as a linear Boltzmann equation, or more fundamentally, a reduced dynamics for the test particle beginning from a full microscopic model that includes the evolution of the degrees of freedom for the gas. The stochastic model in the former case should be regarded as an intermediary picture between the Ornstein–Uhlenbeck and the Hamiltonian dynamics arising in some limit; see [30] for a discussion of the low density limit. In the Boltzmann models, the test particle undergoes a Markovian dynamics, whereas for the Hamiltonian model including the gas, the randomness is only in the initial configuration, and the resulting dynamics for the test particle given by integrating out the gas is non-Markovian. The contrast between the Ornstein–Uhlenbeck and the Boltzmann-type dynamics is that the momentum in the Boltzmann case makes discrete jumps, individually small in the Brownian limit and corresponding to collisions with gas particles, rather than evolving along continuous trajectories according to a Langevin equation as in the Ornstein–Uhlenbeck case. We refer to the book [26] for a discussion of these various levels of description for a Brownian particle.

Rigorous mathematical derivations of the Ornstein–Uhlenbeck process were achieved in [3, 15] from stochastic models giving an effective description of the test particle as it receives collisions from particles in a background gas. For models that begin with a full mechanical Hamiltonian model including the test particle and the gas, derivations of the Ornstein–Uhlenbeck process from the reduced dynamics of the test particle were obtained in [10, 16, 31].

In this article we consider the Brownian regime for a stochastic model in which a one-dimensional test particle makes jumps in momentum, interpreted as collisions with a background gas, and is acted upon by a force from an external, spatially periodic potential field. With the presence of the field, the momentum process is no longer Markovian since it drifts at a rate depending on the particle’s position. The momentum of the particle has two contributions: the total displacement in momentum generated by the field, which is given by a time integral of the force, and the sum of the momentum jumps from collisions. As a result of the specific scaling regime considered, which includes the period length of the potential, the force field typically makes a smaller-scale contribution to the test particle’s momentum than the fluctuations in momentum due to the jumps identified with “collisions”. The vanishing of the force contribution is an averaged effect driven by the frequent rate at which the test particle typically passes through the period cells of the potential field. To first order, the Brownian limit of the model thus yields the same Ornstein–Uhlenbeck process as if the force were set to zero. Our analysis is focused on obtaining a sharp upper bound for the influence of the external potential on the momentum of the particle, and our techniques improve those applied to a related model in [9]. Ultimately, the main contributions to the total drift in momentum due to the forcing are made during “rare” time periods at which the test particle’s momentum returns to “small” values. The results of this article are extended in [7] to prove that the integral of the force, or net displacement in momentum due to the potential, converges in law to a fractional diffusion whose rate depends on the amount of time that the limiting Ornstein–Uhlenbeck process spends at zero momentum, i.e., the local time at zero.

Our model is a linear Boltzmann dynamics for a one-dimensional particle making elastic collisions with the gas and including a spatially periodic potential. The jump rate kernel is the one-dimensional case of the formula appearing in [30, Chap. 8.6], which corresponds to a hard-rod interaction between the test particle and a single reservoir particle. However, since the model is one-dimensional, it cannot be derived from a mechanical microscopic dynamics in the Boltzmann-Grad limit. We thus regard our model as phenomenological, and we argue that the resulting behavior that we find is qualitatively the same as what should be expected in an analogous three-dimensional model for a Brownian particle in a one-dimensional periodic potential.

We think of our model as corresponding to an experimental situation for a large atom or molecule in a periodic standing-wave light field and interacting with a dilute background gas. A periodic optical force on an atom can be produced experimentally by counter-propagating lasers; see, for instance, [22] or the reviews [1, 25]. A classical treatment of the atom is reasonable in the regime where the potential is effectively weak because the test particle is typically not constrained by the potential and the coherent quantum effects for the test particle will be suppressed by interactions with the gas.

1.1 Model and Results

We will consider a one-dimensional particle of mass \(M\) interacting with a gas of particles with mass \(m\) for \(\frac{m}{M}=\lambda \ll 1\) in the presence of a force \(-\frac{dV}{dx}(\frac{x}{\lambda })\) for some smooth \(V:{\mathbb R}\rightarrow {\mathbb R}^{+}\) with period \(a>0\). We take the phase space density \({\Psi }_{t,\lambda }(x,p)\) at time \(t\in {\mathbb R}^{+}\) to obey a linear Boltzmann equation
$$\begin{aligned} \frac{d}{dt}{\Psi }_{t,\lambda }(x,\,p)&= -\frac{\lambda }{m}p\frac{\partial }{\partial x}{\Psi }_{t,\lambda }(x,p)+\frac{dV}{dx}\left( \frac{x}{\lambda }\right) \frac{\partial }{\partial p}{\Psi }_{t,\lambda }(x,p) \nonumber \\&+\int \limits _{{\mathbb R}}dp^{\prime }\big ({\mathcal {J}}_{\lambda }(p^{\prime },p){\Psi }_{t,\lambda }(x,p^{\prime })-{\mathcal {J}}_{\lambda }(p,p^{\prime }){\Psi }_{t,\lambda }(x,p) \big ), \end{aligned}$$
(1.1)
where \({\mathcal {J}}_{\lambda }(p^{\prime },p)\) is a kernel describing the rate of kicks in momentum \(p^{\prime }\rightarrow p\) for the massive particle due to collisions with reservoir particles. Since we are considering an ideal gas, the rates \({\mathcal {J}}_{\lambda }(p^{\prime },p)\) will be determined by the interaction potential between the test particle and a reservoir particle, the temperature \(\beta ^{-1}\), the ratio \(\lambda =\frac{m}{M}\), and the spatial density \(\eta \). We will take the rates \({\mathcal {J}}_{\lambda }(p^{\prime },p)\) to correspond to a hard-rod interaction (or alternatively “hard-point” since the length of the objects does not appear for the one-dimensional linear Boltzmann equation), which has the form
$$\begin{aligned} {\mathcal {J}}_{\lambda }(p^{\prime },p):= \frac{\eta (1+\lambda )}{2m}\big |p^{\prime }-p\big |\frac{e^{-\frac{\beta }{2m}\big (\frac{1-\lambda }{2}p^{\prime } -\frac{1+\lambda }{2}p \big )^{2} } }{(2\pi \frac{m}{\beta })^{\frac{1}{2}}}. \end{aligned}$$
(1.2)
The jump rates \({\mathcal {J}}_{\lambda }\) are the explicit form of those in Eq. (8.118) from [30] for the dimension one case, written in momentum variables rather than velocities.
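As an illustrative numerical aside (not part of the original analysis), the kernel (1.2) can be transcribed directly, and one can check that it satisfies detailed balance with respect to the Maxwell–Boltzmann weight \(e^{-\beta \lambda p^{2}/2m}\), consistent with the form of the equilibrium state (1.8) below. The sketch uses illustrative default values \(\eta =m=\beta =1\):

```python
import numpy as np

def jump_rate(p_prime, p, lam, eta=1.0, m=1.0, beta=1.0):
    """Hard-rod collision rate J_lambda(p', p) of Eq. (1.2) for the kick p' -> p."""
    arg = 0.5 * (1.0 - lam) * p_prime - 0.5 * (1.0 + lam) * p
    gauss = np.exp(-beta * arg ** 2 / (2.0 * m)) / np.sqrt(2.0 * np.pi * m / beta)
    return eta * (1.0 + lam) / (2.0 * m) * abs(p_prime - p) * gauss

def mb_weight(p, lam, m=1.0, beta=1.0):
    """Maxwell-Boltzmann weight exp(-beta*lam*p^2/(2m)) for the test particle."""
    return np.exp(-beta * lam * p ** 2 / (2.0 * m))
```

The identity \(\mathcal {J}_{\lambda }(p',p)\,e^{-\beta \lambda p'^{2}/2m}=\mathcal {J}_{\lambda }(p,p')\,e^{-\beta \lambda p^{2}/2m}\) holds exactly, as one verifies by expanding the squares in the exponents.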
We will denote the stochastic process whose probability density evolves according to (1.1) by \((X_{t},P_{t})\). Let us also define the process
$$\begin{aligned} D_{t}:= \int \limits _{0}^{t}dr\frac{dV}{dx}\left( \frac{X_{r}}{\lambda }\right) . \end{aligned}$$
The process \(-D_{t}\) is the cumulative drift in the particle’s momentum due to the periodic force field, and hence the momentum at time \(t\in {\mathbb R}^{+}\) has the form
$$\begin{aligned} P_{t}=P_{0}-D_{t}+J_{t}, \end{aligned}$$
(1.3)
where \(J_{t}\) is the sum of all the momentum jumps due to collisions with the gas over the time interval \([0,t]\).
Let \(({\mathfrak {q}}_{t},{\mathfrak {p}}_{t})\in {\mathbb R}^{2}\) be a process satisfying the Langevin equations
$$\begin{aligned} d\mathfrak {q}_{t}&= \frac{1}{m}\mathfrak {p}_{t}dt, \nonumber \\ d\mathfrak {p}_{t}&= -\gamma \mathfrak {p}_{t}dt+\left( \frac{2m\gamma }{\beta }\right) ^{\frac{1}{2}}d\mathbf {B}_{t}, \end{aligned}$$
(1.4)
where \(\gamma = 8 \eta \big (\frac{2}{\pi m \beta }\big )^{\frac{1}{2}} \) and \(\mathbf {B}_{t}\) is a standard Brownian motion.
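For intuition, the momentum equation in (1.4) is straightforward to simulate; the sketch below uses a plain Euler–Maruyama discretization with purely illustrative parameter values. The stationary variance of \(\mathfrak {p}_{t}\) should come out near \(\frac{m}{\beta }\):

```python
import numpy as np

def simulate_ou(p0, gamma, m, beta, dt, n_steps, n_paths, rng):
    """Euler-Maruyama integration of dp = -gamma*p*dt + sqrt(2*m*gamma/beta)*dB
    for n_paths independent realizations; returns the final momenta."""
    p = np.full(n_paths, p0, dtype=float)
    sigma = np.sqrt(2.0 * m * gamma / beta)
    for _ in range(n_steps):
        p += -gamma * p * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
    return p
```

Running to a time well past the relaxation scale \(\gamma ^{-1}\), the empirical mean is near zero and the empirical variance is near the equilibrium value \(\frac{m}{\beta }\).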

The technical assumptions for our main results are the following:

List 1.1

  1. The potential \(V(x)\) is non-negative, has period \(a>0\), and is continuously differentiable.

  2. The probability measure \(\mu \) on \({\mathbb R}^{2}\) for the initial location in phase space \((X_{0},P_{0})\) has finite moments.

The following theorems are the main results of this article. Theorem 1.3 states that as \(\lambda \searrow 0\) the momentum process \(P_{\frac{t}{\lambda }}\) rescaled by a factor \(\lambda ^{\frac{1}{2}}\) converges to an Ornstein–Uhlenbeck process. Theorem 1.2 bounds the cumulative drift from the periodic force, although only the weaker limit result (1.6) is required for the proof of Theorem 1.3. The estimates developed to prove (1.5) are extended in [7] to prove that the process \((\lambda ^{\frac{1}{4}}D_{\frac{t}{\lambda } },t\in [0,T])\) converges in law to a time-fractional diffusion as \(\lambda \searrow 0\). In particular, the exponent \(\iota =\frac{1}{4}\) is the smallest possible such that the expectation of \(|\lambda ^{\iota }D_{\frac{t}{\lambda }}|\) is uniformly bounded for small \(\lambda >0\); indeed, \(\limsup _{\lambda \rightarrow 0} \mathbb {E}^{(\lambda )}\big [\sup _{0\le t\le T}\big | \lambda ^{\frac{1}{4}}D_{\frac{t}{\lambda }}\big | \big ]>0\).

Theorem 1.2

There exists a \(C>0\) such that for all \(\lambda <1\),
$$\begin{aligned} \mathbb {E}^{(\lambda )}\Big [\sup _{0\le t\le T}\big | \lambda ^{\frac{1}{4}}D_{\frac{t}{\lambda }}\big | \Big ]\le C. \end{aligned}$$
(1.5)
In particular, there is convergence in probability as \(\lambda \searrow 0\) given by
$$\begin{aligned} \sup _{0\le t\le T}\big | \lambda ^{\frac{1}{2}}D_{\frac{t}{\lambda } }\big | \Longrightarrow 0. \end{aligned}$$
(1.6)

Theorem 1.3

In the limit \(\lambda \searrow 0\), there is convergence in law of the process \(\lambda ^{\frac{1}{2}}P_{\frac{t}{\lambda }}\) to the Ornstein–Uhlenbeck process \(\mathfrak {p}_{t}\) over the interval \(t\in [0,\,T]\). The convergence is with respect to the uniform metric on paths.

Since the position process \(X_{t}=X_{0}+\frac{\lambda }{m}\int _{0}^{t}drP_{r} \) is driven by the momentum process, it follows from Theorem 1.3 that \(\lambda ^{\frac{1}{2}}X_{\frac{t}{\lambda }}\) converges in law as \(\lambda \searrow 0\) to the process \(\mathfrak {q}_{t}\) defined in (1.4).

1.2 Further Discussion

This article concerns the dynamics of a Brownian particle that feels a force from a one-dimensional periodic potential. We focus on a regime in which the potential is “microscopic”. By “microscopic”, we mean that the potential has an amplitude \(\sup _{x,x'}|V(x)-V(x')|\) that is much smaller than the typical kinetic energy \(\frac{M}{\beta }=\lambda ^{-1}\frac{m}{\beta }\) of the test particle at equilibrium with the heat bath, and that the period \(a\) is small enough so that the typical rate at which the particle passes through the period cells \((a^{2}M\beta )^{-\frac{1}{2}}\) is much faster than the rate of energy relaxation \(\approx \lambda \gamma \) for the test particle.

For our mathematical analysis, the force \(F(x)=-\frac{dV}{dx}\big (\frac{x}{\lambda }\big )\) is taken to have a period \(a\lambda \) which scales proportionally to the mass ratio \(\lambda =\frac{m}{M}\). This is not essential to these results, and only the broad features described above are critical. The same can be said about the amplitude of the potential.

Theorem 1.3 states that to first approximation under Brownian rescaling, the momentum is an Ornstein–Uhlenbeck process with no dependence on the potential. This classical treatment of the particle allows for comparisons with quantum models. A similar model for a one-dimensional quantum particle was studied in [6] for which the potential is a periodic \(\delta \)-potential. In that case, the singular potential makes a first-order change to the dynamics characterized by spatial subdiffusion caused by quantum reflections even though the periodic potential is “microscopic” in a similar sense as described above. See [4, 13, 19] for examples of experimental investigations of quantum reflections of atoms from potentials generated through laser light. Analogous quantum models with smoother potentials will behave more like their classical counterparts.

A three-dimensional linear Boltzmann dynamics for a particle in a gas of hard spheres and under the influence of a one-dimensional periodic potential will have the same limit result, up to constants, as Theorem 1.3 for the degree of freedom in the direction of the potential. Although the momentum for a single spatial degree of freedom is not Markovian in the linear Boltzmann description, it becomes “more Markovian” in the Brownian limit, as is seen in the limiting three-dimensional Ornstein–Uhlenbeck process. The rates (1.2) can then be replaced by the effective rates that emerge for a single degree of freedom in the three-dimensional case, which have the same qualitative features for our purposes.

1.2.1 Features of the Model

By rescaling the spatial coordinate for the particle by a factor of \(\lambda ^{-1}\), the master equation (1.1) becomes
$$\begin{aligned} \frac{d}{dt}{\Psi }_{t,\lambda }(x,\,p)=\mathcal {L}_{\lambda }^{*}({\Psi }_{t,\lambda })(x,p)&= -\frac{p}{m}\frac{\partial }{\partial x}{\Psi }_{t,\lambda }(x,p)+\frac{dV}{dx}\big (x\big )\frac{\partial }{\partial p}{\Psi }_{t,\lambda }(x,p) \nonumber \\&+\!\int \limits _{{\mathbb R}}dp^{\prime }\big ({\mathcal {J}}_{\lambda }(p^{\prime },p){\Psi }_{t,\lambda }(x,p^{\prime })-{\mathcal {J}}_{\lambda }(p,p^{\prime }){\Psi }_{t,\lambda }(x,p) \big ),\nonumber \\ \end{aligned}$$
(1.7)
where the generator \(\mathcal {L}_{\lambda }^{*}\) is defined by the second equality. Notice that \(\lambda >0\) does not appear in the deterministic terms on the right side of (1.7). We thus effectively have a particle with Hamiltonian \(H(x,p)=\frac{1}{2m}p^{2}+V(x)\) and a \(\lambda \)-dependent noise. Note that under the new spatial metric, the velocity of the test particle is \(\frac{p}{m}\) rather than \(\frac{p}{M}=\lambda \frac{p}{m}\). For the purpose of Theorem 1.3 and this article generally, it is sufficient to consider the spatial degree of freedom to be a unit torus \(\mathbb {T}=[0,1)\) so that the total state space is \(\Sigma :=\mathbb {T}\times {\mathbb R}\). The equilibrium state for the dynamics on \(\Sigma \) is given by the Maxwell-Boltzmann distribution
$$\begin{aligned} \Psi _{\infty ,\lambda }(x,p):= \frac{ e^{-\beta \lambda H(x,p)} }{N(\lambda )} \end{aligned}$$
(1.8)
for some normalization \(N(\lambda )\).
After the spatial stretching, the drift process in momentum \(D_{t}\) has the form
$$\begin{aligned} D_{t}= \int \limits _{0}^{t}dr g(X_{r},P_{r}) \end{aligned}$$
(1.9)
for \(g:\Sigma \rightarrow {\mathbb R}\) given by \(g(x,p)= \frac{dV}{dx}(x)\). It is thus an integral functional of an exponentially ergodic Markov process on \(\Sigma \); see Appendix 8 for a discussion of the exponential ergodicity of the dynamics. Nonetheless, a central limit theorem for \(\lambda ^{\frac{1}{4}}D_{\frac{t}{\lambda }}\) does not follow from the limit theory for integral functionals of ergodic Markov processes [20] because the relaxation to the state (1.8) only occurs on the time scale \(\lambda ^{-1}\gg 1\). Indeed, there must be many collisions with reservoir particles before there is memory loss for the heavy particle. As will be explained in Sects. 1.2.2 and 1.2.3, the analysis of \(\lambda ^{\frac{1}{4}}D_{\frac{t}{\lambda }}\) is more related to the limit theory for martingales whose bracket processes are additive functionals of a null-recurrent Markov process [17]. This is due to the fact that the fluctuations in \(D_{t}\) accumulate mainly during time intervals in which \(|P_{t}|\) is much smaller than the typical momentum size \(\big (\frac{m}{\lambda \beta }\big )^{\frac{1}{2}}\gg \big (\frac{m}{\beta }\big )^{\frac{1}{2}}\) for the equilibrium state \(\Psi _{\infty ,\lambda }\).

1.2.2 Rough Picture of the Behavior in the Brownian Regime \(\lambda \ll 1\)

Since the equilibrium state of the dynamics is given by the Maxwell-Boltzmann distribution (1.8), the typical energy for the particle when \(\lambda \ll 1\) will be on the order \(\lambda ^{-1}\). Moreover, the potential \(V(x)\) is bounded, so most of the energy will be in the kinetic component \(\frac{1}{2m}p^{2}\) corresponding to momenta \(p\) with absolute value on the order of \( \big (\frac{m}{\lambda \beta }\big )^{\frac{1}{2}}\gg \big (\frac{m}{\beta }\big )^{\frac{1}{2}} \). The jump rates \(\mathcal {J}_{\lambda }(p,p')\) for \(|p|= O (\lambda ^{-\frac{1}{2}})\) are approximately
$$\begin{aligned} \mathcal {J}_{\lambda }(p,p')= j(p-p')+\lambda \frac{\beta }{4m}\big (p^{2}-(p')^{2}\big ) j(p-p')+ O (\lambda ), \end{aligned}$$
(1.10)
where the idealized rates \(j(p)\) have the form
$$\begin{aligned} j(p) := \frac{\eta }{2m}|p|\frac{e^{-\frac{\beta }{8m}p^{2}}}{(2\pi \frac{m}{\beta })^{\frac{1}{2}}}. \end{aligned}$$
(1.11)
The second term on the right side of (1.10) is \( O (\lambda ^{\frac{1}{2}})\) by our assumption \(|p|= O (\lambda ^{-\frac{1}{2}})\). The physical meaning behind the approximation (1.10) is that the gas reservoir particles are typically moving at speeds on the order \((m\beta )^{-\frac{1}{2}}\), which is much greater than the typical speed of the test particle \((M\beta )^{-\frac{1}{2}}= \lambda ^{\frac{1}{2}}(m\beta )^{-\frac{1}{2}}\ll (m\beta )^{-\frac{1}{2}}\). The statistics for the momentum transfers from the gas thus do not depend strongly on the momentum of the test particle, and have approximately a convolution form as in the zeroth-order term in (1.10). The zeroth-order approximation in (1.10) suggests that the collision component \(J_{t}\) of the momentum (1.3) is typically behaving as an unbiased random walk with increments having density \(j(v)\). Based on this reasoning, \(\lambda ^{\frac{1}{2}}J_{\frac{t}{\lambda }}\) should converge to a Brownian motion with diffusion constant \(\frac{2m\gamma }{\beta }\) as \(\lambda \searrow 0\) by the central limit theorem. However, the first-order term in (1.10) generates a drift for \(\lambda ^{\frac{1}{2}}J_{\frac{t}{\lambda }}\) that is retained as \(\lambda \searrow 0\) and converges to a limit by a law of large numbers. This can be seen in the friction term appearing in the Langevin equation (1.4).
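The expansion (1.10) can be checked numerically at fixed momenta (an illustrative aside, with units \(\eta =m=\beta =1\)). After dividing out the prefactor \(1+\lambda \), which contributes only at order \(\lambda \), the finite difference \(\lambda ^{-1}\big (\mathcal {J}_{\lambda }(p,p')/(1+\lambda )-j(p-p')\big )\) should approach \(\frac{\beta }{4m}(p^{2}-p'^{2})j(p-p')\), where the \(\lambda =0\) rate carries the Gaussian weight \(e^{-\beta (p-p')^{2}/8m}\) obtained by setting \(\lambda =0\) in (1.2):

```python
import numpy as np

ETA = M = BETA = 1.0   # illustrative units

def jump_rate(p_first, p_second, lam):
    """J_lambda(p, p') of Eq. (1.2) with eta = m = beta = 1."""
    arg = 0.5 * (1 - lam) * p_first - 0.5 * (1 + lam) * p_second
    return (ETA * (1 + lam) / (2 * M)) * abs(p_first - p_second) * \
        np.exp(-BETA * arg ** 2 / (2 * M)) / np.sqrt(2 * np.pi * M / BETA)

def j0(p):
    """Idealized lambda = 0 rate, Gaussian weight exp(-beta*p^2/(8m))."""
    return (ETA / (2 * M)) * abs(p) * \
        np.exp(-BETA * p ** 2 / (8 * M)) / np.sqrt(2 * np.pi * M / BETA)

# first-order check of (1.10) at fixed momenta
p, pp, lam = 1.5, 0.5, 1e-5
fd = (jump_rate(p, pp, lam) / (1 + lam) - j0(p - pp)) / lam
pred = (BETA / (4 * M)) * (p ** 2 - pp ** 2) * j0(p - pp)
```

For small \(\lambda \) the finite difference `fd` and the predicted coefficient `pred` agree to high relative accuracy.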
According to the heuristics above, \(J_{\frac{t}{\lambda }}\) should typically be found on the scale \(\lambda ^{-\frac{1}{2}}\) when \(\lambda \ll 1\) and \(t\in [0,T]\), and we will now argue that \(D_{\frac{t}{\lambda }}\) should typically be \( O (\lambda ^{-\frac{1}{4}}) \). We can parse the integral for \(D_{t}\) according to the collision times \(t_{n}\) as
$$\begin{aligned} D_{t}=\int \limits _{t_{{\mathcal {N}}_{t}}}^{t} dr\frac{dV}{dx}\big (X_{r}\big ) + \sum _{n=1}^{{\mathcal {N}}_{t}}\int \limits _{t_{n-1}}^{t_{n}}dr\frac{dV}{dx}\big (X_{r}\big ), \end{aligned}$$
where \(t_{0}=0\) and \({\mathcal {N}}_{t}\) is the number of collisions up to time \(t\). Between collisions from the gas, the particle evolves deterministically according to the Hamiltonian \(H(x,p)=\frac{1}{2}p^{2}+V(x)\) (we take \(m=1\) in this heuristic discussion), and Newton’s equations give
$$\begin{aligned} P_{t_{n}^{-}}-P_{t_{n-1}}=-\int \limits _{ t_{n-1}}^{t_{n} }dr\frac{dV}{dx}\big (X_{r}\big ). \end{aligned}$$
(1.12)
If \(H(X_{t_{n-1}},P_{t_{n-1}})>2\sup _{x}V(x)\), then the momentum will not change signs over the interval \([t_{n-1},t_{n})\), and
$$\begin{aligned} \big |P_{t_{n}^{-}}-P_{t_{n-1}}\big |= \Big ||P_{t_{n-1}}|-\sqrt{ P_{t_{n-1}}^{2}+2V(X_{t_{n-1}})-2V(X_{t_{n}})}\Big |\le \frac{2\sup _{x}V(x)}{|P_{t_{n-1}}|}, \end{aligned}$$
which follows from the conservation of energy and the quadratic formula. Thus, when \(|P_{t_{n-1}}|\) is on the typical order \(\propto \lambda ^{-\frac{1}{2}}\), the increment (1.12) of the momentum drift is \( O (\lambda ^{\frac{1}{2}})\ll 1\).
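The energy-conservation bound above is easy to test numerically. The sketch below uses a hypothetical smooth period-one potential \(V(x)=\frac{v_{0}}{2}(1+\cos 2\pi x)\) with \(\sup _{x}V(x)=v_{0}\) (not a potential taken from the article) and \(m=1\):

```python
import numpy as np

V0 = 0.5   # sup_x V(x) for the hypothetical potential below

def V(x):
    """Hypothetical smooth period-1 potential with 0 <= V(x) <= V0."""
    return 0.5 * V0 * (1.0 + np.cos(2.0 * np.pi * x))

def kick(p0, x0, x1):
    """|P_{t_n^-} - P_{t_{n-1}}| from energy conservation (m = 1), assuming
    the energy is large enough that the momentum does not change sign."""
    return abs(abs(p0) - np.sqrt(p0 ** 2 + 2.0 * (V(x0) - V(x1))))
```

Sampling initial momenta with \(|P_{t_{n-1}}|\) well above \(2\sqrt{\sup _{x}V(x)}\) and arbitrary positions, the deterministic momentum increment never exceeds \(2\sup _{x}V(x)/|P_{t_{n-1}}|\).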
In fact there is another critical feature of an ergodic nature that makes the contributions (1.12) to \(D_{t}\) even smaller when \(|P_{t_{n-1}}|\gg 1\). There is an ergodicity on the spatial torus relating to the fact that when the momentum is high, the particle revolves quickly around the torus, and its location at the time of the next collision is close to uniform over \(\mathbb {T}\). This idea can be used to show that the mean for \(\int _{ t_{n-1}}^{t_{n}}dr\frac{dV}{dx}(X_{r})\) is \( O (|P_{t_{n-1}}|^{-2})\) given the information known up to time \(t_{n-2}\). In other words, besides the increments (1.12) being small when \(|P_{t_{n-1}}|\gg 1\), \(D_{t}\) also behaves like a martingale since the increments are close to being uncorrelated with mean zero. Thus, there is a central limit theorem-like cancellation among the terms. This suggests that the contribution to \(D_{\frac{t}{\lambda }}\) from time intervals where \(|P_{r}|\propto \lambda ^{-\frac{1}{2}}\) is \( O (1)\) since, for fixed small \(\epsilon >0\),
$$\begin{aligned}&\mathbb {E}\Big [\Big (\sum _{n=1}^{{\mathcal {N}}_{\frac{t}{\lambda } }}\chi \big (|P_{t_{n-1}}| \ge \epsilon \lambda ^{-\frac{1}{2}}\big ) \int \limits _{t_{n-1}}^{t_{n}}dr\frac{dV}{dx}(X_{r}) \Big )^{2} \Big ]\\&= O \Big (\mathbb {E}\Big [\sum _{n=1}^{{\mathcal {N}}_{\frac{t}{\lambda } }}\chi \big (|P_{t_{n-1}}| \ge \epsilon \lambda ^{-\frac{1}{2}}\big ) \Big (\int \limits _{t_{n-1}}^{t_{n}}dr\frac{dV}{dx}(X_{r}) \Big )^{2} \Big ]\Big )\\&\le \epsilon ^{-2}\mathbb {E}\big [{\mathcal {N}}_{\frac{t}{\lambda }}\big ] O (\lambda )= O (1), \end{aligned}$$
because the collisions occur with a frequency on the order of one per unit time \(\mathbb {E}\big [{\mathcal {N}}_{\frac{t}{\lambda }}\big ]= O (\frac{t}{\lambda })\). These contributions disappear for the normalized expression \(\lambda ^{\frac{1}{4}}D_{\frac{t}{\lambda }}\). For technical reasons, our analysis of these facts is actually performed with a different set of artificially introduced stopping times rather than the collision times; see Sect. 4.2.

The above arguments suggest that \(D_{\frac{t}{\lambda }}\) spends the greater portion of the time interval \(t\in [0,T]\) behaving as a constant; said differently, its larger fluctuations are typically concentrated on a small fraction of the interval \([0,T]\). Let us consider the order of the contributions to \(D_{t}\) that are likely to occur during the periods of time when \(P_{r}\) returns to the region around the origin, that is, \(|P_{r}|= O (1)\). If \(P_{r}\) behaves roughly as a random walk for \(r\in [0,\frac{T}{\lambda }]\) with some very weak friction, then we expect that \(P_{r}\) spends on the order of \(\lambda ^{-\frac{1}{2}}\) time in the vicinity of the origin. If there are central limit theorem-like cancellations between the increments \(\int _{ t_{n-1} }^{t_{n}}dr\frac{dV}{dx}(X_{r})\) in those time periods, then \(D_{\frac{t}{\lambda }}\) should be expected to be on the scale \(\lambda ^{-\frac{1}{4}}\).

1.2.3 Techniques and Strategy of the Proof

The main difficulty in proving that \(\lambda ^{\frac{1}{2}}P_{\frac{t}{\lambda }}\) converges in law to the Ornstein–Uhlenbeck process \(\mathfrak {p}_{t}\) is showing that the component \( D_{\frac{t}{\lambda }}\) of the momentum is typically \( o (\lambda ^{-\frac{1}{2}})\) for \(t\in [0,T]\). As indicated by the heuristics of Sect. 1.2.2, we should expect, in fact, that \(\sup _{0\le t\le T}| D_{\frac{t}{\lambda }}|\) is typically \( O (\lambda ^{-\frac{1}{4}})\).

One of the main ingredients in our analysis is a splitting technique that consists of introducing an artificial “atom” into the state space by embedding the original process as a component of a process with an enlarged state space. In principle, the benefit of having an extended state space with an atom is that the trajectories for the process \(S_{t}=(X_{t},P_{t})\) can be decomposed into a series of i.i.d. parts, i.e., life cycles, corresponding to time intervals \([R_{n},R_{n+1})\), where \(R_{n}\) are the return times to the atom. This would allow the integral functional \(D_{t}\) to be written as a pair of boundary terms plus a sum of i.i.d. random variables with a random number of terms. For Markov chains, such a technique for embedding an atom was developed independently in [28] and [2] and is referred to as Nummelin splitting, or merely splitting. When it comes to splitting a Markov process, there are different schemes available. In [17], a sequence of split processes is constructed containing marginal processes that are arbitrarily close to the original process. The construction in [21] involves a larger state space \(\Sigma \times [0,1]\times \Sigma \), although an exact copy of the original process is embedded as a marginal. The idea that splitting constructions could be used as a tool to prove certain limit theorems for Markov processes was suggested in an unpublished paper [32].

We use a truncated version of the split process introduced in [21]. The split process is not Markovian itself, but contains an embedded chain (the split resolvent chain) which is Markovian. In this construction, the life cycles are not completely independent: each life cycle is correlated with its successor. The details of the construction are explained in Sect. 2. The original process \(S_{t}\), which lives in \(\Sigma =\mathbb {T}\times {\mathbb R}\), is embedded as a component of \(\tilde{S}_{t}=(S_{t},Z_{t}) \in \tilde{\Sigma }=\Sigma \times \{0,1\}\). The process \(D_{t}\) can be written as four boundary terms plus a martingale
$$\begin{aligned} D_{t}&= \big (\text {Sum of boundary terms})+\tilde{M}_{t} \\ \tilde{M}_{t}&= \sum _{n=1}^{ \tilde{N}_{t} }\Big (\int \limits _{R_{n}}^{R_{n+1}}dr\frac{dV}{dx}(X_{r})-\big (\mathfrak {R}^{(\lambda )}\,\frac{dV}{dx}\big )(S_{R_{n}}) +\big (\mathfrak {R}^{(\lambda )}\,\frac{dV}{dx}\big )(S_{R_{n+1}}) \Big ) ,\nonumber \end{aligned}$$
(1.13)
where \(\tilde{N}_{t}\) is the number of returns to the atom \(\Sigma \times \{1\}\) to have occurred before time \(t\), and \(\mathfrak {R}^{(\lambda )}:L^{\infty }(\Sigma )\rightarrow L^{\infty }(\Sigma )\) is the reduced resolvent of the backwards generator \(\mathcal {L}_{\lambda }\). The boundary terms are
$$\begin{aligned} \int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r})-\int \limits _{t}^{R_{\tilde{N}_{t}+1} }dr\frac{dV}{dx}(X_{r})+\left( \mathfrak {R}^{(\lambda )}\,\frac{dV}{dx}\right) (S_{R_{1}}) - \left( \mathfrak {R}^{(\lambda )}\,\frac{dV}{dx}\right) (S_{R_{\tilde{N}_{t}+1}}). \end{aligned}$$
The interjection of the telescoping terms \( \big (\mathfrak {R}^{(\lambda )}\frac{dV}{dx}\big )(S_{R_{n}})\) removes the correlations between successive life cycles. The fact that the increments of \(\tilde{M}_{t}\) have mean zero with respect to the information known up to time \(R_{n}\) is a consequence of the splitting construction and the fact that the observable \(\frac{dV}{dx}\) has mean zero in the equilibrium state \(\Psi _{\infty ,\lambda }\). The process \(\tilde{M}_{t}\) is a martingale with respect to its own filtration, and this opens the possibility of applying Doob’s maximal inequality to bound the fluctuations of \(\tilde{M}_{t}\).
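To illustrate the role of Doob’s maximal inequality, recall that its \(L^{2}\) form states \(\mathbb {E}\big [\sup _{k\le n}M_{k}^{2}\big ]\le 4\,\mathbb {E}\big [M_{n}^{2}\big ]\) for a martingale \(M\). A Monte Carlo sketch with a simple \(\pm 1\) random walk (purely illustrative, not the martingale \(\tilde{M}_{t}\) of the article):

```python
import numpy as np

# Monte Carlo check of Doob's L^2 maximal inequality,
# E[ sup_{k<=n} M_k^2 ] <= 4 * E[ M_n^2 ], for a +/-1 random-walk martingale.
rng = np.random.default_rng(0)
n_steps, n_paths = 200, 20000
steps = rng.choice([-1.0, 1.0], size=(n_paths, n_steps))
M = np.cumsum(steps, axis=1)                  # martingale paths M_1, ..., M_n
sup_sq = np.max(M ** 2, axis=1).mean()        # estimate of E[ sup_k M_k^2 ]
final_sq = (M[:, -1] ** 2).mean()             # estimate of E[ M_n^2 ] (= n exactly)
```

The empirical \(\mathbb {E}[\sup _{k}M_{k}^{2}]\) sits between \(\mathbb {E}[M_{n}^{2}]\) and \(4\,\mathbb {E}[M_{n}^{2}]\), so the maximal fluctuation is controlled by the terminal second moment, which is the mechanism used to bound \(\tilde{M}_{t}\).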
The martingale \(\tilde{M}_{t}\) is a variant of the martingale \(\tilde{M}'\) below that is usually employed when studying limit theorems for integral functionals of Markov processes:
$$\begin{aligned} \tilde{M}_{t}'= \left( \mathfrak {R}^{(\lambda )}\,\frac{dV}{dx}\right) (S_{t})- \left( \mathfrak {R}^{(\lambda )}\,\frac{dV}{dx}\right) (S_{0}) +D_{t}. \end{aligned}$$
(1.14)
The martingale \(\tilde{M}'\) has predictable quadratic variation
$$\begin{aligned} \langle \tilde{M}'\rangle _{t}=\int \limits _{0}^{t}dr\int \limits _{{\mathbb R}}dp'\left( \left( \mathfrak {R}^{(\lambda )}\,\frac{dV}{dx}\right) (X_{r},p')- \left( \mathfrak {R}^{(\lambda )}\,\frac{dV}{dx}\right) (X_{r},P_{r}) \right) ^{2}\mathcal {J}_{\lambda }(P_{r},p'). \end{aligned}$$
Using the martingale (1.14) would require showing some decay that is uniform in \(\lambda <1\) for increments \(| \big (\mathfrak {R}^{(\lambda )}\,\frac{dV}{dx}\big )(x,p')- \big (\mathfrak {R}^{(\lambda )}\,\frac{dV}{dx}\big )(x,p) |\) when \(|p|,|p'| \) are large and \(|p-p'|= O (1)\). However, it is not clear to us how to obtain the necessary bounds on the resolvent, and the methods here are designed to exploit the time-averaging of the oscillatory process \(\frac{dV}{dx}(X_{r})\) as suggested by the heuristics in Sect. 1.2.2. Our technique is based on having bounds for a generalized resolvent \(U^{(\lambda )}_{h}:L^{\infty }(\Sigma )\rightarrow L^{\infty }(\Sigma )\) of the form
$$\begin{aligned} \big (U^{(\lambda )}_{h}g\big )(s):=\mathbb {E}^{(\lambda )}_{s}\Big [\int \limits _{0}^{\infty }dt e^{-\int \limits _{0}^{t}dr h(S_{r})}g(S_{t}) \Big ], \end{aligned}$$
where \(h\) is a non-negative function with compact support, and the function \(g \equiv g_{\lambda }\) essentially has the form
$$\begin{aligned} g_{\lambda }(s)=\Big |\mathbb {E}_{s}^{(\lambda )}\Big [\int \limits _{0}^{\infty }dt\,t\,e^{-t}\frac{dV}{dx}(X_{t}) \Big ]\Big |. \end{aligned}$$
The operator \(U^{(\lambda )}_{h}\) arises in the study of recurrence for Markov processes and has been referred to as the state-modulated resolvent [23]. Analysis of \(U^{(\lambda )}_{h}\) for our dynamics is contained in [8].

1.2.4 The Unit Conventions and Organization of the Article

Throughout the remainder of the article, we will remove units by setting \(\beta =a=m=1\), and picking \(\eta \) such that \( \gamma =\frac{1}{2}\); recall that \(\gamma \) is defined below (1.4). We assume List 1.1 in all theorems, lemmas, etc. unless otherwise stated.

Most of the analysis, Sections 2–5, is concerned with the proof of Theorem 1.2. The proof of Theorem 1.3 given Theorem 1.2 is relatively nontechnical. The contents of later sections are roughly characterized by the following:
  • Section 2 presents the splitting structure that allows us to decompose the dynamics into a series of life cycles as sketched in Sect. 1.2.3.

  • Section 3 is directed towards gaining control over the frequency and duration of life cycles in the limit \(\lambda \searrow 0\).

  • Section 4 demonstrates how to bound the fluctuations of the integral functional \(\int _{0}^{t}dr\frac{dV}{dx}(X_{r})\) over the time period of a single life cycle.

  • Sections 5 and 6 contain the proofs respectively for Theorems 1.2 and 1.3.

  • Various proofs are placed in Sect. 7 to avoid diverting the reader from the main points in earlier sections.

2 Nummelin Splitting

The split process that we define here is a truncated version of that in [21]. In the context of a larger probability space, the drift in momentum \(D_{t}=\int _{0}^{t}dr\frac{dV}{dx}(X_{r})\) may be viewed as a martingale plus a few small “boundary” terms. This allows us to apply martingale techniques. For those familiar with the terminology related to Nummelin splitting, we outline the extension of the process as follows: We introduce a resolvent chain embedded in the original process, we split the chain using Nummelin’s technique, and we extend the resolvent chain to a non-Markovian process which contains an embedded version of the original process.

We will begin with a generic discussion of the splitting structure by assuming that we have a function \(h:\Sigma \rightarrow [0,1]\) and a probability measure \(\nu \) on \(\Sigma \) satisfying the inequality (2.1). The specific \(h\) and \(\nu \) that we use in this article are defined below in Convention 2.2. Let \((e_m)\) be a sequence of mean one exponential random variables that are independent of each other and of the process \((X_t,\,P_t)\), and let \(\tau _n := \sum _{m=1}^n e_m\) with the convention \(\tau _0 = 0\). The \(\tau _{n}\) will be referred to as the partition times. Define \(\mathbf {N}_{t}\) to be the number of non-zero \(\tau _{n}\) less than \(t\), and the Markov chain \(\sigma _{n}:=(X_{\tau _{n}},P_{\tau _{n}})\in \Sigma \), which is referred to as the resolvent chain. The resolvent chain has the same invariant probability density as the original process. Let \(\mathcal {T}\) be the transition kernel for the chain, acting on functions from the left and measures from the right. Recall that for a Markov chain, an atom is a nonempty set \(\alpha \) such that the probability transitions starting from a point \(s\in \alpha \) are independent of \(s\). An atom is said to be recurrent if, when starting from a point in the atom, the probability of returning to the atom in the future is one. In general, a Harris recurrent Markov chain with invariant measure \(\mu \) does not necessarily have a recurrent atom \(\alpha \) with positive weight \(\mu (\alpha )>0\). The splitting technique that we outline presently, originally due to Nummelin, allows us to create a recurrent atom for a Harris recurrent Markov chain through a minorization condition (2.1). The idea is to extend the state space \(\Sigma \) to \(\tilde{\Sigma }:=\Sigma \times \{0,1\}\) in order to construct a chain \((\tilde{\sigma }_{n})\in \tilde{\Sigma }\) with a recurrent atom and having the statistics for \((\sigma _{n})\) embedded in the first component of \((\tilde{\sigma }_{n})\).
Let \(\nu \) be a probability measure on \(\Sigma \) and \(h:\Sigma \rightarrow [0,1) \) be such that
$$\begin{aligned} \mathcal {T}(s_{1},ds_{2})\ge h(s_{1})\nu (ds_{2}). \end{aligned}$$
(2.1)
We have the following transition rates from the state \((s_{1},z_{1})\in \tilde{\Sigma }\) to the infinitesimal region \((ds_{2}, z_{2})\):
$$\begin{aligned} \tilde{\mathcal {T}}(s_{1},z_{1}; ds_{2},z_{2})=\left\{ \begin{array}{ll} \frac{1-h(s_{2})}{1-h(s_{1})} \big (\mathcal {T}- h\otimes \nu \big ) (s_{1},ds_{2})&{} z_{1}=z_{2}=0 , \\ \frac{h(s_{2})}{1-h(s_{1})} \big (\mathcal {T}- h \otimes \nu \big ) (s_{1},ds_{2})&{} z_{1}=1-z_{2}=0,\\ \big (1- h(s_{2})\big ) \nu (ds_{2}) &{} z_{1}=1- z_{2}=1 , \\ h(s_{2}) \nu (ds_{2}) &{} z_{1}=z_{2}=1. \end{array} \right. \end{aligned}$$
Given a measure \(\mu \) on \(\Sigma \), we refer to its splitting \(\tilde{\mu }\) as the measure on \(\tilde{\Sigma }\) given by
$$\begin{aligned} \tilde{\mu }(ds,z)= \chi (z=0)\big (1-h(s)\big )\mu (ds)+\chi (z=1)h(s)\mu (ds). \end{aligned}$$
(2.2)
In particular, the split chain is taken to have initial distribution given by the splitting of the initial distribution for the original (pre-split) chain. The set \(\Sigma \times 1\) is an atom since the transition measure from \((s_1,1)\) is independent of \(s_1\). Moreover, it is a recurrent atom because our original process is exponentially ergodic to \(\Psi _{\infty ,\lambda }\) (see Appendix 8), and, as a consequence, the split chain is exponentially ergodic with respect to the invariant state \(\tilde{\Psi }_{\infty ,\lambda }\) (see Part (2) of Proposition 2.3) which has \(\tilde{\Psi }_{\infty ,\lambda }(\Sigma \times 1)=\Psi _{\infty ,\lambda }(h)>0\). Notice that the conditional probability that \(z_2=1\) given \(s_1, z_1, s_2\) is determined by a coin with heads-probability \(h(s_2)\).
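The four-case kernel above is easy to realize concretely. The sketch below is a toy three-state chain with constant \(h\) and uniform \(\nu\) (all numbers are our own choices, purely for illustration); it builds \(\tilde{\mathcal T}\), checks that \(\Sigma\times 1\) is an atom, that marginalizing the split chain over the binary variable recovers \(\mathcal T\), and that the splitting of the invariant measure is invariant for \(\tilde{\mathcal T}\).

```python
# Toy Nummelin splitting of a three-state chain (illustrative numbers, not
# the paper's resolvent chain).  Split states are pairs (s, z).
T  = [[0.5, 0.3, 0.2],
      [0.2, 0.5, 0.3],
      [0.3, 0.2, 0.5]]
nu = [1/3, 1/3, 1/3]                 # minorizing probability measure
h  = [0.6, 0.6, 0.6]                 # T[s1][s2] >= h[s1]*nu[s2] = 0.2 holds

S = [(s, z) for s in range(3) for z in (0, 1)]

def T_split(s1, z1, s2, z2):
    """Four-case kernel: move with (T - h (x) nu)/(1 - h) from z1 = 0 and
    with nu from z1 = 1, then flip the z2-coin with heads-probability h(s2)."""
    p = (T[s1][s2] - h[s1]*nu[s2]) / (1 - h[s1]) if z1 == 0 else nu[s2]
    return p * (h[s2] if z2 == 1 else 1 - h[s2])

rows = {u: [T_split(*u, *v) for v in S] for u in S}

# Sigma x {1} is an atom: the transition row does not depend on s1.
atom_ok = rows[(0, 1)] == rows[(1, 1)] == rows[(2, 1)]

# Marginalizing z with the splitting weights (1 - h, h) recovers T.
def marginal(s1, s2):
    return sum((h[s1] if z1 else 1 - h[s1]) * T_split(s1, z1, s2, z2)
               for z1 in (0, 1) for z2 in (0, 1))

# This T is doubly stochastic, so its invariant measure pi is uniform; the
# splitting of pi is invariant for the split kernel.
pi_split = {(s, z): (1/3)*(h[s] if z else 1 - h[s]) for s, z in S}
flow = {v: sum(pi_split[u]*T_split(*u, *v) for u in S) for v in S}
print(atom_ok, max(abs(flow[v] - pi_split[v]) for v in S))
```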
Using the law for the split chain \((\tilde{\sigma }_{n})\), we may construct a split process \((\tilde{S}_{t})\in \tilde{\Sigma }\) and a sequence of times \(\tilde{\tau }_{n}\) with the recipe below. We refer to pages 1302 and 1306 of [17] for more discussion on the construction. The \(\tilde{\tau }_{n}\) should be thought of as the partition times \(\tau _{n}\) embedded in the split statistics, although we temporarily denote them differently to emphasize their axiomatic role in the construction of the split process. Let \(\tilde{\tau }_{n}\) and \(\tilde{S}_{t}=(S_{t},Z_{t})\) be such that
  1. \(0=\tilde{\tau }_{0}\), \(\tilde{\tau }_{n}\le \tilde{\tau }_{n+1}\), and \(\tilde{\tau }_{n}\rightarrow \infty \) almost surely.

  2. The chain \((\tilde{S}_{\tilde{\tau }_{n}})\) has the same law as \((\tilde{\sigma }_{n})\).

  3. For \(t\in [\tilde{\tau }_{n},\tilde{\tau }_{n+1})\), \(Z_{t}=Z_{\tilde{\tau }_{n}}\).

  4. Conditioned on the information known up to time \(\tilde{\tau }_{n}\) for \(\tilde{S}_{t}\), \(t\in [0,\tilde{\tau }_{n}]\) and \(\tilde{\tau }_{m}\), \(m\le n\), and also the value \(\tilde{S}_{\tilde{\tau }_{n+1}}\), the law for the trajectories \(S_{t}\), \(t\in [\tilde{\tau }_{n},\tilde{\tau }_{n+1}]\) (which refers also to the length \(\tilde{\tau }_{n+1}-\tilde{\tau }_{n}\)) agrees with the law for the original process conditioned on knowing the values \(S_{\tilde{\tau }_{n}}\) and \(S_{\tilde{\tau }_{n+1}}\).
The marginal distribution for the first component \(S_{t}\) agrees with the original process, and the increments \(\tilde{\tau }_{n+1}-\tilde{\tau }_{n}\) are independent mean one exponential random variables that are independent of \((S_{t})\). Of course, the times \(\tilde{\tau }_{n}\) are not independent of the process \(\tilde{S}_{t}\), and we note that the increment \( \tilde{\tau }_{n+1}-\tilde{\tau }_{n}\) is not necessarily exponential when conditioned on the state \(\tilde{S}_{\tilde{\tau }_{n}}\). The process \(\tilde{S}_{t}\) is not Markovian although, as emphasized in [21], the process \((S_{t},Z_{t},S_{\tau (t)})\in \Sigma \times \{0,1\}\times \Sigma \) is Markovian, where \(\tau (t)\) is the first partition time \(\tilde{\tau }_{n}\) following time \(t\). Importantly, the strong Markov property for \(\tilde{S}_{t}\) does hold for the times \(\tilde{\tau }_{n}\); see [17, Remark 2.5]. We now drop the tilde from \(\tilde{\tau }_{n}\), and use \((\tilde{\sigma }_{n})\) to denote the sequence \((\tilde{S}_{\tau _{n}})\). We refer to the statistics of the split process by \(\tilde{\mathbb {E}}^{(\lambda )}\) and \(\tilde{\mathbb {P}}^{(\lambda )}\) for expectations and probabilities, respectively.

Now that we have defined the split process \(\tilde{S}_{t}\), we can proceed to define the “life cycles”. Let \(R_{m}'\) be the value \(\tau _{\tilde{n}_{m}}\) for \(\tilde{n}_{m}=\min \{ n\in \mathbb {N}\,\big |\,\sum _{k=0}^{n}\chi (Z_{\tau _k}= 1) =m \} \). In other words, \(R_{m}'\) is the \(m\)th partition time to visit the atom set \(\Sigma \times 1\), and we use the convention that \(R_{0}'=0\). Define \(R_{m}\), \(m\ge 1\), to be the partition time following \(R_{m}'\). The \(m\)th life cycle is the time interval \([R_{m},R_{m+1})\). Intuitively, it may at first seem more natural to define \(S_{R_{m}'}\) as the beginning of the life cycle. However, the distribution for \(R_{1}'\) will depend on the initial distribution \(\tilde{S}_{0}\). It is better to consider the beginning of the life cycle to be the partition time \(R_{m}\) following \(R_{m}'\), which has distribution \(\tilde{\nu }\) with respect to information known up to time \(R_{m}'\). Although the conditional distribution for \(\tilde{S}_{R_{m}}\) is independent of the value \(\tilde{S}_{R_{m}'}\in \Sigma \times 1\), successive life cycles \([R_{n-1},R_{n})\), \([R_{n},R_{n+1})\) are obviously not independent since, for instance, there is almost sure convergence \(\lim _{t\nearrow R_{n}} S_{t}=S_{R_{n}}\). Let \(d\mathbf {N}_{t}\) be the counting measure on \({\mathbb R}^{+}\) such that \(\int _{(t_{1},t_{2}]} d\mathbf {N}_{r}=\mathbf {N}_{t_{2}}-\mathbf {N}_{t_{1}}\) for \(0\le t_{1}<t_{2}\), i.e., the number of partition times over the interval \((t_{1},t_{2}]\). The following proposition lists some independence properties that follow closely from the construction of the split process. The measure \(\nu \) in the statement of Proposition 2.1 can be regarded as a generic normalized measure satisfying (2.1) for some \(h:\Sigma \rightarrow [0,1]\), although we will choose it to be of the specific form in Convention 2.2 later in the text.
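In code, extracting the life-cycle boundaries from a realization of the split chain is a one-pass scan. The sketch below (a hypothetical helper with our own naming) takes the partition times \(\tau_n\) and the atom flags \(\chi(Z_{\tau_n}=1)\) and returns the pairs \((R_m', R_m)\): the \(m\)th visit to the atom and the partition time that follows it.

```python
def life_cycle_times(taus, zs):
    """Given partition times taus[n] = tau_{n+1} and atom flags
    zs[n] = chi(Z at that partition time is 1), return the pairs
    (R'_m, R_m): the m-th partition time in the atom set and the next
    partition time, which begins the m-th life cycle."""
    pairs = []
    for n, z in enumerate(zs):
        if z == 1 and n + 1 < len(taus):
            pairs.append((taus[n], taus[n + 1]))
    return pairs

# Example: the atom is visited at the 2nd and 4th partition times.
taus = [0.5, 1.2, 2.0, 3.1, 4.0]
zs   = [0,   1,   0,   1,   0]
print(life_cycle_times(taus, zs))   # [(1.2, 2.0), (3.1, 4.0)]
```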

Proposition 2.1

  1. The distribution for \(\tilde{S}_{R_{n}}\) is \(\tilde{\nu }\) when conditioned on all information known up to time \(R_{n}'\): \(\tilde{\mathcal {F}}_{R_{n}'}\).

  2. The trajectories \(\big (S_{t},\, d\mathbf {N}_{t} : \, t\in [R_{n},R_{n+1}'] \big ) \) are i.i.d. for \(n\ge 1\), and \(\big (S_{t},\, d\mathbf {N}_{t} : \, t\in [R_{n},R_{n+1}'] \big ) \) is independent of \(\big (\tilde{S}_{t},\,d\mathbf {N}_{t}: \, t\notin (R_{n}',R_{n+1}) \big ) \).

  3. The trajectory \(\big (\tilde{S}_{t},\,d\mathbf {N}_{t}: \, t\in [R_{n},R_{n+1}] \big ) \) is independent of \(\big (\tilde{S}_{t},\, d\mathbf {N}_{t}: \, t\notin (R_{n}',R_{n+2}) \big ) \). In particular, \(\big (\tilde{S}_{t},\, d\mathbf {N}_{t}: \, t\in [R_{n},R_{n+1}] \big ) \) is independent of \(\big (\tilde{S}_{t},\, d\mathbf {N}_{t}: \, t\in [R_{m},R_{m+1}] \big ) \) for \(|n-m|\ge 2\).

Proof

Statement (1), which is given in [21, Prop. 2.13], follows immediately from the construction. Statements (2) and (3) follow from Part (1), the strong Markov property at the times \(R_{n}\), and the independence of the partition times from the past [21, Prop. 2.6]. For instance, \(\tilde{S}_{R_{n}}\) has distribution \(\tilde{\nu }\) independently of \(\big (S_{t},\, d\mathbf {N}_{t} : \, t\in [0,R_{n}'] \big ) \) by Part (1). By the strong Markov property for \(\tilde{S}_{t}\) at the time \(R_{n}\), the trajectory \(\big (S_{t} : \, t\in [R_{n},R_{n+1}'] \big ) \) is independent of \(\big (\tilde{S}_{t},\, d\mathbf {N}_{t} : \, t\in [0,R_{n}'] \big ) \) when given the state \(\tilde{S}_{R_{n}}\) and has the same law as \(\big (S_{t}: \, t\in [0 ,R_{1}'] \big ) \) when \(\tilde{S}_{0}\) has distribution \(\tilde{\nu }\). The partition times \(\tau _{m}\) over the interval \([R_{n},R_{n+1}']\), encoded by \(\int _{R_{n}}^{t} d\mathbf {N}_{r}\) for \(t\in [R_{n},R_{n+1}'] \), are independent of \(\big (S_{t},\, d\mathbf {N}_{t} : \, t\in [0,R_{n}'] \big ) \) by [21, Prop. 2.6] (and also independent of the process \(S_{t}\) for all \(t\in {\mathbb R}^{+}\)). \(\square \)

Unfortunately, notation multiplies when the splitting structure is invoked. For easier reference, we list the following frequently used symbols:
  • \(\tilde{S}_{t}=(S_{t},Z_{t})\) — State of the split process at time \(t\)

  • \(\tau _{m}\in {\mathbb R}^{+}\) — \(m\)th partition time

  • \(\tilde{\sigma }_{m}= \tilde{S}_{\tau _{m}}\) — \(m\)th state of the split chain

  • \((\sigma _{m} , \zeta _{m}) =\tilde{\sigma }_{m}\) — \(\sigma _{m}\) and \(\zeta _{m}\) are the state and binary components, respectively, of \(\tilde{\sigma }_{m}\)

  • \(\mathbf {N}_{t}\in \mathbb {N}\) — Number of partition times \(\tau _{m}\), \(m\ge 1\), to occur up to time \(t\)

  • \(R_{m}' \in {\mathbb R}^{+}\) — \(m\)th partition time visiting the set \(\Sigma \times 1\)

  • \(R_{m} \in {\mathbb R}^{+}\) — Partition time succeeding \(R_{m}'\) and the beginning of the \(m\)th life cycle

  • \(\tilde{N}_{t} \in \mathbb {N}\) — Number of returns to the atom up to time \(t\)

  • \(\tilde{n}_{m} \in \mathbb {N}\) — Number of partition times in the interval \((0,R_{m}']\)

  • \(\mu \rightarrow \tilde{\mu }\) — The splitting of a measure \(\mu \) on \(\Sigma \) as defined in (2.2)

  • \(\mathcal {F}_{t}\) — Information up to time \(t\) for the original process \(S_{r}\) and the \(\tau _{m}\)

  • \(\tilde{\mathcal {F}}_{t}\) — Information up to time \(t\) for the split process \(\tilde{S}_{r}\) and the \(\tau _{m}\)

  • \(\tilde{\mathcal {F}}_{t}'\) — Information for \(\tilde{S}_{t}\) and the \(\tau _{m}\) before time \(R_{n+1}\), where \(R_{n}'\le t<R_{n+1}'\), plus knowledge of the time \( R_{n+1}\) itself

If \(\mathbf {t}\) is a partition time, e.g., \(\mathbf {t}=\tau _{n}\) or \(\mathbf {t}=R_{n}\), the \(\sigma \)-algebra \(\tilde{\mathcal {F}}_{\mathbf {t}^{-}}\) will refer to all information before time \(\mathbf {t}\) plus the information that \(\mathbf {t}\) is a partition time.

We will henceforth attach the subscript \(\lambda \) to the transition map \( \mathcal {T}\) to emphasize the dependence of the dynamics on this parameter. There is some flexibility in the choice of \(\nu \) and \(h\) in the criterion (2.1), although choosing them to be independent of \(\lambda >0\) adds a little extra constraint. By Part (1) of Proposition 2.3, we can select a pair \(\nu \), \(h\) that is independent of \(\lambda \), and where both are functions of the energy. We will use the symbol \(\nu \) for both the measure and the corresponding density.

Convention 2.2

We take \(\nu \) and \(h\) of the form
$$\begin{aligned} h(s)= \mathbf {u} \frac{ \chi \big (H(s)\le l \big )}{ U} \quad \text {and} \quad \nu (ds)= ds \frac{ \chi \big (H(s)\le l \big )}{U}, \end{aligned}$$
where \(l:=1+2\sup _{x}V(x)\), \(U>0\) is the normalization constant of \(\nu \), and \(\mathbf {u}\in (0,U)\) is from Part (1) of Proposition 2.3.
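As a concrete illustration (with an example potential of our own choosing, not one specified in the paper): for \(V(x)=\frac{1}{2}(1+\cos 2\pi x)\) on the unit torus, \(l=1+2\sup _{x}V(x)=3\), and the normalization \(U\) is the phase-space area of the sublevel set \(\{H\le l\}\), i.e. \(U=\int _{0}^{1}2\sqrt{2(l-V(x))}\,dx\). A midpoint rule suffices:

```python
import math

# Example periodic potential (our choice, for illustration only).
V = lambda x: 0.5*(1 + math.cos(2*math.pi*x))
l = 1 + 2*1.0                       # l = 1 + 2 sup V = 3

# U = Lebesgue measure of {(x, p) in [0,1) x R : p^2/2 + V(x) <= l}
#   = int_0^1 2*sqrt(2*(l - V(x))) dx, computed by the midpoint rule.
n = 10000
U = sum(2*math.sqrt(2*(l - V((k + 0.5)/n))) for k in range(n)) / n
print(U)   # lies between 2*sqrt(4) = 4 and 2*sqrt(6) ~ 4.90
```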

The compact support of \(h:\Sigma \rightarrow [0,1]\) implies that the extended state space for the split dynamics is effectively \( (\Sigma \times 0) \cup ({\mathrm{supp}}(h) \times 1) \subset \tilde{\Sigma } \), since other states in \( \tilde{\Sigma }=\Sigma \times \{0,1\} \) will not be visited. Any supremum, minimum, etc. over \(\tilde{\Sigma }\) refers to this contracted set. Parts (2) and (3) of the proposition below are elementary consequences of the splitting structure defined above, and the proof is contained in Sect. 7.1.

Proposition 2.3

  1. There is a constant \(\mathbf {u} >0\) such that the \(h\) and \(\nu \) in Convention 2.2 satisfy \(\mathcal {T}_{\lambda }(s,ds' )\ge h(s)\nu (ds')\,\) for all \(s,s'\in \Sigma \) and \(\lambda <1\). Also, the transition measures \(\mathcal {T}_{\lambda }(s,ds')\) have densities over the domains \(\{s'\in \Sigma \,\big |\, H(s')\ne H(s)\}\), which have the following bound
     $$\begin{aligned} \sup _{\lambda \le 1}\mathop {{\mathrm {ess}}\,{\mathrm {sup}}}\limits _{ \begin{array}{c} H(s)>l\\ H(s)\ne H(s') \end{array}}\frac{{\mathcal {T}}_{\lambda }(s,ds^{\prime })}{ds^{\prime }}<\infty . \end{aligned}$$

  2. The invariant state of both the split chain \((\tilde{\sigma }_{n})\) and the split process \((\tilde{S}_{t})\) is the splitting of the invariant state of the original process, i.e.,
     $$\begin{aligned} \tilde{\Psi }_{\infty ,\lambda }(s,0)= \big (1-h(s)\big )\Psi _{\infty ,\lambda }(s)\quad \text {and}\quad \tilde{\Psi }_{\infty ,\lambda }(s,1)= h(s)\Psi _{\infty ,\lambda }(s). \end{aligned}$$
     Thus, the “atom” has measure \( \int _{\Sigma }ds h(s)\Psi _{\infty ,\lambda }(s) >0 \).

  3. If \(\mathbf {t}\) is a partition time, the distribution for \(\tilde{S}_{\mathbf {t}}\) conditioned on \(\tilde{\mathcal {F}}_{\mathbf {t}^{-}}\) is the splitting of the \(\delta \)-distribution at \(S_{\mathbf {t}}\):
     $$\begin{aligned} \tilde{\delta }_{ S_{\mathbf {t}} }(s,z)=\delta (s-S_{\mathbf {t}})\big (\chi (z=0)\big (1-h(S_{\mathbf {t}})\big )+\chi (z=1)h(S_{\mathbf {t}})\big ). \end{aligned}$$
     In particular, \(\tilde{\mathbb {P}}^{(\lambda )}\big [Z_{\mathbf {t} }= 1 \,\big | \,\tilde{\mathcal {F}}_{\mathbf {t}^{-}}\big ]=h(S_{\mathbf {t}})\). The strong Markov property at the time \(\mathbf {t}\) and stationarity give us that
     $$\begin{aligned} \mathcal {L}\big ((\tilde{S}_{\mathbf {t}+r})\,\big |\, \tilde{\mathcal {F}}_{\mathbf {t}^{-}}\big )= \mathcal {L}_{\tilde{\delta }_{ S_{\mathbf {t}}}}\big ((\tilde{S}_{r }) \big ), r\in {\mathbb R}^{+}, \end{aligned}$$
     where \(\mathcal {L}_{\mu }\) refers to the law starting from the distribution \(\mu \).
Besides the nearly independent behavior of the process \(\tilde{S}_{t}\) over the intervals \([R_{m},R_{m+1})\), the payoff for introducing the splitting structure includes the closed formulas in Proposition 2.4. Part (2) of Proposition 2.4 is a special case of [21, Prop. 2.20], which applies also to null-recurrent processes. For Parts (3) and (4) of the proposition below, \(\mathfrak {R}^{(\lambda )}\) is the reduced resolvent of the backward generator \(\mathcal {L}^{*}_{\lambda }\),
$$\begin{aligned} \mathfrak {R}^{(\lambda )}g=\int \limits _{0}^{\infty }dr e^{r\mathcal {L}^{*}_{\lambda }}(g), \end{aligned}$$
which operates on \(g\in L^{\infty }(\Sigma )\) with \(\Psi _{\infty ,\lambda }(g)=0 \). The reduced resolvent is well-defined since the process \(S_{t}\) is exponentially ergodic for any fixed \(\lambda >0\). As \(\lambda \searrow 0\) the expression in Part (4) is related to the diffusion constant \(\kappa \) appearing in [7, Theorem 1.1].
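On a finite state space the reduced resolvent can be computed directly: for a generator matrix \(Q\) with stationary distribution \(\pi\) and \(\pi(g)=0\), \(\mathfrak{R}g=\int_{0}^{\infty}e^{tQ}g\,dt\) is the unique solution \(x\) of \(Qx=-g\) with \(\pi(x)=0\). In the two-state sketch below (numbers of our own choosing, a toy stand-in for the dynamics here), \(g\) is an eigenvector of \(Q\), so the integral is available in closed form and the quadrature can be checked against it.

```python
# Two-state generator Q = [[-a, a], [b, -b]] with stationary pi = (b, a)/(a+b).
# The mean-zero vector g = (a, -b) is an eigenvector: Q g = -(a+b) g, so
# Rg = int_0^inf e^{tQ} g dt = g/(a+b) exactly.
a, b = 1.0, 2.0
g = [a, -b]
pi = [b/(a+b), a/(a+b)]

Rg_exact = [g[0]/(a+b), g[1]/(a+b)]

# Numerical quadrature of int_0^T e^{tQ} g dt via Euler steps on v' = Q v.
v = g[:]
dt, T = 1e-3, 30.0
Rg_quad = [0.0, 0.0]
for _ in range(int(T/dt)):
    Rg_quad[0] += v[0]*dt
    Rg_quad[1] += v[1]*dt
    v = [v[0] + dt*(-a*v[0] + a*v[1]),
         v[1] + dt*( b*v[0] - b*v[1])]

# The result solves Q x = -g with pi(x) = 0 (third printed number ~ 0).
print(Rg_quad, Rg_exact, pi[0]*Rg_quad[0] + pi[1]*Rg_quad[1])
```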

Proposition 2.4

  1. For \(g\in L^{\infty }(\tilde{\Sigma })\),
     $$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\sum _{m=0}^{\tilde{n}_{1}} g(\tilde{\sigma }_{m}) \Big ]=\tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\sum _{m=1}^{\tilde{n}_{1}+1} g(\tilde{\sigma }_{m}) \Big ]=\frac{ \int _{\tilde{\Sigma }}d\tilde{s}\tilde{\Psi }^{(\lambda )}_{\infty }(\tilde{s}) g(\tilde{s})}{\int _{\Sigma }ds\Psi ^{(\lambda )}_{\infty }(s) h(s) }. \end{aligned}$$
     In particular, if \(g\in L^{\infty }(\Sigma )\) does not depend on the binary variable, then the numerator on the right side above is equal to \(\int _{\tilde{\Sigma }}d\tilde{s}\tilde{\Psi }^{(\lambda )}_{\infty }(\tilde{s}) g(\tilde{s})=\int _{\Sigma }ds \Psi _{\infty ,\lambda }(s) g(s)\).

  2. For \(g\in L^{\infty }(\Sigma )\),
     $$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\int \limits _{0}^{R_{1}}dr g(S_{r}) \Big ]=\frac{ \int _{\Sigma }ds\Psi _{\infty ,\lambda }(s) g(s) }{\int _{\Sigma }ds\Psi _{\infty ,\lambda }(s) h(s)}. \end{aligned}$$

  3. For \(g\in L^{\infty }(\Sigma )\) with \(\Psi _{\infty ,\lambda }(g)=0\) and \(s_{1},s_{2}\in \Sigma \),
     $$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\delta }_{s_{1}}}^{ (\lambda )} \Big [\int \limits _{0}^{R_{1}}dr g(S_{r}) \Big ]- \tilde{\mathbb {E}}_{\tilde{\delta }_{s_{2}}}^{ (\lambda )} \Big [\int \limits _{0}^{R_{1}}dr g(S_{r}) \Big ] =\big (\mathfrak {R}^{(\lambda )}g\big )(s_{1})-\big (\mathfrak {R}^{(\lambda )}g\big )(s_{2}), \end{aligned}$$
     where \(\tilde{\delta }_{s}\) is the splitting of the \(\delta \)-measure at \(s\in \Sigma \).

  4. For \(g\in L^{\infty }(\Sigma )\) with \(\Psi _{\infty ,\lambda }(g)=0\),
     $$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\int \limits _{0}^{R_{1}}dr g(S_{r}) \int \limits _{r}^{R_{2}}dr'g(S_{r'}) \Big ]= \frac{ \int _{\Sigma }ds\Psi _{\infty ,\lambda }(s) g(s)\big (\mathfrak {R}^{(\lambda )}g\big )(s)}{ \int _{\Sigma }ds\Psi _{\infty ,\lambda }(s)h(s)}. \end{aligned}$$

Proof

Part (1): This follows as a general fact for split chains when the original chain \(\sigma _{n}\) is positive recurrent with normalizable invariant measure \(\Psi _{\infty ,\lambda }\). As mentioned in the proof of [28, Theorem 3], the measure \(\beta \) on \(\tilde{\Sigma }\) given by
$$\begin{aligned} \beta (g)= \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\sum _{m=0}^{\tilde{n}_{1}} g(\tilde{\sigma }_{m}) \Big ], \quad g\in L^{\infty }(\tilde{\Sigma }), \end{aligned}$$
(2.3)
satisfies \( \beta (h)=1 \) and is invariant for the split chain dynamics, i.e., \(\beta \tilde{\mathcal {T}}_{\lambda }=\beta \). Since the dynamics is positive recurrent with invariant state \(\tilde{\Psi }^{(\lambda )}_{\infty }\), these features uniquely determine the above measure by the explicit form
$$\begin{aligned} \beta (g)=\frac{ \int _{\tilde{\Sigma }}d\tilde{s}\tilde{\Psi }^{(\lambda )}_{\infty }(\tilde{s}) g(\tilde{s})}{\int _{\Sigma }ds\Psi ^{(\lambda )}_{\infty }(s) h(s)}. \end{aligned}$$
The distribution for \(\tilde{\sigma }_{m}\) when \(m=0\) and \(m=\tilde{n}_{1}+1\) is \(\tilde{\nu }\), so the summation of \( g(\tilde{\sigma }_{m})\) over \([1,\tilde{n}_{1}+1]\) rather than \([0,\tilde{n}_{1}]\) in (2.3) yields the same result.
Part (2): Let \(g_{n}^{(\lambda )}:\Sigma ^{2}\rightarrow {\mathbb R}\) and \(\mathbf {g}^{(\lambda )}:\Sigma \rightarrow {\mathbb R}\) be defined as
$$\begin{aligned} g_{n}^{(\lambda )}(s,s')&:= \mathbb {E}_{s}^{(\lambda )}\Big [\Big (\int \limits _{0}^{\tau _{1}}dr g(S_{r})\Big )^{n}\,\Big |\,s'=S_{\tau _{1} } \Big ]\\ \mathbf {g}^{(\lambda )}(s)&:= \mathbb {E}_{s}^{(\lambda )}\Big [\int \limits _{0}^{\tau _{1}}dr g(S_{r}) \Big ]. \end{aligned}$$
Also define \(\tilde{\mathbf {g}}^{(\lambda )}:\tilde{\Sigma }\rightarrow {\mathbb R}\) analogously to \(\mathbf {g}^{(\lambda )}(s)\) with \(\mathbb {E}_{s}^{(\lambda )}\) replaced by \(\tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\).
We have the following equalities:
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\int \limits _{0}^{R_{1}}dr g(S_{r}) \Big ]&= \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\sum _{n=0}^{\tilde{n}_{1}}\tilde{\mathbb {E}}^{(\lambda )}\Big [\int \limits _{\tau _{n}}^{\tau _{n+1}}dr g(S_{r})\,\Big |\tilde{\sigma }_{n},\, \tilde{\sigma }_{n+1} \Big ] \Big ] =\tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\sum _{n=0}^{\tilde{n}_{1}} g_{1}^{(\lambda )}(\sigma _{n},\sigma _{n+1})\Big ]\nonumber \\&= \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\sum _{n=0}^{\tilde{n}_{1}} \tilde{\mathbf {g}}^{(\lambda )}(\tilde{\sigma }_{n})\Big ]=\frac{ \int _{\tilde{\Sigma }}d\tilde{s}\tilde{\Psi }_{\infty ,\lambda }(\tilde{s}) \tilde{\mathbf {g}}^{(\lambda )}(\tilde{s})}{ \int _{\Sigma }ds\Psi _{\infty ,\lambda }(s) h(s)}, \end{aligned}$$
(2.4)
where the second equality holds because the statistics for \(S_{r}\) over an interval \((\tau _{n},\tau _{n+1})\) given the values \(\tilde{\sigma }_{n}=(\sigma _{n},\zeta _{n}),\,\tilde{\sigma }_{n+1}=(\sigma _{n+1},\zeta _{n+1})\) is independent of \(\zeta _{n}\), \(\zeta _{n+1}\) and is the same for the split and the original dynamics. The fourth equality is from Part (1).
The numerator of the expression on the right side of (2.4) can be rewritten as follows:
$$\begin{aligned} \int \limits _{\tilde{\Sigma }}d\tilde{s}\tilde{\Psi }_{\infty ,\lambda }(\tilde{s}) \tilde{\mathbf {g}}^{(\lambda )}(\tilde{s})&= \int \limits _{\Sigma }ds\Psi _{\infty ,\lambda }(s) \mathbf {g}^{(\lambda )}(s) = \int \limits _{\Sigma }ds\Psi _{\infty ,\lambda }(s) \mathbb {E}_{s}^{(\lambda )}\Big [\int \limits _{0}^{\tau _{1}}dr g(S_{r}) \Big ]\nonumber \\&= \int \limits _{\Sigma }ds\Psi _{\infty ,\lambda }(s) \mathbb {E}_{s}^{(\lambda )}\Big [\int \limits _{0}^{\infty }dr e^{-r}g(S_{r}) \Big ] \nonumber \\&= \int \limits _{0}^{\infty } dr e^{-r} \left( \int \limits _{\Sigma }ds \Psi _{\infty ,\lambda }(s) \mathbb {E}_{s}^{(\lambda )}\big [g(S_{r}) \big ] \right) \nonumber \\&= \int \limits _{0}^{\infty } dr e^{-r} \left( \int \limits _{\Sigma }ds\Psi _{\infty ,\lambda }(s) g(s) \right) =\int \limits _{\Sigma }ds\Psi _{\infty ,\lambda }(s) g(s). \end{aligned}$$
(2.5)
The first equality uses that \(\tilde{\Psi }_{\infty ,\lambda }\) has the split form in Part (2) of Proposition 2.3, and the third equality holds since \(\tau _{1}\) is a mean one exponential independent of \(S_{t}\) in the original statistics. The fourth equality is Fubini, and the fifth is due to the stationarity of \(\Psi _{\infty ,\lambda }\).
Part (3): The reduced resolvent \(\mathfrak {R}^{(\lambda )}\) is the pointwise limit given by
$$\begin{aligned} \big (\mathfrak {R}^{(\lambda )}g\big )(s)&= \lim _{\gamma \searrow 0 }\mathbb {E}^{ (\lambda )}_{s}\Big [\int \limits _{0}^{\infty }dr e^{-r\gamma } g(S_{r}) \Big ] =\lim _{\gamma \searrow 0}\tilde{\mathbb {E}}^{ (\lambda )}_{\tilde{\delta }_{s}}\Big [\int \limits _{0}^{\infty }dre^{-r\gamma } g(S_{r}) \Big ], \end{aligned}$$
where the second equality embeds the expectation in the split statistics. However, for \(s_{1},s_{2}\in \Sigma \),
$$\begin{aligned} \Big (\tilde{\mathbb {E}}^{ (\lambda )}_{\tilde{\delta }_{s_{1}} }-\tilde{\mathbb {E}}^{ (\lambda )}_{\tilde{\delta }_{s_{2}} }\Big )\Big [\int \limits _{0}^{\infty }dr e^{-r\gamma } g(S_{r}) \Big ]=\Big (\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\delta }_{s_{1}} }-\tilde{\mathbb {E}}^{ (\lambda )}_{\tilde{\delta }_{s_{2}} }\Big )\Big [\int \limits _{0}^{R_{1}}dr e^{-r\gamma } g(S_{r}) \Big ] \end{aligned}$$
since the distribution for the state \(\tilde{S}_{R_{1}} \) is \(\tilde{\nu }\) regardless of the initial measure. Using the above equalities, we have that
$$\begin{aligned} \big (\mathfrak {R}^{(\lambda )}g\big )(s_{1})-\big (\mathfrak {R}^{(\lambda )}g\big )(s_{2})&= \lim \limits _{\gamma \searrow 0 }\Big (\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\delta }_{s_{1}} }-\tilde{\mathbb {E}}^{ (\lambda )}_{\tilde{\delta }_{s_{2}} }\Big )\Big [\int \limits _{0}^{R_{1}}dr e^{-r\gamma } g(S_{r}) \Big ]\\&=\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\delta }_{s_{1}}}\Big [\int \limits _{0}^{R_{1}}dr g(S_{r}) \Big ]-\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\delta }_{s_{2}}}\Big [\int \limits _{0}^{R_{1}}dr g(S_{r}) \Big ], \end{aligned}$$
where the limits are well-defined since the process \(\tilde{S}_{t}\) is positive-recurrent and hence \(\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{s}}[R_{1}]\) is finite for all \(\tilde{s}\in \tilde{\Sigma }\).
Part (4): Notice that
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\int \limits _{0}^{R_{1}}dr g(S_{r}) \int \limits _{r}^{R_{2}}dr'g(S_{r'}) \Big ]&= \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\sum _{n=0}^{\tilde{n}_{1}}\tilde{\mathbb {E}}^{(\lambda )}\Big [\int \limits _{\tau _{n}}^{\tau _{n+1} }dr g(S_{r}) \int \limits _{r}^{R_{2}}dr'g(S_{r'})\,\Big |\,\tilde{\mathcal {F}}_{\tau _{n}^{-} }\Big ] \Big ] \nonumber \\&= \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )} \Big [\sum _{n=0}^{\tilde{n}_{1}}\mathbf {f}^{(\lambda )}(\sigma _{n}) \Big ], \end{aligned}$$
(2.6)
where \(\mathbf {f}^{(\lambda )}:\Sigma \rightarrow {\mathbb R}\) is defined as
$$\begin{aligned} \mathbf {f}^{(\lambda )}(s):= \tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{ (\lambda )} \Big [\int \limits _{0}^{\tau _{1}}dr g(S_{r}) \int \limits _{r}^{R_{2}}dr'g(S_{r'}) \Big ]. \end{aligned}$$
The equality (2.6) is a consequence of Part (3) of Proposition 2.3 and uses the strong Markov property at the times \(\tau _{n}\) for \(n\in [0,\tilde{n}_{1}]\). The function \(\mathbf {f}^{(\lambda )}(s) \) can be rewritten as
$$\begin{aligned} \mathbf {f}^{(\lambda )}(s)&= \tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\int \limits _{0}^{\tau _{1}}drg(S_{r})\int \limits _{r}^{\tau _{1}}dv g(S_{v})+ \Big (\int \limits _{0}^{\tau _{1}}dr g(S_{r})\Big )\tilde{\mathbb {E}}^{ (\lambda )}\Big [\int \limits _{\tau _{1}}^{R_{2}}dr g(S_{r})\,\Big |\,\tilde{\mathcal {F}}_{\tau _{1}^{-}} \Big ] \Big ] \nonumber \\&= \tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\int \limits _{0}^{\tau _{1}}dr g(S_{r})\int \limits _{r}^{\tau _{1}}dv g(S_{v})+ \Big (\int \limits _{0}^{\tau _{1}}dr g(S_{r})\Big )\big (\mathfrak {R}^{(\lambda )}g\big )\big (S_{\tau _{1}}\big )+c \int \limits _{0}^{\tau _{1}}dr g(S_{r}) \Big ] \nonumber \\&= \mathbb {E}_{s}^{(\lambda )}\Big [\int \limits _{0}^{\tau _{1}}dr g(S_{r}) \big (\mathfrak {R}^{(\lambda )}g\big )\big (S_{r}\big ) +c\int \limits _{0}^{\tau _{1}}dr g(S_{r}) \Big ], \end{aligned}$$
(2.7)
where \(c\in {\mathbb R}\) is the constant such that \( (\mathfrak {R}^{(\lambda )}g)(s)+c=\tilde{\mathbb {E}}^{ (\lambda )}_{\tilde{\delta }_{s}}\big [\int _{0}^{R_{1}}dr g(S_{r})\big ]\) for all \(s\in \Sigma \), which exists by Part (3). The value for \(c\) depends on \(g\) and the choice of \(\nu \), \(h\) defining the Nummelin splitting. For the second equality, we have used that
$$\begin{aligned} \tilde{\mathbb {E}}^{ (\lambda )}\Big [\int \limits _{\tau _{1}}^{R_{2}}dr g(S_{r})\,\Big |\,\tilde{\mathcal {F}}_{\tau _{1}^{-}} \Big ] =\tilde{\mathbb {E}}^{ (\lambda )}_{\tilde{\delta }_{S_{\tau _{1}}} }\Big [\int \limits _{0}^{R_{1}}dr g(S_{r}) \Big ] = \big (\mathfrak {R}^{(\lambda )}g\big )\big (S_{\tau _{1}}\big )+c, \end{aligned}$$
where the first equality is by Part (3) of Proposition 2.3. The third equality of (2.7) follows by replacing \(\tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\) with \( \mathbb {E}_{s}^{(\lambda )}\) and using a nested conditional expectation with respect to \(\mathcal {F}_{r}\) for \(r\le \tau _{1}\):
$$\begin{aligned} \mathbb {E}^{(\lambda )}\Big [\int \limits _{r}^{\tau _{1}}dv g(S_{v}) \!+\!\big (\mathfrak {R}^{(\lambda )}g\big )\big (S_{\tau _{1}}\big )\,\Big |\,\mathcal {F}_{r} \Big ] \!=\!\mathbb {E}_{S_{r}}^{(\lambda )}\Big [\int \limits _{0}^{\tau _{1}}dv g(S_{v}) \!+\!\big (\mathfrak {R}^{(\lambda )}g\big )\big (S_{\tau _{1}}\big ) \Big ]\!=\! \big (\mathfrak {R}^{(\lambda )}g\big )\big (S_{r}\big ). \end{aligned}$$
We can then plug our expression (2.7) for \( \mathbf {f}^{(\lambda )}(s)\) into (2.6) and invert the first two steps of the proof to obtain the first equality below:
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\int \limits _{0}^{R_{1}}dr g(S_{r}) \int \limits _{r}^{R_{2}}dr'g(S_{r'}) \Big ]&= \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\int \limits _{0}^{R_{1}}dr g(S_{r}) \big (\mathfrak {R}^{(\lambda )}g\big )\big (S_{r}\big ) +c \int \limits _{0}^{R_{1}}dr g(S_{r}) \Big ]\nonumber \\&= \frac{\int _{\Sigma }ds\Psi _{\infty ,\lambda }(s) \big [g(s)\big (\mathfrak {R}^{(\lambda )}g\big )(s) +cg(s) \big ]}{ \int _{\Sigma }ds\Psi _{\infty ,\lambda }(s) h(s)}. \end{aligned}$$
(2.8)
The second equality follows by Part (2), and the constant \(c\) disappears from the expression since \(\Psi _{\infty ,\lambda }(g)=0\). \(\square \)

The following proposition lists a few martingales related to the number \(\tilde{N}_{t}\) of returns to the atom up to time \(t\in {\mathbb R}^{+}\).

Proposition 2.5

For the split statistics, \( \tilde{N}_{t} - \sum _{n=1}^{\mathbf {N}_{t}}h(S_{\tau _{n}})\) is a martingale with respect to the filtration \(\tilde{\mathcal {F}}_{t}\). For the original statistics, \(\sum _{n=1}^{\mathbf {N}_{t}}h(S_{\tau _{n}}) - \int _{0}^{t}dr h(S_{r})\) is a martingale with respect to \(\mathcal {F}_{t}\). In particular,
$$\begin{aligned} \tilde{\mathbb {E}}^{(\lambda )}\big [\tilde{N}_{t} \big ]=\mathbb {E}^{(\lambda )}\Big [\int \limits _{0}^{t}dr h(S_{r}) \Big ]. \end{aligned}$$

Proof

The difference
$$\begin{aligned} \tilde{N}_{t} - \sum _{n=1}^{\mathbf {N}_{t}}h(S_{\tau _{n}})= \sum _{n=1}^{\mathbf {N}_{t}}\big (\chi (Z_{\tau _{n}}=1)-h(S_{\tau _{n}})\big ) \end{aligned}$$
is a martingale since for \(t< \tau _{n}\) the increments satisfy
$$\begin{aligned} \mathbb {E}\big [\chi (Z_{\tau _{n}}=1)-h(S_{\tau _{n}})\,\big |\,\tilde{\mathcal {F}}_{t} \big ]&= \mathbb {E}\big [\mathbb {P}^{(\lambda )}[Z_{\tau _{n}}=1\,|\, \tilde{\mathcal {F}}_{ \tau _{n}^{-}}]-h(S_{\tau _{n}}) \,\big |\,\tilde{\mathcal {F}}_{t} \big ]=0, \end{aligned}$$
(2.9)
where the second equality holds because \(\mathbb {P}^{(\lambda )}[Z_{\tau _{n}}=1\,|\, \tilde{\mathcal {F}}_{ \tau _{n}^{-}} ]=h(S_{\tau _{n}})\) by Part (3) of Proposition 2.3. The difference \(\sum _{n=1}^{\mathbf {N}_{t}}h(S_{\tau _{n}}) - \int _{0}^{t}dr h(S_{r})\) is a martingale according to the original law since the contributions \(h(S_{\tau _{n}})\) occur with Poisson rate \(1\). \(\square \)
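As an informal numerical aside (not used in the proofs), the identity \(\tilde{\mathbb {E}}^{(\lambda )}\big [\tilde{N}_{t} \big ]=\mathbb {E}^{(\lambda )}\big [\int _{0}^{t}dr h(S_{r}) \big ]\) is easy to check by Monte Carlo in a toy split chain. The two-state process, the unit switching rate, and the choice of \(h\) below are illustrative assumptions, not the model of the paper; partition times arrive at Poisson rate \(1\), and a return to the atom is a partition time marked \(Z=1\) with probability \(h(S)\):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the split chain (assumptions, not the paper's model):
# a two-state jump process S_t in {0, 1} switching at rate 1, partition
# times arriving at Poisson rate 1, and h(0) = 0.3, h(1) = 0.0.
h = [0.3, 0.0]
T = 5.0
n_paths = 20000

def simulate(rng):
    """One path: return (number of returns N_t, time integral of h(S_r))."""
    t, s, n_returns, h_integral = 0.0, 0, 0, 0.0
    while True:
        # two independent exponential clocks (state switch, partition time),
        # simulated jointly at total rate 2
        dt = rng.exponential(0.5)
        if t + dt > T:
            h_integral += (T - t) * h[s]
            return n_returns, h_integral
        h_integral += dt * h[s]
        t += dt
        if rng.random() < 0.5:
            s = 1 - s                    # state switch
        elif rng.random() < h[s]:
            n_returns += 1               # partition time marked Z = 1

mean_N, mean_I = np.mean([simulate(rng) for _ in range(n_paths)], axis=0)
print(mean_N, mean_I)
```

With the seed fixed, the two averages agree to within Monte Carlo error, matching the martingale identity of Proposition 2.5.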

3 The Frequency of Returns to the Atom

Sections 3.1 and 3.2 effectively bound the frequency of returns to the atom from above and below, respectively.

3.1 Bounding the Number of Returns to the Atom

Recall that \(\tilde{N}_{t}\) is defined for the split process as the number of returns to the atom up to time \(t\in {\mathbb R}^{+}\). We will now focus on bounding the expectation of \(\tilde{N}_{t}\) for \(t=\frac{T}{\lambda }\) in the limit of small \(\lambda \). By Proposition 2.5 the expectation of \( \tilde{N}_{t}\) with respect to the split statistics is equal to the expectation of \(\int _{0}^{t}dr h(S_{r})\) with respect to the original statistics. The time integral of the process \( h(S_{t})\) keeps track of the amount of time that \(S_{t}\) loiters in the low momentum region where \(h:\Sigma \rightarrow {\mathbb R}^+\) has support and the life cycles regenerate. However, it is useful to work with a process that serves the same purpose as \(\int _{0}^{t}dr h(S_{r})\) but that is easier to handle. A convenient option is the increasing part of the drift \(\mathbf {A}_{t}^{+}\) in the semi-martingale decomposition for \( \mathbf {Q}_{t}:=(2H_{t})^{\frac{1}{2}} \), which increases at a decaying rate away from the low momentum region; see the discussion below and Part (2) of Proposition 3.1. Functions of the energy \(H(x,p)=\frac{1}{2}p^{2}+V(x)\) have the advantage of being invariant under the Hamiltonian evolution, which makes energy-related quantities a desirable starting point for gaining some control over the typical behavior of the dynamics.

Define the functions \(\mathcal {E}_{\lambda }:{\mathbb R}\rightarrow {\mathbb R}^{+}\) and \(\mathcal {A}_{\lambda },\mathcal {V}_{\lambda ,n},\mathcal {V}_{\lambda ,n}^{+},\mathcal {K}_{\lambda ,n}:\Sigma \rightarrow {\mathbb R}\) as
$$\begin{aligned} \mathcal {E}_{\lambda }(p)&= \int \limits _{{\mathbb R}}dp^{\prime }\mathcal {J}_{\lambda }(p,p^{\prime }),\\ \mathcal {A}_{\lambda }(x,p)&= \int \limits _{{\mathbb R}}dp^{\prime } \Big (2^{\frac{1}{2}}H^{\frac{1}{2}}(x,p^{\prime })- 2^{\frac{1}{2}}H^{\frac{1}{2}}(x,p) \Big )\mathcal {J}_{\lambda }(p,p^{\prime }), \\ \mathcal {V}_{\lambda ,n}(x,p)&= \int \limits _{{\mathbb R}}dp^{\prime } \Big (2^{\frac{1}{2}}H^{\frac{1}{2}}(x,p^{\prime })- 2^{\frac{1}{2}}H^{\frac{1}{2}}(x,p) \Big )^{2n} \mathcal {J}_{\lambda }(p,p^{\prime }) , \\ \mathcal {V}_{\lambda ,n}^{+}(x,p)&= \int \limits _{{\mathbb R}}dp^{\prime } \Big | H^{\frac{1}{2}}(x,p^{\prime })- H^{\frac{1}{2}}(x,p)\Big |^{n}\chi \big (|p'|>|p|\big ) \mathcal {J}_{\lambda }(p,p^{\prime }) , \\ \mathcal {K}_{\lambda ,n}(x,p)&= \int \limits _{{\mathbb R}}dp^{\prime } \Big | H^{\frac{1}{2}}(x,p^{\prime })- H^{\frac{1}{2}}(x,p) -\frac{\mathcal {A}_{\lambda }(x,p)}{ \mathcal {E}_\lambda (p)}\Big |^{n} \mathcal {J}_{\lambda }(p,p^{\prime }). \end{aligned}$$
Also define \(\mathcal {A}_{\lambda }^{\pm }(s)= \max (\pm \mathcal {A}_{\lambda }(s),0)\) to be the positive and negative parts of \(\mathcal {A}_{\lambda }\). We will often denote \(\mathcal {V}_{\lambda ,1} \) as \(\mathcal {V}_{\lambda } \). Let \(\mathbf {M}_{t}\) and \(\mathbf {A}_{t}\) be the martingale and predictable parts in the semi-martingale decomposition of \(\mathbf {Q}_{t}\) in which both are initially zero:
$$\begin{aligned} \mathbf {Q}_{t}= \big (2 H_{t} \big )^{\frac{1}{2}}=\mathbf {Q}_{0}+\mathbf {M}_{t}+\mathbf {A}_{t}. \end{aligned}$$
The predictable component has the form \(\mathbf {A}_{t}=\int _{0}^{t}dr\mathcal {A}_{\lambda }(X_r,P_r) \). By defining \(\mathbf {A}_{t}^{\pm }:=\int _{0}^{t}dr\mathcal {A}_{\lambda }^{\pm }(X_r,P_r) \), the predictable component can be written as the difference \(\mathbf {A}_{t}=\mathbf {A}_{t}^{+}-\mathbf {A}_{t}^{-}\). The martingale \(\mathbf {M}_{t}\) has predictable quadratic variation \(\langle \mathbf {M}\rangle _{t}=\int _{0}^{t}dr\mathcal {V}_{\lambda }(X_{r},P_{r})\).
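As an informal sanity check of such a decomposition (outside the proofs), one can simulate a toy pure-jump momentum process, accumulate the compensator \(\int _{0}^{t}dr\mathcal {A}(S_{r})\) along each path, and verify numerically that \(\mathbf {Q}_{t}-\mathbf {Q}_{0}-\mathbf {A}_{t}\) has mean zero. The rate-one jumps, Gaussian jump density, and constant potential value below are hypothetical stand-ins for \(\mathcal {J}_{\lambda }\) and \(V\), chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical ingredients (not the paper's rates): jumps at Poisson rate 1,
# p -> p + v with v ~ N(0, 1), and energy H(p) = p^2/2 + V with constant V = 1.
V = 1.0
vgrid = np.linspace(-10.0, 10.0, 4001)
jdens = np.exp(-vgrid**2 / 2) / np.sqrt(2 * np.pi)
dv = vgrid[1] - vgrid[0]

def Qfun(p):
    return np.sqrt(p**2 + 2.0 * V)             # Q = (2 H)^{1/2}

def drift(p):
    # compensator density A(p) = E_v[Q(p + v) - Q(p)] at jump rate 1
    return np.sum(jdens * (Qfun(p + vgrid) - Qfun(p))) * dv

T, n_paths = 2.0, 4000
M_T = []
for _ in range(n_paths):
    t, p, A = 0.0, 0.5, 0.0
    Q0 = Qfun(p)
    while True:
        dt = rng.exponential(1.0)              # next jump of the rate-1 clock
        if t + dt >= T:
            A += (T - t) * drift(p)
            break
        A += dt * drift(p)
        t += dt
        p += rng.normal()
    M_T.append(Qfun(p) - Q0 - A)               # martingale part at time T

print(np.mean(M_T))                            # approximately zero
```

Note that `drift(p)` is strictly positive here by Jensen's inequality, since \(p\mapsto (p^{2}+2V)^{\frac{1}{2}}\) is convex; this parallels the positivity of \(\mathcal {A}_{0}\) established in (3.8).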

The following proposition states some basic facts for the functions \(\mathcal {A}_{\lambda }^{\pm } \), \( \mathcal {V}_{\lambda ,n} \), \(\mathcal {V}_{\lambda ,n}^{+}\), and \(\mathcal {K}_{\lambda ,n}\). The proofs of Parts (1)–(4) of Proposition 3.1 are placed in Sect. 7.2; we omit the proofs of Parts (5)–(7), which require similar calculus-based arguments. The function \(\mathcal {D}_\lambda :{\mathbb R}\rightarrow {\mathbb R}\) in Part (1) of Proposition 3.1 is the drift rate in momentum due to collisions: \(\mathcal {D}_{\lambda }(p)=\int _{{\mathbb R}}dp^{\prime }(p^{\prime }-p) {\mathcal {J}}_{\lambda }(p,p^{\prime })\).

Proposition 3.1

There exist \(c,C,C_{n}>0\) such that for \(\lambda \) small enough the statements below hold.
  1. For all \((x,p)\in \Sigma \), \(\mathcal {A}_{\lambda }^{-}(x,p)\le |\mathcal {D}_\lambda (p)|\). In particular, \(\mathcal {A}_{\lambda }^{-}(x,p)\le C(\lambda |p|+\lambda ^{2}p^{2})\).
  2. For all \((x,p)\in \Sigma \), \(\mathcal {A}_{\lambda }^{+}(x,p)\le \frac{C}{1+p^{2}}\).
  3. As \(\lambda \rightarrow 0\), we have \(\int _{\Sigma }ds\mathcal {A}_{\lambda }^{+}(s)= 1+ O (\lambda ^{\frac{1}{2}})\).
  4. For all \((x,p)\in \Sigma \), \(\mathcal {K}_{\lambda ,n}(x,p)\le C_n(1+\lambda |p|)\).
  5. For all \((x,p)\in \Sigma \), \(\mathcal {V}_{\lambda ,n}(x,p)\le C(1+\lambda |p|)^{n+1}\).
  6. For all \((x,p)\in \Sigma \), \(\mathcal {V}_{\lambda ,n}^{+}(x,p)\le C_n\).
  7. For all \((x,p)\in \Sigma \), \(\mathcal {V}_{\lambda }(x,p)\ge c\).

Lemma 3.2 states that the energy process \(H_{t}:=H(X_{t},P_{t})\) typically does not go above the scale \(\lambda ^{-1}\) over the time interval \([0,\frac{T}{\lambda }]\). The proof is based on martingale analysis and the bounds in Proposition 3.1 and does not involve the Nummelin splitting structure.

Lemma 3.2

For any \(n\in \mathbb {N}\), there exists a \(C>0\) such that
$$\begin{aligned} \mathbb {E}^{(\lambda )}\Big [\sup _{0\le r\le \frac{T}{\lambda }} (H_{r})^{\frac{n}{2}} \Big ] \le C\Big (\frac{ T}{\lambda }\Big )^{\frac{n}{2}} \end{aligned}$$
for all \(T>0\) and \(\lambda <1\).

Proof

We will work with the process \(\mathbf {Q}_{t}:=(2 H_{t})^{\frac{1}{2}}\). The reader should think of \(\mathbf {Q}_{t}\) as being roughly the absolute value of the momentum \(|P_{t}|\). If \(P_{t}\) were a symmetric random walk making steps every unit of time, then the result would follow from Doob's maximal inequality with \(\mathbf {Q}_{t}\) replaced by \(|P_{t}|\) (supposing that the tail distribution of the jumps decays sufficiently fast). The situation for our jump rates should, in principle, be even more accommodating, since the jump rates (1.2) tend to drag a momentum with large absolute value down to a momentum with smaller absolute value. However, for the purposes of this lemma, it is useful to discard the term associated with these large downward jumps in the decomposition (3.2) of \(\mathbf {Q}_{t}\), because it is analytically unwieldy and is not helpful on the time scales \(\frac{T}{\lambda }\) for \(T\) fixed and \(\lambda \ll 1\).
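The random walk heuristic above can be illustrated numerically (this is only a sketch with a toy symmetric walk, not the process of the paper): over \(\frac{T}{\lambda }\) unit-time steps, the \(n\)-th root of the \(n\)-th moment of the running maximum scales like \(\lambda ^{-\frac{1}{2}}\), which is the scaling asserted by Lemma 3.2.

```python
import numpy as np

rng = np.random.default_rng(2)
T, n = 1.0, 2
normalized = []
for lam in (1e-2, 2.5e-3, 6.25e-4):
    steps = int(T / lam)                       # T/lam unit-time steps
    # 3000 independent symmetric +-1 random walks of length `steps`
    walks = rng.choice([-1.0, 1.0], size=(3000, steps)).cumsum(axis=1)
    moment = np.mean(np.abs(walks).max(axis=1) ** n) ** (1.0 / n)
    normalized.append(moment * lam ** 0.5)     # E[sup |P|^n]^{1/n} * lam^{1/2}
print(normalized)                              # roughly constant in lam
```

The normalized values stabilize as \(\lambda \) decreases, reflecting the Brownian scaling \(\sup _{k\le m}|P_{k}|\sim m^{\frac{1}{2}}\) that underlies the Doob-inequality heuristic.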

For technical reasons, we partition the time interval \([0,\frac{T}{\lambda }]\) through a sequence of incursion times \(\varsigma _m'\) into a region of “low” energy. Let \(\varsigma _{0}=\varsigma ^{\prime }_{1}=0\), and define the stopping times \(\varsigma _{m},\varsigma _{m}^{\prime }\) such that
$$\begin{aligned} \varsigma ^{\prime }_{m}= \min \{r\in (\varsigma _{m-1},\infty )\,\big | \,\mathbf {Q}_{r}\le \lambda ^{-\frac{1}{2}} \},\quad \varsigma _{m} = \min \{ r\in (\varsigma ^{\prime }_{m},\infty )\,\big |\, \mathbf {Q}_{r}\ge 2 \lambda ^{-\frac{1}{2}}\}. \end{aligned}$$
The intervals \([\varsigma _m',\varsigma _m)\) and \([\varsigma _m,\varsigma _{m+1}')\) are incursions and excursions, respectively. The above definitions assume that \(\mathbf {Q}_{0}\le \lambda ^{-\frac{1}{2}}\), which is reasonable for \(\lambda \ll 1\) by the locality assumption on the initial distribution (2) of List 1.1, but we should take \(\varsigma _{1}=\varsigma _{1}'=0\) when \(\mathbf {Q}_{0}> \lambda ^{-\frac{1}{2}}\).
Trivially, we have the inequality
$$\begin{aligned} \sup _{0\le r\le t}\mathbf {Q}_{r}\le 2\lambda ^{-\frac{1}{2}}+\sup _{0\le r\le t}\big (\mathbf {Q}_{r}-\mathbf {Q}_{r^{-}}\big )^+ +\sup _{ \varsigma _{m}\le t}\sup _{r\in [\varsigma _{m},\,\varsigma _{m+1 }'\wedge t]}\big (\mathbf {Q}_{r}-\mathbf {Q}_{\varsigma _{m}}\big )^+, \end{aligned}$$
(3.1)
where \((y)^{+}:=\max (y,0)\) for \(y\in {\mathbb R}\). The two rightmost terms in (3.1) bound the largest fluctuation of the process \(\mathbf {Q}_{t}\) above the line \(2\lambda ^{-\frac{1}{2}}\). In particular, the middle term on the right side of (3.1) bounds the largest over-jump past the line \(2\lambda ^{-\frac{1}{2}}\) at the start of the excursions from low energy.
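The incursion/excursion bookkeeping behind (3.1) can be phrased as a simple state machine over a discrete path. The following sketch (with a toy reflected walk standing in for \(\mathbf {Q}_{t}\); an assumption, not the paper's process) computes the largest over-jump and the largest excursion rise and confirms the inequality pathwise:

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 0.04
low, high = lam ** -0.5, 2 * lam ** -0.5      # the lines 5.0 and 10.0
# toy discrete path with Q_0 below the low line (a stand-in for Q_t)
Q = np.abs(np.cumsum(rng.normal(0.0, 1.5, size=4000)))
Q[0] = 0.0

sup_Q = Q.max()
max_jump = np.diff(Q, prepend=Q[0]).clip(min=0).max()

# state machine: an excursion starts when Q first reaches `high` after an
# incursion below `low`; track the largest rise above the excursion start
max_rise, in_excursion, start_val = 0.0, False, 0.0
for q in Q:
    if in_excursion:
        max_rise = max(max_rise, q - start_val)
        if q <= low:
            in_excursion = False               # excursion ends, incursion begins
    elif q >= high:
        in_excursion, start_val = True, q      # varsigma_m: excursion begins

# analogue of (3.1): sup Q <= 2 lam^{-1/2} + largest over-jump + largest rise
print(sup_Q, high + max_jump + max_rise)
```

The inequality holds deterministically for any path: the supremum is attained either below the high line, at an over-jump past it, or during an excursion whose rise is tracked above.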
Let \(t_{j}\), \(j\ge 1\), be the collision times, set \(t_{0}=0\), and recall that \({\mathcal {N}}_{t}\) is the number of collisions up to time \(t\). We can write \(\mathbf {Q}_{t}\) as
$$\begin{aligned} \mathbf {Q}_{t}=\mathbf {Q}_{0}+ \mathbf {m}_{t}+ \mathbf {m}_{t}'+\int \limits _{0}^{t}dr\mathcal {A}_{\lambda }^{+}(S_{r})-\sum _{j=1}^{{\mathcal {N}}_{t}}\frac{\mathcal {A}_{\lambda }^{-}(S_{t_{j}^{-}})}{\mathcal {E}_{\lambda }(P_{t_{j}^{-}})}, \end{aligned}$$
(3.2)
where the process \(\mathbf {m}_{t}\) is defined as
$$\begin{aligned} \mathbf {m}_{t}:=\sum _{j=1}^{{\mathcal {N}}_{t}}\Delta _{j} \quad \quad \text {for} \quad \quad \Delta _{j}:= \mathbf {Q}_{t_{j}} -\mathbf {Q}_{t_{j}^{-}} - \frac{\mathcal {A}_{\lambda }(S_{t_{j}^{-}})}{\mathcal {E}_{\lambda }(P_{t_{j}^{-}})}, \end{aligned}$$
and \(\mathbf {m}_{t}'\) is the difference
$$\begin{aligned} \mathbf {m}_{t}':=\sum _{j=1}^{{\mathcal {N}}_{t}}\frac{\mathcal {A}_{\lambda }^{+}(S_{t_{j}^{-}}) }{\mathcal {E} _{\lambda }(P_{t_{j}^{-}}) }-\int \limits _{0}^{t}dr\mathcal {A}_{\lambda }^{+}(S_{r}). \end{aligned}$$
The processes \(\mathbf {m}_{t}\) and \(\mathbf {m}_{t}'\) are martingales with respect to the filtration \(\mathcal {F}_{t}\). To see that \(\mathbf {m}_{t}\) is a martingale, notice that the increments \(\Delta _{j}\) have mean zero given the information \(\mathcal {F}_{t_{j}^{-}}\) and the event \({\mathcal {N}}_{t_{j}}={\mathcal {N}}_{t_{j}^{-}}+1\). The process \(\mathbf {m}_{t}'\) is a martingale since the terms \( \frac{\mathcal {A}_{\lambda }^{+}(S_{r})}{\mathcal {E} _{\lambda }(P_{r})}\) in the sum occur with Poisson rate \(\mathcal {E} _{\lambda }(P_{r})\). Moreover, the predictable quadratic variations corresponding to the martingales have the forms
$$\begin{aligned} \langle \mathbf {m}\rangle _{t}= \int \limits _{0}^{t}dr\mathcal {K}_{\lambda ,2}(S_{r})\quad \text {and} \quad \langle \mathbf {m}'\rangle _{t}= \int \limits _{0}^{t}dr\frac{\big (\mathcal {A}_{\lambda }^{+}(S_{r})\big )^{2}}{\mathcal {E}_{\lambda }(P_{r})}. \end{aligned}$$
When \(t=\frac{T}{\lambda }\) the third term on the right side of (3.1) is smaller than
$$\begin{aligned}&\sup _{ \varsigma _{m}\le \frac{T}{\lambda } }\sup _{t\in [\varsigma _{m},\,\varsigma _{m+1}'\wedge \frac{T}{\lambda }) }\big (\mathbf {Q}_{t}-\mathbf {Q}_{\varsigma _{m}}\big )^+ \nonumber \\&\quad \le \sup _{ \varsigma _{m}\le \frac{T}{\lambda } }\sup _{t\in [\varsigma _{m},\,\varsigma _{m+1}'\wedge \frac{T}{\lambda }] }\left( \mathbf {m}_{t}+\mathbf {m}_{t}'-\mathbf {m}_{\varsigma _{m}}-\mathbf {m}_{\varsigma _{m}}'+\int \limits _{\varsigma _{m}}^{t}dr\mathcal {A}_{\lambda }^{+}(S_{r})\right) ^+\nonumber \\&\quad \le 2\sup _{0\le t\le \frac{T}{\lambda }}\big |\mathbf {m}_{t}\big | +2\sup _{0\le t\le \frac{T}{\lambda }}\big |\mathbf {m}_{t}'\big |+ \int \limits _{0}^{\frac{T}{\lambda }}dr\mathcal {A}_{\lambda }^{+}(S_{r})\chi (\mathbf {Q}_{r}\ge \lambda ^{-\frac{1}{2}}) \nonumber \\&\quad \le 2\sup _{0\le t\le \frac{T}{\lambda }}\big |\mathbf {m}_{t}\big | +2\sup _{0\le t\le \frac{T}{\lambda }}\big |\mathbf {m}_{t}'\big |+ 2C T. \end{aligned}$$
(3.3)
For the first inequality, we have discarded the term \(-\sum _{j=1}^{{\mathcal {N}}_{t}}\frac{\mathcal {A}_{\lambda }^{-}(S_{t_{j}^{-}})}{\mathcal {E}_{\lambda }(P_{t_{j}^{-}})}\) since it is nonpositive. The second inequality is the triangle inequality with \(\sup _{0\le s, r\le t}\big |f_{r}-f_{s} \big |\le 2\sup _{0\le r\le t}\big |f_{r}\big |\) for \(f=\mathbf {m},\mathbf {m}'\), and uses the fact that \(\mathbf {Q}_{r}\ge \lambda ^{-\frac{1}{2}}\) during the excursion intervals \( [\varsigma _{m},\,\varsigma _{m+1}'\wedge \frac{T}{\lambda })\). The third inequality is a consequence of Part (2) of Proposition 3.1, which gives a \(C>0\) such that
$$\begin{aligned} \mathcal {A}_{\lambda }^{+}(X_{r},P_{r})\chi \big (\mathbf {Q}_{r}\ge \lambda ^{-\frac{1}{2}}\big )\le \frac{C}{1+P_{r}^{2}} \chi \big (\mathbf {Q}_{r}\ge \lambda ^{-\frac{1}{2}}\big )\le \frac{C}{1+\frac{1}{2}\lambda ^{-1}}\le 2C\lambda , \end{aligned}$$
where the second inequality is for \(\lambda \) small enough so that \(4\sup _{x}V(x)\le \lambda ^{-1}\).
Combining (3.1) for \(t=\frac{T}{\lambda }\) with (3.3) and applying the triangle inequality, we obtain
$$\begin{aligned} \mathbb {E}^{(\lambda )}\Big [\sup _{0\le t\le \frac{T}{\lambda }} \mathbf {Q}_{t}^{n} \Big ]^{\frac{1}{n}} \le&2\lambda ^{-\frac{1}{2}}+2CT+ \mathbb {E}^{(\lambda )}\big [\mathbf {Q}_{0}^{n} \big ]^{\frac{1}{n}}+\mathbb {E}^{(\lambda )}\Big [\sup \limits _{0\le t\le \frac{T}{\lambda } }\big ((\mathbf {Q}_{t}-\mathbf {Q}_{t^{-}})^+\big )^{n} \Big ]^{\frac{1}{n}}\nonumber \\&+2 \mathbb {E}^{(\lambda )}\Big [\sup \limits _{0\le t\le \frac{T}{\lambda } }\big |\mathbf {m}_{t}\big |^{n} \Big ]^{\frac{1}{n}} + 2 \mathbb {E}^{(\lambda )}\Big [\sup \limits _{0\le t\le \frac{T}{\lambda } }\big |\mathbf {m}_{t}'\big |^{n} \Big ]^{\frac{1}{n}}. \end{aligned}$$
(3.4)
We will now give bounds for each of the terms on the right side above. The goal is to show that
$$\begin{aligned} \mathbb {E}^{(\lambda )}\Big [\sup _{0\le t\le \frac{T}{\lambda }} \mathbf {Q}_{t}^{n} \Big ]^{\frac{1}{n}}= O \Big (\lambda ^{-\frac{1}{2}} +\sum _{m}\Big (\mathbb {E}^{(\lambda )}\Big [\sup _{0\le t\le \frac{T}{\lambda }} \mathbf {Q}_{t}^{n} \Big ]^{\frac{1}{n}} \Big )^{\alpha _{m}} \Big ), \end{aligned}$$
(3.5)
where \(\alpha _{m}\le \frac{1}{2}\) and the sum over \(m\) includes a finite number of terms not depending on the parameter \(\lambda >0\). The above would imply that \(\mathbb {E}^{(\lambda )}\big [\sup _{0\le t\le \frac{T}{\lambda }} \mathbf {Q}_{t}^{n} \big ]\) is \( O (\lambda ^{-\frac{n}{2}})\), which is the statement we wish to prove.
For the term \(\mathbb {E}^{(\lambda )}\big [\mathbf {Q}_{0}^{n} \big ]^{\frac{1}{n}}\) on the right side of (3.4),
$$\begin{aligned} \mathbb {E}^{(\lambda )}\big [\mathbf {Q}_{0}^{n} \big ]=\int \limits _{S}d\mu (x,p)\big (p^{2}+2V(x)\big )^{\frac{n}{2}} <\infty , \end{aligned}$$
and the right side is finite by our assumption on the initial measure in List 1.1. For the jump term on the right side of (3.4), using that \((\sup _{m}a_{m})^{2}\le \sum _{m} a_{m}^{2}\) and Jensen's inequality, we have the first inequality below:
$$\begin{aligned} \mathbb {E}^{(\lambda )}\Big [\sup _{0\le t\le \frac{T}{\lambda } }\big ((\mathbf {Q}_{t}-\mathbf {Q}_{t^{-}})^{+}\big )^{n} \Big ]^{\frac{1}{n}}&\le \mathbb {E}^{(\lambda )}\Big [\sum _{j=1}^{{\mathcal {N}}_{\frac{T}{\lambda }}} \big ((\mathbf {Q}_{t_{j}}-\mathbf {Q}_{t_{j}^{-}})^+\big )^{2n} \Big ]^{\frac{1}{2n}}\\&= \mathbb {E}^{(\lambda )}\Big [\sum _{j=1}^{{\mathcal {N}}_{\frac{T}{\lambda }}}\frac{\mathcal {V}_{\lambda ,2n}^{+}(X_{t_{j}^{-}},P_{t_{j}^{-} })}{ \mathcal {E}_{\lambda }(P_{t_{j}^{-}})} \Big ]^{\frac{1}{2n}} \\&= \mathbb {E}^{(\lambda )}\Big [\int \limits _{0}^{\frac{T}{\lambda }}dt\mathcal {V}_{\lambda ,2n}^{+}(X_{t},P_{t}) \Big ]^{\frac{1}{2n}}\\&\le c^{\frac{1}{2n}}\mathbb {E}^{(\lambda )}\Big [\int \limits _{0}^{\frac{T}{\lambda }}dt\big (1+\lambda \mathbf {Q}_{t} \big ) \Big ]^{\frac{1}{2n}}\\&\le c^{\frac{1}{2n}}T^{\frac{1}{2}}\lambda ^{-\frac{1}{2n}}+c^{\frac{1}{2n}}T^{\frac{1}{2n}}\mathbb {E}^{(\lambda )}\Big [\sup _{0\le t\le \frac{T}{\lambda }} \mathbf {Q}_{t}^{n} \Big ]^{\frac{1}{2n^{2}}}. \end{aligned}$$
The first equality uses that
$$\begin{aligned} \mathbb {E}\Big [\Big ((\mathbf {Q}_{t}-\mathbf {Q}_{t^{-}})^+\Big )^{2n}\,\Big |\,\mathcal {F}_{t^{-}},\,{\mathcal {N}}_{t}={\mathcal {N}}_{t^{-}}+1\Big ]=\frac{\mathcal {V}_{\lambda ,2n}^{+}(X_{t},P_{t^{-}}) }{ \mathcal {E}_{\lambda }(P_{t^{-}})}, \end{aligned}$$
and the second equality uses that the times \(t_{j}\) occur with Poisson rate \(\mathcal {E}_{\lambda }(P_{t})\). The second inequality holds for some \(c>0\) by Part (6) of Proposition 3.1. The rightmost term above has the form required in (3.5).
For the martingale term involving \(\mathbf {m}_{t}\) on the right side of (3.4), we can apply Doob's maximal inequality to get the first inequality below:
$$\begin{aligned} \mathbb {E}^{(\lambda )}\Big [\sup _{0\le t\le \frac{T}{\lambda } }\big |\mathbf {m}_{t}\big |^{n} \Big ]^{\frac{1}{n}}&\le \frac{n}{n-1}\mathbb {E}^{(\lambda )}\big [\big |\mathbf {m}_{\frac{T}{\lambda } }\big |^{n} \big ]^{\frac{1}{n}}\le C' \mathbb {E}^{(\lambda )}\big [\big (\langle \mathbf {m}\rangle _{\frac{T}{\lambda }}\big )^{\frac{n}{2}} \big ]^{\frac{1}{n}}+C'\mathbb {E}^{(\lambda )}\Big [\sum _{j=1}^{{\mathcal {N}}_{\frac{T}{\lambda }}}\big |\Delta _{j} \big |^{n} \Big ]^{\frac{1}{n}} \\&= C' \mathbb {E}^{(\lambda )}\Big [\Big (\int \limits _{0}^{\frac{T}{\lambda } }dt\mathcal {K}_{\lambda ,2}(S_{t})\Big )^{\frac{n}{2}} \Big ]^{\frac{1}{n}}+C' \mathbb {E}^{(\lambda )}\Big [\int \limits _{0}^{\frac{T}{\lambda }}dt\mathcal {K}_{\lambda ,n}(S_{t}) \Big ]^{\frac{1}{n}} \\&\le C'' \mathbb {E}^{(\lambda )}\Big [\Big (\int \limits _{0}^{\frac{T}{\lambda } }dt\big (1+\lambda \mathbf {Q}_{t}\big )\Big )^{\frac{n}{2}} \Big ]^{\frac{1}{n}}+C''\mathbb {E}^{(\lambda )}\Big [\int \limits _{0}^{\frac{T}{\lambda }}dt\big (1+\lambda \mathbf {Q}_{t}\big ) \Big ]^{\frac{1}{n}} \\&\le C''(T^{\frac{1}{2}} \lambda ^{-\frac{1}{2}}+T^{\frac{1}{n}}\lambda ^{-\frac{1}{n}})+C'' T^{\frac{1}{2}}\mathbb {E}^{(\lambda )}\Big [\sup _{0\le t\le \frac{T}{\lambda }} \mathbf {Q}_{t}^{n} \Big ]^{\frac{1}{2n}}\\&+\, C'' T^{\frac{1}{n}}\mathbb {E}^{(\lambda )}\Big [\sup _{0\le t\le \frac{T}{\lambda }} \mathbf {Q}_{t}^{n} \Big ]^{\frac{1}{n^{2}}}. \end{aligned}$$
The second inequality holds for some \(C'>0\) by Rosenthal's inequality (see e.g. [11, Lem. 2.1]). The third inequality is by Part (4) of Proposition 3.1. We have combined the constants at each step. Thus we have cast our bound in the form (3.5).

The last term in (3.4) is bounded similarly to \(\mathbb {E}^{(\lambda )}\big [\sup _{0\le t\le \frac{T}{\lambda } }\big |\mathbf {m}_{t}\big |^{n} \big ]^{\frac{1}{n}}\). \(\square \)

The following lemma bounds the expected number of returns to the atom up to time \(\frac{T}{\lambda }\) for \(\lambda \ll 1\).

Lemma 3.3

There is a \(C>0\) such that for all \(\lambda <1\),
$$\begin{aligned} \tilde{\mathbb {E}}^{(\lambda )}\big [\lambda ^{\frac{1}{2}}\tilde{N}_{\frac{T}{\lambda }}\big ]\le C. \end{aligned}$$

Proof

The main step in this proof is the inequality in (3.6) where we bound the function \(h(s)\), which determines the probability of life cycle expiration, by a constant multiple of the function \(\mathcal {A}_{\lambda }^{+}(s)\) arising as the increasing part of the semi-martingale decomposition for \((2H_{t})^{\frac{1}{2}}\). Once the quantity we must bound is formulated in terms of processes related to the square root of the energy process, we can apply Lemma 3.2 and results from Proposition 3.1 to finish the proof. By Proposition 2.5 we have the first equality below:
$$\begin{aligned} \tilde{\mathbb {E}}^{(\lambda )}\big [\lambda ^{\frac{1}{2}}\tilde{N}_{\frac{T}{\lambda }}\big ]&= \mathbb {E}^{(\lambda )}\Big [\lambda ^{\frac{1}{2}}\int \limits _{0}^{\frac{T}{\lambda }}drh(S_{r})\Big ]\nonumber \\&\le c\mathbb {E}^{(\lambda )}\Big [\lambda ^{\frac{1}{2}}\int \limits _{0}^{\frac{T}{\lambda }}dr\mathcal {A}_{\lambda }^{+}(S_{r})\Big ]\nonumber \\&= c \mathbb {E}^{(\lambda )}\big [\lambda ^{\frac{1}{2}}\mathbf {A}_{\frac{T}{\lambda }}^{+} \big ]. \end{aligned}$$
(3.6)
The second equality in (3.6) follows by the definition of the process \(\mathbf {A}_{t}^{+}\), and the inequality holds since there is \(c>0\) such that for small enough \(\lambda >0\) and all \(s\in \Sigma \)
$$\begin{aligned} h(s)\le c\mathcal {A}_{\lambda }^{+}(s). \end{aligned}$$
(3.7)
To show (3.7), we first observe that for \(\lambda =0\) we have \(\mathcal {A}_{0}^{+}(x,p)=\mathcal {A}_{0}(x,p)\) since for all \((x,p)\in \Sigma \)
$$\begin{aligned} \mathcal {A}_{0}(x,p)&= \int \limits _{{\mathbb R}}dvj(v)\Big (2^{\frac{1}{2}}H^{\frac{1}{2}}(x,p+v)-2^{\frac{1}{2}}H^{\frac{1}{2}}(x,p)\Big )\nonumber \\&= V(x) \int \limits _{0}^{\infty }dvj(v)\int \limits _{-|v|}^{|v|}dw\frac{ |v|-|w|}{ 2^{\frac{1}{2}}H^{\frac{3}{2}}(x,p+w)}\nonumber \\&> 0 , \end{aligned}$$
(3.8)
where the jump rate density \(j(p-p')=\mathcal {J}_{0}(p,p')\) is defined as in (1.11). The second equality in (3.8) holds by a Taylor formula for \(H^{\frac{1}{2}}(x,p+v)\) around \(v=0\) with second-order error:
$$\begin{aligned} 2^{\frac{1}{2}}H^{\frac{1}{2}}(x,p+v)-2^{\frac{1}{2}}H^{\frac{1}{2}}(x,p)&= v\frac{ p}{2^{\frac{1}{2}}H^{\frac{1}{2}}(x,p)} +\int \limits _{0}^{v}dw(v-w) \frac{ V(x)}{2^{\frac{1}{2}} H^{\frac{3}{2}}(x,p+w)} , \end{aligned}$$
where the first-order term vanishes in (3.8) because \(j(v)=j(-v)\), and the error term is symmetrized for the same reason. The formula on the second line of (3.8) assumes \(V(x)>0\) and can be replaced by a simpler expression when \(V(x)=0\) (in which case \(2^{\frac{1}{2}}H^{\frac{1}{2}}(x,p)=|p|\)), although the resulting value is still strictly positive for all \((x,p)\). For fixed \(\lambda >0\), the function \(\mathcal {A}_{\lambda }^{+}(x,p)\) is supported in a compact region around the origin of phase space. As \(\lambda \searrow 0\), the bias that tends to drag the test particle to lower energy through head-on collisions becomes less pronounced, and \({{\mathrm{supp}}}(\mathcal {A}_{\lambda }^{+})\) grows to all of phase space (as in the \(\lambda =0\) case). For any compact set \(\mathcal {K}\subset \Sigma \), we can pick \(\lambda >0\) small enough that \(\mathcal {K}\subset {{\mathrm{supp}}}(\mathcal {A}_{\lambda }^{+})\), and there is uniform convergence for all \((x,p)\in \mathcal {K}\) as \(\lambda \searrow 0\)
$$\begin{aligned} \mathcal {A}_{\lambda }^{+}(x,p)=\mathcal {A}_{\lambda }(x,p)&=\int \limits _{{\mathbb R}}dv\mathcal {J}_{\lambda }(p,p+v)\Big (2^{\frac{1}{2}}H^{\frac{1}{2}}(x,p+v)-2^{\frac{1}{2}}H^{\frac{1}{2}}(x,p) \Big )\nonumber \\&\longrightarrow \mathcal {A}_{0}(x,p)>0. \end{aligned}$$
(3.9)
This uniform convergence follows from the smooth dependence of the rates \(\mathcal {J}_{\lambda }(p,p+v)\) on \(\lambda \) and the uniform exponential decay of the rates in \(v\) for all \(p\) in a compact set. In particular, we can pick \(\delta >0\) and \(\lambda >0\) small enough so that \(\mathcal {A}_{\lambda }^{+}(x,p) >\delta \) for all \((x,p)\in \mathcal {K}\). Since \(h:\Sigma \rightarrow {\mathbb R}^{+}\) is bounded by one and has compact support, the above remarks imply that (3.7) holds for some \(c>0\), and we thus have the inequality in (3.6).
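The positivity claim (3.8) is also easy to probe numerically. The sketch below uses a hypothetical symmetric jump density \(j\) (standard Gaussian) and a constant potential value \(V(x)=1\), which are illustrative assumptions rather than the paper's \(j\) and \(V\), and evaluates \(\mathcal {A}_{0}(x,p)\) by quadrature; the values are strictly positive and decay in \(|p|\), consistent with Part (2) of Proposition 3.1:

```python
import numpy as np

# Numerical check of (3.8) with hypothetical ingredients: j standard
# Gaussian (so j(v) = j(-v)) and a constant potential value V(x) = 1.
V_x = 1.0
v = np.linspace(-12.0, 12.0, 200001)
dv = v[1] - v[0]
j = np.exp(-v**2 / 2) / np.sqrt(2 * np.pi)     # symmetric jump density

def sqrt2H(p):
    return np.sqrt(p**2 + 2.0 * V_x)           # 2^{1/2} H^{1/2}(x, p)

def A0(p):
    # A_0(x, p) = \int dv j(v) (2^{1/2} H^{1/2}(x, p+v) - 2^{1/2} H^{1/2}(x, p))
    return np.sum(j * (sqrt2H(p + v) - sqrt2H(p))) * dv

vals = [A0(p) for p in (0.0, 0.5, 2.0, 10.0)]
print(vals)
```

The positivity is forced by the convexity of \(p\mapsto 2^{\frac{1}{2}}H^{\frac{1}{2}}(x,p)\) together with the symmetry of \(j\), exactly as in the Taylor argument following (3.8).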
By (3.6) it is sufficient to show that \(\mathbb {E}^{(\lambda )}\big [\lambda ^{\frac{1}{2}}\mathbf {A}_{\frac{T}{\lambda }}^{+} \big ]\) is uniformly bounded for \(\lambda \ll 1\). Since \(\mathbf {A}_{t}^{+}=\mathbf {Q}_{t}-\mathbf {Q}_{0}-\mathbf {M}_{t}+\mathbf {A}_{t}^{-}\), the triangle inequality gives that
$$\begin{aligned} \mathbb {E}^{(\lambda )}\big [\lambda ^{\frac{1}{2} }\mathbf {A}_{\frac{T}{\lambda }}^{+}\big ]&\le 2 \mathbb {E}^{(\lambda )}\big [\sup _{0\le t\le \frac{T}{\lambda }}\lambda ^{\frac{1}{2}}\mathbf {Q}_{t} \big ]+ \mathbb {E}^{(\lambda )}\big [\big |\lambda ^{\frac{1}{2}}\mathbf {M}_{\frac{T}{\lambda } }\big | \big ]+\mathbb {E}^{(\lambda )}\big [\lambda ^{\frac{1}{2} }\mathbf {A}_{\frac{T}{\lambda }}^{-}\big ] \nonumber \\&\le 2 \mathbb {E}^{(\lambda )}\big [\sup _{0\le t\le \frac{T}{\lambda }}\lambda ^{\frac{1}{2}}\mathbf {Q}_{t} \big ]+ \mathbb {E}^{(\lambda )}\Big [\lambda \int \limits _{0}^{\frac{T}{\lambda } }dr\mathcal {V}_{\lambda }(S_{r}) \Big ]^{\frac{1}{2}}+\mathbb {E}^{(\lambda )}\Big [\lambda ^{\frac{1}{2}}\int \limits _{0}^{\frac{T}{\lambda }}dr\mathcal {A}^{-}_{\lambda }(S_{r})\Big ] \nonumber \\&\le 2 \mathbb {E}^{(\lambda )}\big [\sup _{0\le t\le \frac{T}{\lambda }}\lambda ^{\frac{1}{2}}\mathbf {Q}_{t} \big ]+ C_{1}^{\frac{1}{2}}\mathbb {E}^{(\lambda )}\Big [\lambda \int \limits _{0}^{\frac{T}{\lambda } }dr(1+\lambda \mathbf {Q}_{r})^{2} \Big ]^{\frac{1}{2}}\nonumber \\&+\,C_{2}\mathbb {E}^{(\lambda )}\Big [\lambda ^{\frac{3}{2}}\int \limits _{0}^{\frac{T}{\lambda }}dr(\mathbf {Q}_{r}+\lambda \mathbf {Q}_{r}^{2}) \Big ]. \end{aligned}$$
(3.10)
For the second inequality, the second term employs Jensen's inequality with the square function, along with the fact that the martingale \(\mathbf {M}_{t}\) has bracket \(\langle \mathbf {M}\rangle _{t}=\int _{0}^{t}dr\mathcal {V}_{\lambda }(S_{r})\). The bound for the second term in the third inequality is by Part (5) of Proposition 3.1. The third term in the third inequality is bounded using Part (1) of Proposition 3.1 and \(|P_{r}|\le \mathbf {Q}_{r}\). The right side of (3.10) is uniformly bounded in \(\lambda <1\) since \( \mathbf {Q}_{r} = (2H_{r})^{\frac{1}{2}} \) and \(\mathbb {E}^{(\lambda )}\big [\sup _{0\le r\le \frac{T}{\lambda }} H_{r}^{\frac{n}{2}} \big ]\le C_{n}'\lambda ^{-\frac{n}{2}}\) for some constants \(C'_{n}>0\) by Lemma 3.2. \(\square \)

3.2 Fractional Moments for the Duration of Life Cycles

By Appendix 8, the original dynamics is exponentially ergodic to the equilibrium state \(\Psi _{\infty ,\lambda }\) for any fixed \(\lambda \). Thus the split dynamics converges exponentially to \(\tilde{\Psi }_{\infty ,\lambda }\), and the time span \(R_{1}\) and the number of partition times \(\tilde{n}_{1}\) during a single life cycle have finite expectation. However, the process \(S_{t}=(X_{t},P_{t})\) behaves more and more like a random walk in the \(P_{t}\) variable as \(\lambda \searrow 0\), so we should expect the expectations of \(R_{1}\) and \(\tilde{n}_{1}\) to diverge as \(\lambda \searrow 0\), since the time elapsed during a random walk's excursion from a region around the origin has infinite expectation. Nevertheless, the excursion durations for random walks do have finite fractional moments for exponents \(<\frac{1}{2}\), and Part (2) of Proposition 3.4 states the analogous property for our process.

Proposition 3.4

Let \(\tilde{n}_{1}\) and \(R_{1}\) be defined as above.
  1. There is a \(C>0\) such that for \(\lambda <1\),
    $$\begin{aligned} \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [\tilde{n}_{1}\big ]\le C \lambda ^{-\frac{1}{2}}\quad \text {and} \quad \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [R_{1}\big ]\le C \lambda ^{-\frac{1}{2}}. \end{aligned}$$
  2. For each \(0<\alpha <\frac{1}{2}\), the fractional moments are uniformly bounded over \(\lambda <1\):
    $$\begin{aligned} \sup _{\lambda <1}\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [\tilde{n}_{1}^{\alpha }\big ]<\infty \quad \text {and} \quad \sup _{\lambda <1}\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [R_{1}^{\alpha }\big ]<\infty . \end{aligned}$$

Before beginning the proof of Proposition 3.4, we must establish Lemmas 3.5 and 3.6 below. The following trivial lemma bounds the length of time up to the first partition time \(\tau _{1}\) independently of the initial state \(\tilde{s}\in \tilde{\Sigma }\). Although the time intervals between partition times are not exponentially distributed, there is still an exponential bound on their densities.

Lemma 3.5

For \(c=\max (\frac{U}{\mathbf {u}}, \frac{ U}{U-\mathbf {u}})\), the following inequality holds:
$$\begin{aligned} \sup _{\lambda <1}\sup _{\tilde{s}\in \tilde{\Sigma }}\tilde{\mathbb {E}}^{(\lambda )}_{ \tilde{s}}\big [\delta _{t}(\tau _{1}) \big ]\le ce^{-t}, \end{aligned}$$
where \(\tilde{\mathbb {E}}^{(\lambda )}_{ \tilde{s}}\big [\delta _{t}(\tau _{1}) \big ]\) refers to the density of the random variable \(\tau _{1}\) at the value \(t\ge 0\). As a consequence, for any \(n\in \mathbb {N}\),
$$\begin{aligned} \sup _{\lambda \le 1}\sup _{\tilde{s}\in \tilde{\Sigma }}\tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\big [(R_{1}-R_{1}')^{n} \big ]<\infty . \end{aligned}$$

Proof

In the original dynamics, \(\tau _{1}\) has a mean one exponential distribution regardless of the initial state. Splitting the distribution starting from \(s\in \Sigma \) yields the equality
$$\begin{aligned} e^{-t}=\mathbb {E}^{(\lambda )}_{ s}\big [\delta _{t}(\tau _{1}) \big ] = \big (1-h(s)\big )\tilde{\mathbb {E}}^{(\lambda )}_{(s,0) }\big [\delta _{t}(\tau _{1}) \big ] + h(s)\tilde{\mathbb {E}}^{(\lambda )}_{(s,1)}\big [\delta _{t}(\tau _{1})\big ]. \end{aligned}$$
For \(s\in \Sigma \) with \(H(s)>l\), we have \(h(s)=0\), so no splitting occurs and \(c=1\) suffices for the inequality. For \(s\in \Sigma \) with \(H(s)\le l\), we can take \(c= \sup _{H(s)\le l}\max (\frac{1}{h(s)}, \frac{1}{1-h(s)})=\max (\frac{U}{\mathbf {u}}, \frac{ U}{U-\mathbf {u}}) \), where \(l\), \(\mathbf {u}\), and \(U\) are defined as in Convention 2.2. The bound on the moments of \(R_{1}-R_{1}'\) then follows from the strong Markov property for the chain \(\tilde{\sigma }_{n}=\tilde{S}_{\tau _{n}}\), since \(R_{1}\) is defined as the first partition time after \(R_{1}'=\tau _{\tilde{n}_{1}}\) for the hitting time \(\tilde{n}_{1}\). \(\square \)

Lemma 3.6

There exist \(c,C>0\) such that for all \(t\in {\mathbb R}^{+}\), \(\lambda <1\), and \(s\) in a given compact subset of \(\Sigma \),
$$\begin{aligned} \mathbb {E}^{(\lambda )}_{s}\big [\mathbf {A}_{t}^{+} \big ]\ge -C+c t^{\frac{1}{2}}. \end{aligned}$$

Proof

The positive-valued, increasing process \( \mathbf {A}_{t}^{+}\) is difficult to analyze directly, so our strategy will be to write it using the other terms in the semi-martingale decomposition of \(\mathbf {Q}_{t}\) as we did before at the end of the proof of Lemma 3.3: \( \mathbf {A}_{t}^{+}= \mathbf {Q}_{t}-\mathbf {Q}_{0}-\mathbf {M}_{t}+\mathbf {A}_{t}^{-} \). In fact we can immediately throw away the positive terms \( \mathbf {Q}_{t}\), \(\mathbf {A}_{t}^{-}\) in this expression for \(\mathbf {A}_{t}^{+}\) since we are looking for a lower bound; see (3.11). Our analysis will rely on applications of Proposition 3.1 and Lemma 3.2 to bound the remaining martingale term.

Since \(\mathbf {Q}_{t}\) and \(\mathbf {A}_{t}^{-}\) are positive and \(\mathbf {A}_t^+\ge 0\) is increasing, we have the first inequality below:
$$\begin{aligned} \mathbf {A}_{t}^{+}= \mathbf {Q}_{t}-\mathbf {Q}_{0}-\mathbf {M}_{t}+\mathbf {A}_{t}^{-}&\ge -\mathbf {Q}_{0} +\sup _{0\le r\le t}-\mathbf {M}_{r}\nonumber \\&\ge -\mathbf {Q}_{0} +\mathbf {M}_{t}^-, \end{aligned}$$
(3.11)
where \(\mathbf {M}_{t}^-:=-\mathbf {M}_{t}\chi (\mathbf {M}_{t}\le 0)\). Taking the expectation of both sides gives
$$\begin{aligned} \mathbb {E}^{(\lambda )}_{s}\big [t^{-\frac{1}{2}} \mathbf {A}_{t}^{+} \big ]\ge -2^{\frac{1}{2}}t^{-\frac{1}{2}} H^{\frac{1}{2}}(s)+ t^{-\frac{1}{2}}\mathbb {E}^{(\lambda )}_{s}\big [\mathbf {M}_{t}^-\big ]. \end{aligned}$$
Since \(\mathbf {M}_{t}\) has mean zero, we have the equality below
$$\begin{aligned} 2\mathbb {E}^{(\lambda )}_{s}\big [\mathbf {M}_{t}^-\big ] =\mathbb {E}^{(\lambda )}_{s}\big [|\mathbf {M}_{t}|\big ]\ge \frac{ \mathbb {E}^{(\lambda )}_{s}\big [|\mathbf {M}_{t}|^{2}\big ]^{2}}{ \mathbb {E}^{(\lambda )}_{s}\big [|\mathbf {M}_{t}|^{3}\big ]}, \end{aligned}$$
and the inequality is by Cauchy-Schwarz. Moreover,
$$\begin{aligned} \mathbb {E}^{(\lambda )}_{s}\big [|\mathbf {M}_{t}|^{2}\big ]= \mathbb {E}^{(\lambda )}_{s}\Big [\int \limits _{0}^{t}dr\mathcal {V}_{\lambda }(S_{r}) \Big ]\ge c t, \end{aligned}$$
where \(c>0\) is from Part (7) of Proposition 3.1. For the first inequality below, we use Rosenthal’s inequality to produce a \(C>0\) such that
$$\begin{aligned} \mathbb {E}^{(\lambda )}_{s}\big [|\mathbf {M}_{t}|^{3}\big ]&\le C\mathbb {E}^{(\lambda )}_{s}\big [\langle \mathbf {M}_{t}\rangle ^{\frac{3}{2}}\big ]+C\mathbb {E}^{(\lambda )}_{s}\Big [\sum _{n=1}^{{\mathcal {N}}_{t}}\big |\mathbf {M}_{t_{n}}-\mathbf {M}_{t_{n}^{-}}\big |^{3}\Big ]\\&= C\mathbb {E}^{(\lambda )}_{s}\big [\langle \mathbf {M}_{t}\rangle ^{\frac{3}{2}}\big ]+C\mathbb {E}^{(\lambda )}_{s}\Big [\sum _{n=1}^{{\mathcal {N}}_{t}}\big |\mathbf {Q}_{t_{n}}-\mathbf {Q}_{t_{n}^{-}}\big |^{3}\Big ]\\&= C\mathbb {E}^{(\lambda )}_{s}\Big [\Big (\int \limits _{0}^{t}dr\mathcal {V}_{\lambda }(S_{r}) \Big )^{\frac{3}{2}}+ \int \limits _{0}^{t}dr\mathcal {V}_{\lambda ,3}(S_{r})\Big ]\\&\le C'\mathbb {E}^{(\lambda )}_{s}\Big [\Big (\int \limits _{0}^{t}dr(1+\lambda \mathbf {Q}_{r})^{2} \Big )^{\frac{3}{2}}+ \int \limits _{0}^{t}dr(1+\lambda \mathbf {Q}_{r})^{4}\Big ]\le C''t^{\frac{3}{2}}, \end{aligned}$$
where \(t_{n}\) are the collision times and \({\mathcal {N}}_{t}\) is the number of collisions up to time \(t\). The first equality uses that \(\mathbf {Q}_{t}\) and \(\mathbf {M}_{t}\) differ by a continuous process and thus have the same jumps. The second inequality is for some \(C'>0\) by Part (5) of Proposition 3.1 along with the relation \(|p|\le 2^{\frac{1}{2}}H^{\frac{1}{2}}(x,p)\). The last inequality is by Lemma 3.2, and \(C''\) is independent of \(\lambda <1\).
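For completeness, the Cauchy-Schwarz step above can be spelled out by factoring \(|\mathbf {M}_{t}|^{2}=|\mathbf {M}_{t}|^{\frac{1}{2}}|\mathbf {M}_{t}|^{\frac{3}{2}}\):
$$\begin{aligned} \mathbb {E}^{(\lambda )}_{s}\big [|\mathbf {M}_{t}|^{2}\big ]= \mathbb {E}^{(\lambda )}_{s}\big [|\mathbf {M}_{t}|^{\frac{1}{2}}|\mathbf {M}_{t}|^{\frac{3}{2}}\big ]\le \mathbb {E}^{(\lambda )}_{s}\big [|\mathbf {M}_{t}|\big ]^{\frac{1}{2}}\, \mathbb {E}^{(\lambda )}_{s}\big [|\mathbf {M}_{t}|^{3}\big ]^{\frac{1}{2}}, \end{aligned}$$
so that squaring and rearranging yields \(\mathbb {E}^{(\lambda )}_{s}\big [|\mathbf {M}_{t}|\big ]\ge \mathbb {E}^{(\lambda )}_{s}\big [|\mathbf {M}_{t}|^{2}\big ]^{2}\big /\mathbb {E}^{(\lambda )}_{s}\big [|\mathbf {M}_{t}|^{3}\big ]\).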
Putting these bounds together,
$$\begin{aligned} \mathbb {E}^{(\lambda )}_{s}\big [\mathbf {A}_{t}^{+} \big ]\ge -2^{\frac{1}{2}} H^{\frac{1}{2}}(s)+\frac{ \mathbb {E}^{(\lambda )}_{s}\big [|\mathbf {M}_{t}|^{2}\big ]^{2}}{ \mathbb {E}^{(\lambda )}_{s}\big [|\mathbf {M}_{t}|^{3}\big ]}\ge -2^{\frac{1}{2}} H^{\frac{1}{2}}(s)+\frac{ c^{2}}{ C'' }t^{\frac{1}{2}}, \end{aligned}$$
which proves the lemma. \(\square \)

Proof of Proposition 3.4

Part (1): By Part (1) of Proposition 2.4 applied to the constant function \(g(s)=1\), we have that
$$\begin{aligned} \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [\tilde{n}_{1}+1\big ]= \frac{1}{ \int _{\Sigma }ds\Psi _{\infty ,\lambda }(s)h(s)}= \lambda ^{-\frac{1}{2}} \frac{(2\pi )^{\frac{1}{2}}}{ \int _{\Sigma }ds h(s)}+ O (1), \end{aligned}$$
where the order equality is for small \(\lambda \). The same equality holds with \(\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [\tilde{n}_{1}+1\big ]\) replaced with \(\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [R_{1}\big ]\) by Part (2) of Proposition 2.4.
Part (2): We can prove the result through the Laplace transform by showing that there is a \(C>0\) such that for all \(\gamma \in {\mathbb R}^{+}\)
$$\begin{aligned} \sup _{\lambda <1}\big | \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [e^{-\gamma \tilde{n}_{1}}\big ] -1 \big |\le C\gamma ^{\frac{1}{2}}\quad \text {and}\quad \sup _{\lambda < 1} \big |\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [e^{-\gamma R_{1}}\big ] -1 \big | \le C\gamma ^{\frac{1}{2}}. \end{aligned}$$
The proofs for \(R_{1}\) and \(\tilde{n}_{1}\) are similar, and we focus on \(R_{1}\). Also, it is sufficient to prove the result with \(R_{1}'\) rather than \(R_{1}\) since, by Lemma 3.5, the random variable \(R_{1}-R_{1}'\) has finite expectation. We will study the following regimes for \(\gamma \):
  (i) \(\gamma <\lambda \),

  (ii) \(\lambda \le \gamma \) and \(\gamma \) sufficiently small.
The case (i) can be shown with a simple linearization around \(\gamma =0\). As a result of Part (1), there exists a \(C'>0\) such that
$$\begin{aligned} \big |\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [e^{-\gamma R_{1}'}\big ] -1 \big |\le \gamma \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [R_{1}' \big ] \le C'\gamma \lambda ^{-\frac{1}{2}}. \end{aligned}$$
When \(\gamma <\lambda \) the bound on the right side is smaller than \(C'\gamma ^{\frac{1}{2}}\).
For the regime (ii), we can no longer rely on the first derivative of the Laplace transform because the upper bound is growing as \( O (\lambda ^{-\frac{1}{2}})\). In the analysis below, we will show that there is a \(c>0\) such that for all \( \gamma \) and \(\lambda \le 1\),
$$\begin{aligned} \big | \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [e^{-\gamma R_{1}'}\big ] -1 \big |\le \frac{1}{ c\mathbb {E}^{(\lambda )}_{\nu }\big [\int \limits _{0}^{\gamma ^{-1}}dr\mathcal {A}_{\lambda }^{+}(S_{r}) \big ]-c^{-1}\gamma ^{-\frac{1}{4}}}. \end{aligned}$$
(3.12)
However, by Lemma 3.6 there is a \(c'>0\) such that for all \(\gamma ^{-1}\le \lambda ^{-1} \) and \(s\in \text {Supp}(\nu )\),
$$\begin{aligned} \mathbb {E}^{(\lambda )}_{s}\big [\int \limits _{0}^{\gamma ^{-1}}dr\mathcal {A}_{\lambda }^{+}(S_{r}) \big ]\ge c'\gamma ^{-\frac{1}{2}}-c'^{-1}. \end{aligned}$$
(3.13)
Combining (3.12) and (3.13) yields the bound in regime (ii).
In order to show (3.12), we first establish some preliminary bounds. Since \(\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [e^{-\gamma R_{1}'}\big ]\le 1\), we have
$$\begin{aligned} \big | \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [e^{-\gamma R_{1}'}\big ] -1 \big |\le \frac{\big | \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [e^{-\gamma R_{1}' }\big ] -1 \big |}{\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [e^{-\gamma R_{1}'}\big ]}&\le \Big (\sum _{m=1}^{\infty } \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [e^{-\gamma R_{1}'} \big ]^{m}\Big )^{-1}\nonumber \\&\le \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\Big [\sum _{m=1}^{\infty } e^{-\gamma R_{m}'}\Big ]^{-1}. \end{aligned}$$
(3.14)
The third inequality follows since
$$\begin{aligned} \Big (\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [e^{-\gamma R_{1}'} \big ]\Big )^{m}=\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [e^{-\gamma \sum _{n=1}^{m} (R_{n}'-R_{n-1})} \big ] \ge \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [e^{-\gamma R_{m}'}\big ], \end{aligned}$$
where the \(R_{n}'-R_{n-1}\) are independent by Part (2) of Proposition 2.1 and distributed as \(R_{1}'\) when the initial distribution of the split process is \(\tilde{\nu }\), and the inequality is from \(\sum _{n=1}^{m} (R_{n}'-R_{n-1})\le R_{m}'\). By (3.14) it is sufficient to give a lower bound for \(\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [\sum _{m=1}^{\infty } e^{-\gamma R_{m}'}\big ]\). This term can be rewritten as
$$\begin{aligned} \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\Big [\sum _{m=1}^{\infty } e^{-\gamma R_{m}'}\Big ]= \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\Big [\sum _{m=0}^{\infty } e^{-\gamma \tau _{m}}\chi (\zeta _{m}=1) \Big ]= \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\Big [\sum _{m=0}^{\infty } e^{-\gamma \tau _{m}}h(\sigma _{m}) \Big ], \end{aligned}$$
(3.15)
where \(\sigma _{m}=S_{\tau _{m}}\) is the resolvent chain and \(\zeta _{m}=Z_{\tau _{m}}\) is the binary component of the split chain. The first equality in (3.15) is from the definition of the times \(R_{m}'=\tau _{\tilde{n}_{m}}\), and the second equality is by Part (3) of Proposition 2.3. The right side above is equal to
$$\begin{aligned} \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\Big [\sum _{m=0}^{\infty } e^{-\gamma \tau _{m}}h(\sigma _{m}) \Big ]&= \mathbb {E}^{(\lambda )}_{\nu }\Big [\sum _{m=0}^{\infty } e^{-\gamma \tau _{m}}h(\sigma _{m}) \Big ] =\nu (h)+ \mathbb {E}^{(\lambda )}_{\nu }\Big [\int \limits _{0}^{\infty }dr e^{-\gamma r}h(S_{r}) \Big ]\nonumber \\&\ge e^{-1} \mathbb {E}^{(\lambda )}_{\nu }\Big [\int \limits _{0}^{\gamma ^{-1}}dr h(S_{r})\Big ]. \end{aligned}$$
(3.16)
The first equality uses that the argument of the expectation is a function of only the times \(\tau _{m}\) and the resolvent chain \(\sigma _{m}\) in order to revert to the original statistics. The second equality in (3.16) holds since the \(m=0\) term in the sum is \(\mathbb {E}^{(\lambda )}_{\nu } [h(\sigma _{0})]=\nu (h)\) and
$$\begin{aligned} \sum _{m=1}^{\mathbf {N}_{t}} e^{-\gamma \tau _{m}}h(S_{\tau _{m}}) - \int \limits _{0}^{t}dr e^{-\gamma r}h(S_{r}) \end{aligned}$$
is a mean zero martingale which converges to a limiting value as \(t\rightarrow \infty \). The above process is a martingale because the terms \(e^{-\gamma \tau _{m}}h(\sigma _{m})=e^{-\gamma \tau _{m}}h(S_{\tau _{m}})\) in the sum occur with Poisson rate one.
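The compensator identity behind this martingale can be illustrated with a toy Monte Carlo check (not from the paper): for a rate-one Poisson process with jump times \(\tau _{m}\), the sum \(\sum _{\tau _{m}\le t}f(\tau _{m})\) has mean \(\int _{0}^{t}f(r)dr\). Here \(f(r)=e^{-\gamma r}\) stands in for \(e^{-\gamma r}h(S_{r})\), and the parameter values are arbitrary choices:

```python
import math
import random

# Toy check of the compensator identity: for a rate-one Poisson process
# with jump times tau_m, E[ sum_{tau_m <= t} f(tau_m) ] = int_0^t f(r) dr.
# Here f(r) = exp(-gamma*r), a stand-in for exp(-gamma*r)*h(S_r).
random.seed(0)
gamma, t, trials = 0.7, 5.0, 20000

total = 0.0
for _ in range(trials):
    s = 0.0
    while True:
        s += random.expovariate(1.0)  # unit-rate inter-arrival times
        if s > t:
            break
        total += math.exp(-gamma * s)
mc_mean = total / trials

exact = (1.0 - math.exp(-gamma * t)) / gamma  # int_0^t e^{-gamma r} dr
print(abs(mc_mean - exact))  # small Monte Carlo error
```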
Thus far we have shown that
$$\begin{aligned} \big | \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [e^{-\gamma R_{1}'}\big ] -1 \big |\le \frac{1}{ e^{-1} \mathbb {E}^{(\lambda )}_{\nu }\Big [\int _{0}^{\gamma ^{-1}}dr h(S_{r}) \Big ]}. \end{aligned}$$
(3.17)
Now we find a lower bound for \(\mathbb {E}^{(\lambda )}_{\nu }\Big [\int _{0}^{\gamma ^{-1}}dr h(S_{r}) \Big ]\) in terms of the same expression except with \(h\) replaced by \(\mathcal {A}_{\lambda }^{+}\). Define the constant
$$\begin{aligned} u_{\lambda }:= \frac{\int _{\Sigma }ds\Psi _{\infty ,\lambda }(s)h(s) }{ \int _{\Sigma }ds\Psi _{\infty ,\lambda }(s)\mathcal {A}_{\lambda }^{+}(s)}. \end{aligned}$$
By the triangle inequality and going to the split statistics,
$$\begin{aligned}&\Big |\mathbb {E}^{(\lambda )}_{\nu }\Big [\int \limits _{0}^{\gamma ^{-1}}dr h(S_{r}) \Big ]-u_{\lambda }\mathbb {E}^{(\lambda )}_{\nu }\Big [\int \limits _{0}^{\gamma ^{-1}}dr \mathcal {A}_{\lambda }^{+}(S_{r}) \Big ]\Big |\nonumber \\&\le \Big | \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\Big [\sum _{n=0}^{\tilde{N}_{\gamma ^{-1}}}\int \limits _{R_{n}}^{R_{n+1}}dr\Big (h(S_{r})-u_{\lambda }\mathcal {A}_{\lambda }^{+}(S_{r}) \Big )\Big ] \Big |\nonumber \\&\quad +\,\, \tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\Big [\sup _{0\le n\le \tilde{N}_{\gamma ^{-1}}}\int \limits _{R_{n}}^{R_{n+1}}dr\Big (h(S_{r})+u_{\lambda }\mathcal {A}_{\lambda }^{+}(S_{r}) \Big )\Big ]\nonumber \\&=\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [\tilde{N}_{\gamma ^{-1}}+1\big ] \left| \frac{ \int _{\Sigma }ds\Psi _{\infty ,\lambda }(s)\Big (h(s)-u_{\lambda }\mathcal {A}_{\lambda }^{+}(s) \Big )}{ \int _{\Sigma }ds\Psi _{\infty ,\lambda }(s)h(s)} \right| + O \Big (\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [\tilde{N}_{\gamma ^{-1}}\big ]^{\frac{1}{2}}\Big )= O (\gamma ^{-\frac{1}{4}}),\nonumber \\ \end{aligned}$$
(3.18)
where \(\tilde{N}_{t}\) is the number of life cycles completed by time \(t\). The inequality covers the leftover interval \([\gamma ^{-1},R_{\tilde{N}_{\gamma ^{-1}}+1}]\) of the integration. The first term on the third line of (3.18) is zero by the definition of \(u_{\lambda }\). Also, the first term on the second line is equal to the first term on the third line since \(\tilde{S}_{R_{n}}\) has distribution \(\tilde{\nu }\) when conditioned on \(\tilde{\mathcal {F}}_{R_{n}'}\) by Part (1) of Proposition 2.1 and by Part (2) of Proposition 2.4. The second term on the second line of (3.18) is bounded by a constant multiple of \(\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [\tilde{N}_{\gamma ^{-1}}\big ]^{\frac{1}{2}}\) by the same reasoning as in (3.6). Finally, \(\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}\big [\tilde{N}_{\gamma ^{-1}}\big ] \) is \( O (\gamma ^{-\frac{1}{2}})\) by the same argument as in the proof of Lemma 3.3.
The constant \(u_{\lambda }\) satisfies
$$\begin{aligned} u_{\lambda }= \frac{\int \limits _{\Sigma }ds e^{-\lambda H(s)}h(s) }{ \int \limits _{\Sigma }ds e^{-\lambda H(s)}\mathcal {A}_{\lambda }^{+}(s)}=\frac{\int \limits _{\Sigma }ds h(s) }{ \int \limits _{\Sigma }ds \mathcal {A}_{\lambda }^{+}(s)} + O (\lambda ^{\frac{1}{2}})= \mathbf {u}+ O (\lambda ^{\frac{1}{2}})\ge \frac{\mathbf {u}}{2}, \end{aligned}$$
where \(\mathbf {u}=\int \limits _{\Sigma }ds h(s)\), and the inequality holds for \(\lambda \) small enough. The third equality is by Part (3) of Proposition 3.1. These observations imply that for small enough \(\lambda >0\)
$$\begin{aligned} \mathbb {E}^{(\lambda )}_{\nu }\Big [\int \limits _{0}^{\gamma ^{-1}}dr h(S_{r}) \Big ]\ge \frac{\mathbf {u}}{2}\mathbb {E}^{(\lambda )}_{\nu }\Big [\int \limits _{0}^{\gamma ^{-1}}dr \mathcal {A}_{\lambda }^{+}(S_{r}) \Big ]- O (\gamma ^{-\frac{1}{4}}). \end{aligned}$$
Plugging this inequality into (3.17) gives (3.12). \(\square \)

4 Bounding Integral Functionals Over a Life Cycle

In this section we prove Proposition 4.1, which effectively bounds the expected fluctuations for the momentum drift \(D_{t}=\int \limits _{0}^{t}dr\frac{dV}{dx}(X_{r})\) over the period of a single life cycle.

Proposition 4.1

  1. For any \(m\in \mathbb {N}\), there is a \(C>0\) such that
    $$\begin{aligned} \sup _{\lambda \le 1} \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\sup _{0\le t\le R_{1}}\Big (\int \limits _{0}^{t}dr \frac{dV}{dx}(X_{r}) \Big )^{2m} \Big ]< C. \end{aligned}$$

  2. There is a \(C>0\) such that for all \((x,p,z)\in \tilde{\Sigma }\),
    $$\begin{aligned} \sup _{\lambda \le 1} \tilde{\mathbb {E}}_{(x,p,z)}^{ (\lambda )} \Big [\Big |\int \limits _{0}^{R_{1}}dr \frac{dV}{dx}(X_{r}) \Big | \Big ]< C\big (1+\log (1+|p|)\big ). \end{aligned}$$
For the task of proving Proposition 4.1, we rely on the bounds stated in Theorem 4.2 for the generalized resolvent \(U^{(\lambda )}:L^{\infty }(\Sigma )\rightarrow L^{\infty }(\Sigma )\) given by
$$\begin{aligned} \big (U^{(\lambda )}g\big )(s):=\mathbb {E}_{s}^{(\lambda )}\Big [\int \limits _{0}^{\infty }dt g(S_{t})e^{-\int _{0}^{t}dr h(S_{r})} \Big ],\quad s\in \Sigma , \end{aligned}$$
(4.1)
where \(h:\Sigma \rightarrow {\mathbb R}^{+}\) is defined as in Convention 2.2. The expression (4.1) would have the form of a standard resolvent if the function \(h\) were replaced by a constant. In coarse terms, the generalized resolvent \(U^{(\lambda )}\) characterizes the expected duration that the process \(S_{t}\) resides in different regions of phase space before returning to the set \(\mathrm{supp}(h)\subset \Sigma \) when started from a point \(s\in \Sigma \). More precisely, the terminal time in (4.1) is a Poisson time with variable rate, depending stochastically on \(S_{t} \) through \(h(S_{t}) \). Operators of the form (4.1) were introduced in [27], and the following theorem is from [8].
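As an illustration outside the paper's setting, the generalized resolvent (4.1) can be checked numerically on a toy two-state continuous-time Markov chain, where it reduces to solving the linear system \((h-\mathcal {L})u=g\) for the generator \(\mathcal {L}\). All rates and functions below are hypothetical choices:

```python
import math
import random

# Toy check: for a two-state continuous-time Markov chain with generator L,
# the generalized resolvent (U g)(s) = E_s[ int_0^inf g(S_t) e^{-int_0^t h(S_r) dr} dt ]
# solves (h - L) u = g.  All rates/functions are arbitrary illustrative choices.
q = (1.0, 2.0)   # jump rates out of states 0 and 1 (each jump flips the state)
h = (0.5, 1.2)   # state-dependent killing rate
g = (1.0, 0.3)   # test function

# Exact solution of (h - L) u = g, i.e. (h_i + q_i) u_i - q_i u_j = g_i.
det = (h[0] + q[0]) * (h[1] + q[1]) - q[0] * q[1]
u_exact = (((h[1] + q[1]) * g[0] + q[0] * g[1]) / det,
           (q[1] * g[0] + (h[0] + q[0]) * g[1]) / det)

# Monte Carlo evaluation of (U g)(0) by summing over exponential sojourns.
random.seed(1)
trials, total = 10000, 0.0
for _ in range(trials):
    s, kill = 0, 0.0          # current state, accumulated integral of h
    while math.exp(-kill) > 1e-12:
        T = random.expovariate(q[s])
        # contribution of this sojourn: g_s * int_0^T e^{-(kill + h_s t)} dt
        total += g[s] * math.exp(-kill) * (1.0 - math.exp(-h[s] * T)) / h[s]
        kill += h[s] * T
        s = 1 - s
mc = total / trials
print(mc, u_exact[0])  # the two values agree up to Monte Carlo error
```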

Theorem 4.2

There is a \(c>0\) such that for any \(g\in L^{\infty }(\Sigma )\) with \(g \ge 0\) and \(|p|\le \lambda ^{-1}\),
$$\begin{aligned} \big (U^{(\lambda )}g\big )(x,p)&\le c\Vert g\Vert _{\infty }+c|p|\sup _{H'>\frac{1}{2}\lambda ^{-2}}g(x',p')\\&+c\int _{H'\le \frac{1}{2} \lambda ^{-2}}dp'dx'\big (1+\min (|p'|, |p|)\big )g(x',p'),\\ \Vert U^{(\lambda )}g\Vert _{\infty }&\le c\lambda ^{-1}\sup _{H'>\frac{1}{2}\lambda ^{-2}}g(x',p') +c\sup _{H'\le \frac{1}{2}\lambda ^{-2}}\big (U^{(\lambda )}g\big )(x',p'), \end{aligned}$$
where \(H':=H(x',p')\).

The analysis in the proof of Proposition 4.1 also applies to Proposition 4.3, which is easier because its function \(g\) is assumed to decay in \(|p|\), whereas the “velocity function” \(g(x,p)=\frac{dV}{dx}(x)\) of Proposition 4.1 does not have explicit decay for \(|p|\gg 1\). The decay for \(\frac{dV}{dx}(x)\) at high momentum only occurs as a time-averaged effect, which is exposed in Lemma 4.7.

Proposition 4.3

Let \(g:\Sigma \rightarrow {\mathbb R}\) satisfy \( |g(x,p)|\le \frac{ C}{1+|p|^{2}} \) for some \(C>0\) and all \((x,p)\in \Sigma \).

  1. For any \(m\in \mathbb {N}\), there is a \(C>0\) such that
    $$\begin{aligned} \sup _{\lambda \le 1} \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\sup _{0\le t\le R_{1}}\Big (\int \limits _{0}^{t}dr g(S_{r}) \Big )^{2m} \Big ]< C. \end{aligned}$$

  2. There is a \(C>0\) such that for all \((x,p,z)\in \tilde{\Sigma }\),
    $$\begin{aligned} \sup _{\lambda \le 1} \tilde{\mathbb {E}}_{(x,p,z)}^{ (\lambda )} \Big [\Big |\int \limits _{0}^{R_{1}}dr g(S_{r}) \Big | \Big ]< C\big (1+\log (1+|p|)\big ). \end{aligned}$$

4.1 An Inequality for Summation Functionals Over a Life Cycle

Recall that \(\sigma _{n}=S_{\tau _{n}}\) denotes the resolvent chain and that \(\tilde{\delta }_{s}=\chi (z=0) (1-h(s))\delta _{s}+\chi (z=1)h(s)\delta _{s} \) is the splitting of the \(\delta \)-distribution at \(s\in \Sigma \). The following lemma states that the generalized resolvent \((U^{(\lambda )}g)(s)\) can be used to bound the expression \(\tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\big [\sum _{n=1}^{\tilde{n}_{1}} g(\sigma _{n}) \big ]\).

Lemma 4.4

The following inequality holds for all \(g\in L^{\infty }(\Sigma , {\mathbb R}^{+})\), \(\lambda >0\), and \(s\in \Sigma \):
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\sum _{n=1}^{\tilde{n}_{1}} g(\sigma _{n}) \Big ]\le \big (U^{(\lambda )}g\big )(s)+\sup _{s\in Supp (h)} \big (U^{(\lambda )}g\big )(s). \end{aligned}$$

Proof

Let the function \(h:\Sigma \rightarrow [0,1]\) and the measure \(\nu \) on \(\Sigma \) be defined as in Convention 2.2. Recall that \(\sigma _{n}=S_{\tau _{n}}\) denotes the resolvent chain and has transition kernel \(\mathcal {T}_{\lambda }\). Define the following operators on \(L^{\infty }(\Sigma )\):
$$\begin{aligned} \mathcal {W}^{(\lambda )}=\sum _{n=0}^{\infty }\big ((1-h)\mathcal {T}_{\lambda }\big )^{n}\quad \text {and}\quad \widetilde{\mathcal {W}}^{(\lambda )}=\sum _{n=0}^{\infty }\big (\mathcal {T}_{\lambda }-h\otimes \nu \big )^{n}. \end{aligned}$$
In terms of the operators \(\mathcal {T}_{\lambda }\) and \(h\otimes \nu \) on \(L^{\infty }(\Sigma )\), we can write the expressions in the statement of Lemma 4.4 as
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\sum _{n=1}^{\tilde{n}_{1}} g(\sigma _{n}) \Big ]&= \big ((\mathcal {T}_{\lambda }-h\otimes \nu )\widetilde{\mathcal {W}}^{(\lambda )}g\big )(s),\end{aligned}$$
(4.2)
$$\begin{aligned} U^{(\lambda )}&= \mathcal {T}_{\lambda }\mathcal {W}^{(\lambda )}. \end{aligned}$$
(4.3)
The above representation for \(\tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\big [\sum _{n=1}^{\tilde{n}_{1}} g(\sigma _{n}) \big ]\) can be found in [28]. The representation for \(U^{(\lambda )}\) can be understood through the alternative generalized resolvent form \(U^{(\lambda )}=\frac{1}{h-\mathcal {L}} \); see [23] for other representations. By the identity (4.3) and \(\mathcal {T}_{\lambda }-h\otimes \nu \le \mathcal {T}_{\lambda }\), we have that
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\sum _{n=1}^{\tilde{n}_{1}} g(\sigma _{n}) \Big ]&= \big ((\mathcal {T}_{\lambda }-h\otimes \nu ) \widetilde{\mathcal {W}}^{(\lambda )}g\big )(s) \le \big (\mathcal {T}_{\lambda }\mathcal {W}^{(\lambda )}g\big )(s)\nonumber \\&+\sup _{s\in \Sigma } \Big (\big ((\mathcal {T}_{\lambda }-h\otimes \nu )\widetilde{\mathcal {W}}^{(\lambda )}g\big )(s)-\big ((\mathcal {T}_{\lambda }-h\otimes \nu )\mathcal {W}^{(\lambda )}g\big )(s) \Big ).\qquad \quad \end{aligned}$$
(4.4)
With the identity \(\widetilde{\mathcal {W}}^{(\lambda )}-\mathcal {W}^{(\lambda )}= \widetilde{\mathcal {W}}^{(\lambda )} (h\mathcal {T}_{\lambda }-h\otimes \nu )\mathcal {W}^{(\lambda )}\), we obtain the equality below:
$$\begin{aligned} (\mathcal {T}_{\lambda }-h\otimes \nu )\widetilde{\mathcal {W}}^{(\lambda )}g-(\mathcal {T}_{\lambda }-h\otimes \nu )\mathcal {W}^{(\lambda )}g&= (\mathcal {T}_{\lambda }-h\otimes \nu )\widetilde{\mathcal {W}}^{(\lambda )} (h\mathcal {T}_{\lambda }-h\otimes \nu )\mathcal {W}^{(\lambda )}g\nonumber \\&\le (\mathcal {T}_{\lambda }-h\otimes \nu )\widetilde{\mathcal {W}}^{(\lambda )}h\big (1_{Supp (h)} \mathcal {T}_{\lambda }\mathcal {W}^{(\lambda )}\big )g\nonumber \\&\le \big \Vert 1_{Supp (h)} \mathcal {T}_{\lambda }\mathcal {W}^{(\lambda )}g \big \Vert _{\infty }. \end{aligned}$$
(4.5)
For the first inequality above, we have thrown away the negative term and used that \(h=h1_{Supp (h)}\). The second inequality in (4.5) uses that for all \(s\in \Sigma \)
$$\begin{aligned} \big ((\mathcal {T}_{\lambda }-h\otimes \nu )\widetilde{\mathcal {W}}^{(\lambda )}h\big )(s)\le \big (\widetilde{\mathcal {W}}^{(\lambda )}h\big )(s)=1. \end{aligned}$$
The equality \( \big (\widetilde{\mathcal {W}}^{(\lambda )}h\big )(s)=1 \) holds due to the recurrence of the dynamics; without recurrence we would only have \( \big (\widetilde{\mathcal {W}}^{(\lambda )}h\big )(s)\le 1 \). To prove this fact, note that the series form of \( \widetilde{\mathcal {W}}^{(\lambda )}\) implies that the function \(F(s):= (\widetilde{\mathcal {W}}^{(\lambda )}h)(s)\) satisfies
$$\begin{aligned} F(s)= h(s)+\big ((\mathcal {T}_{\lambda }-h\otimes \nu ) F\big )(s) = h(s)+ (\mathcal {T}_{\lambda } F)(s) -h(s) \nu (F) = \big (\mathcal {T}_{\lambda } F\big )(s), \end{aligned}$$
where the third equality uses that \(\nu (F)=1\), which is a consequence of recurrence and Theorem 3 of [28]. Thus the function \(F(s)\) is invariant under the dynamics and must be constant, and hence equal to one by the normalization \(\nu (F)=1\).
Applying (4.5) in (4.4) along with the identity (4.3), we have that
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\sum _{n=1}^{\tilde{n}_{1}} g(\sigma _{n}) \Big ] \le \big (U^{(\lambda )}g\big )(s)+\sup _{s\in {{\mathrm{supp}}}(h)}\big (U^{(\lambda )}g\big )(s). \end{aligned}$$
\(\square \)

The following lemma states that an additive functional of the resolvent chain \(\sum _{n}g_{\lambda }(\sigma _{n})\) has arbitrary finite moments when the summation is over a single life cycle and \(g_{\lambda }\ge 0\) has sufficient decay at large momentum. In other words, not much typically happens over a single life cycle.

Lemma 4.5

Let \(g_{\lambda }:\Sigma \rightarrow {\mathbb R}^{+}\), and suppose that there is a \(C>0\) such that
$$\begin{aligned} g_{\lambda }(x,p)\le C\max \big (\frac{1}{1+|p|^{2}},\lambda \big ) \end{aligned}$$
for all \(\lambda <1\) and \((x,p)\in \Sigma \). Then,
$$\begin{aligned} \sup _{\lambda <1}\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\Big (\sum _{n=0}^{\tilde{n}_{1}} g_{\lambda }(\sigma _{n}) \Big )^{m}\Big ]<\infty , \quad m\in \mathbb {N}. \end{aligned}$$

Proof

For the case \(m=1\) and \(f=g_{\lambda }\), we have the closed expression
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sum _{n=0}^{\tilde{n}_{1}}f(\sigma _{n}) \Big ]&= \frac{ \int _{\Sigma }ds\Psi _{\infty ,\lambda }(s)f(s) }{\int _{\Sigma }ds\Psi _{\infty ,\lambda }(s)h(s) } \end{aligned}$$
(4.6)
$$\begin{aligned}&\le \frac{ \int _{|p|\le \lambda ^{-1}}dxdp f(x,p) +\frac{2}{\lambda ^{\frac{1}{2}}}erfc (\lambda ^{-\frac{1}{2}}) \sup _{|p|> \lambda ^{-1}}f(x, p)}{\int _{\Sigma }dxdp e^{-\lambda H(x,p)}h(x,p) }, \end{aligned}$$
(4.7)
where \(erfc (q)=\int _{q}^{\infty }dp e^{-\frac{p^{2}}{2}}\) is the complementary error function, and the equality holds by Part (1) of Proposition 2.4. The right side of (4.7) is finite by our conditions on \(g_{\lambda }\), and the denominator is approximately \(\mathbf {u}=\int _{\Sigma }ds h(s)\) for small \(\lambda \) and is thus bounded away from zero for \(\lambda <1\).
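As a quick numerical sanity check (not from the paper), the Gaussian-tail identity behind the numerator of (4.7), \(\int _{|p|>\lambda ^{-1}}dp\, e^{-\lambda p^{2}/2}= \frac{2}{\lambda ^{1/2}}\,erfc (\lambda ^{-1/2})\) with the convention \(erfc (q)=\int _{q}^{\infty }dp\, e^{-p^{2}/2}\), can be verified by quadrature; note that this convention differs from the standard \(\mathrm{erfc}\) by \(erfc (q)=\sqrt{\pi /2}\,\mathrm{erfc}(q/\sqrt{2})\). The value of \(\lambda \) below is an arbitrary choice:

```python
import math

# Check, for one hypothetical lambda, that
#   int_{|p| > 1/lam} exp(-lam*p^2/2) dp == (2/sqrt(lam)) * erfc_paper(1/sqrt(lam)),
# where erfc_paper(q) = int_q^inf exp(-p^2/2) dp = sqrt(pi/2)*math.erfc(q/sqrt(2)).
lam = 0.5

def erfc_paper(q):
    return math.sqrt(math.pi / 2.0) * math.erfc(q / math.sqrt(2.0))

# Simpson quadrature of 2 * int_{1/lam}^{cutoff} exp(-lam*p^2/2) dp.
a, b, n = 1.0 / lam, 60.0, 20000       # cutoff 60 makes the remaining tail negligible
hstep = (b - a) / n
f = lambda p: math.exp(-lam * p * p / 2.0)
simpson = f(a) + f(b)
for i in range(1, n):
    simpson += (4 if i % 2 else 2) * f(a + i * hstep)
integral = 2.0 * simpson * hstep / 3.0

closed_form = (2.0 / math.sqrt(lam)) * erfc_paper(1.0 / math.sqrt(lam))
print(abs(integral - closed_form))  # should be tiny
```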
For \(m=2\), we write
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\Big (\sum _{n=0}^{\tilde{n}_{1}} g_{\lambda }(\sigma _{n})\Big )^{2} \Big ]&= \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sum _{n=0}^{\tilde{n}_{1}} g_{\lambda }^{2}(\sigma _{n})+ 2\sum _{ n<m}^{\tilde{n}_{1}} g_{\lambda }(\sigma _{n}) g_{\lambda }(\sigma _{m}) \Big ] \nonumber \\&= \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sum _{n=0}^{\tilde{n}_{1}} g_{\lambda }^{2}(\sigma _{n})+2g_{\lambda }(\sigma _{n})\tilde{\mathbb {E}}^{(\lambda )}\Big [\sum _{ m=n+1}^{\tilde{n}_{1}} g_{\lambda }(\sigma _{m}) \,\Big |\,\tilde{\mathcal {F}}_{\tau _{n}^{-}}\Big ] \Big ] \nonumber \\&\le \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sum _{n=0}^{\tilde{n}_{1}} g_{\lambda }^{2}(\sigma _{n})+2 g_{\lambda }(\sigma _{n})\big (U^{(\lambda )}g_{\lambda }\big )(\sigma _{n}) +2 g_{\lambda }(\sigma _{n})B(g_{\lambda }) \Big ],\nonumber \\ \end{aligned}$$
(4.8)
where \(B:L^{\infty }(\Sigma ,{\mathbb R}^{+})\rightarrow {\mathbb R}^{+} \) is defined by
$$\begin{aligned} B(g)=\sup _{ s\in Supp (h)} \big (U^{(\lambda )}g\big )(s). \end{aligned}$$
To see (4.8), recall that \(\sigma _{n}:=S_{\tau _{n}}\) and that the \(\sigma \)-algebra \(\tilde{\mathcal {F}}_{\tau _{n}^{-}}\) contains knowledge of the state \(\sigma _{n}\). The value \(\sup _{\lambda <1} B(g_{\lambda })\) is finite by the bound we assumed for \(g_{\lambda }\) and the bound on \(U^{(\lambda )}g_{\lambda }\) from Theorem 4.2; see (4.10) below. The inequality in (4.8) applies Part (3) of Proposition 2.3 to obtain the equality below:
$$\begin{aligned} \tilde{\mathbb {E}}^{(\lambda )}\Big [\sum _{ m=n+1}^{\tilde{n}_{1}} g_{\lambda }(\sigma _{m}) \,\Big |\,\tilde{\mathcal {F}}_{\tau _{n}^{-}}\Big ] = \tilde{\mathbb {E}}_{\tilde{\delta }_{\sigma _{n}}}^{(\lambda )}\Big [\sum _{ m=1}^{\tilde{n}_{1}} g_{\lambda }(\sigma _{m}) \Big ] \le \big (U^{(\lambda )} g_{\lambda }\big )(\sigma _{n})+B(g_{\lambda }). \end{aligned}$$
(4.9)
The inequality in (4.9) follows by an application of Lemma 4.4.
We can apply (4.7) with \(f= g_{\lambda }^{2}+2g_{\lambda } U^{(\lambda )}g_{\lambda }+2g_{\lambda }B(g_{\lambda })\) to bound the right side of (4.8). Clearly the contribution from \(g_{\lambda }^{2}\) is not a problem since \(g_{\lambda }^{2}(x,p)\le Cg_{\lambda }(x,p)\). For \(g_{\lambda } U^{(\lambda )}(g_{\lambda })\) there are constants such that
$$\begin{aligned} g_{\lambda }(p)\big (U^{(\lambda )}g_{\lambda }\big )(p)&\le (const )\frac{1+\log (1+|p|)}{|p|^{2}},\quad \text {for all } |p|\le \lambda ^{-1},\,\lambda <1, \\ g_{\lambda }(p) \big (U^{(\lambda )}g_{\lambda }\big )(p)&\le (const )\frac{1+\log (1+\lambda ^{-1})}{\lambda ^{-2}},\quad \text {for all } |p|> \lambda ^{-1},\,\lambda <1. \end{aligned}$$
We have applied Theorem 4.2 along with our conditions on \(g_{\lambda }\) to get
$$\begin{aligned} \big (U^{(\lambda )} g_{\lambda }\big )(x,p)\le c+c\int \limits _{0}^{|p|}dp^{\prime }p^{\prime }|g_{\lambda }(p^{\prime })|\le c+c'\int \limits _{0}^{|p|}dp^{\prime }\frac{p^{\prime }}{1+|p^{\prime }|^{2}}\le c+c''\log (1+|p|), \end{aligned}$$
(4.10)
for some \(c,c',c''>0\) and all \(\lambda <1\) and \(|p|\le \lambda ^{-1}\). The inequality above follows similarly for the domain \(|p|> \lambda ^{-1}\).
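The middle integral in (4.10) is where the logarithmic growth comes from: \(\int _{0}^{|p|}dp'\,\frac{p'}{1+p'^{2}}=\frac{1}{2}\log (1+p^{2})\le \log (1+|p|)\), since \(1+p^{2}\le (1+|p|)^{2}\). A quick numerical check (not from the paper), with a few arbitrary sample momenta:

```python
import math

# Verify int_0^P p'/(1+p'^2) dp' = (1/2)*log(1+P^2) <= log(1+P)
# by midpoint quadrature, for a few hypothetical momenta P.
for P in (0.5, 3.0, 25.0):
    n = 200000
    h = P / n
    integral = sum((i + 0.5) * h / (1.0 + ((i + 0.5) * h) ** 2)
                   for i in range(n)) * h
    closed = 0.5 * math.log(1.0 + P * P)
    assert abs(integral - closed) < 1e-6
    assert closed <= math.log(1.0 + P)   # since 1+P^2 <= (1+P)^2 for P >= 0
print("ok")
```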
Now we sketch the proof for the general case \(m>2\). For \(\epsilon _{j}\in \{<,=\}\) and \(j<m\), let the set \(\ell ^{(\tilde{n}_{1})}(\epsilon _{1},\ldots ,\epsilon _{m-1}) \) be the collection of all \((r_{1},\ldots ,r_{m})\in [0,\tilde{n}_{1}]^{m}\) satisfying the relations
$$\begin{aligned} r_{1}\,\epsilon _{1}\, r_{2} \,\ldots \epsilon _{m-1} \,r_{m}. \end{aligned}$$
Also define,
$$\begin{aligned} f_{(\epsilon _{1},\ldots , \epsilon _{m-1})}= A_{\epsilon _{1}}\cdots A_{\epsilon _{m-1}} g_{\lambda }, \end{aligned}$$
where \( A_{=},A_{<}\) are maps on \(L^{\infty }(\Sigma ,{\mathbb R}^{+})\) in which \(A_{=}\) is multiplication by \(g_{\lambda }\) and \(A_{<}=A_{=}(U^{(\lambda )}+B) \). We can write
$$\begin{aligned}&\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\Big (\sum _{n=0}^{\tilde{n}_{1}} g_{\lambda }(\sigma _{n}) \Big )^{m}\Big ]= \begin{array}{l}\text {Lin. comb. over}\\ (\epsilon _{1},\ldots ,\epsilon _{m-1})\in \{<,=\}^{m-1} \end{array}\quad \\&\quad \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sum _{ \ell ^{(\tilde{n}_{1})}(\epsilon _{1},\ldots ,\epsilon _{m-1})} g_{\lambda }(\sigma _{r_{1}})\cdots g_{\lambda }(\sigma _{r_{m}}) \Big ]. \end{aligned}$$
However, the following inequality holds:
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sum _{ \ell ^{(\tilde{n}_{1})}(\epsilon _{1},\ldots ,\epsilon _{m-1})} g_{\lambda }(\sigma _{r_{1}})\cdots g_{\lambda }(\sigma _{r_{m}}) \Big ]\le \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sum _{n=0}^{\tilde{n}_{1}} f_{(\epsilon _{1},\ldots , \epsilon _{m-1})}(\sigma _{n}) \Big ], \end{aligned}$$
(4.11)
because we can write the difference between the right and left side of (4.11) as a sum of positive terms \(\mathbf {c}_{v-1}-\mathbf {c}_{v}\) indexed by \(v\in [1,m-1]\), where
$$\begin{aligned} \mathbf {c}_{v}=\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sum _{ \ell ^{(\tilde{n}_{1})}(\epsilon _{1},\ldots ,\epsilon _{v})} g_{\lambda }(\sigma _{r_{1}})\cdots g_{\lambda }(\sigma _{r_{v}})f_{(\epsilon _{v+1},\ldots , \epsilon _{m-1})}(\sigma _{r_{v+1}}) \Big ]. \end{aligned}$$
When \(\epsilon _{v-1}\) is \(=\), then \(\mathbf {c}_{v-1}\) and \(\mathbf {c}_{v}\) are identically equal. When \(\epsilon _{v-1}\) is \(<\), then the difference \(\mathbf {c}_{v}-\mathbf {c}_{v-1}\) is equal to
$$\begin{aligned}&\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sum _{ \ell ^{(\tilde{n}_{1})}(\epsilon _{1},\ldots ,\epsilon _{v-1})} g_{\lambda }(\sigma _{r_{1}})\cdots g_{\lambda }(\sigma _{r_{v-1}})\Big (g_{\lambda }(\sigma _{r_{v}}) \sum _{n=r_{v}+1}^{\tilde{n}_{1}} f_{(\epsilon _{v+1},\ldots , \epsilon _{m-1})}(\sigma _{n}) -f_{(\epsilon _{v},\ldots , \epsilon _{m-1})}(\sigma _{r_{v}}) \Big ) \Big ]\\&= \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sum _{ \ell ^{(\tilde{n}_{1})}(\epsilon _{1},\ldots ,\epsilon _{v-1})} g_{\lambda }(\sigma _{r_{1}})\cdots g_{\lambda }(\sigma _{r_{v-1}}) \\&\times \Big (g_{\lambda }(\sigma _{r_{v}}) \tilde{\mathbb {E}}^{(\lambda )}\Big [\sum _{n=r_{v}+1}^{\tilde{n}_{1}} f_{(\epsilon _{v+1},\ldots , \epsilon _{m-1})}(\sigma _{n})\Big |\tilde{\mathcal {F}}_{\tau _{r_{v}}^{-}}\Big ] -f_{(\epsilon _{v},\ldots , \epsilon _{m-1})}(\sigma _{r_{v}}) \Big ) \Big ]\\&\le \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sum _{ \ell ^{(\tilde{n}_{1})}(\epsilon _{1},\ldots ,\epsilon _{v-1})} g_{\lambda }(\sigma _{r_{1}})\cdots g_{\lambda }(\sigma _{r_{v-1}})\\&\times \Big (g_{\lambda }(\sigma _{r_{v}})\big (U^{(\lambda )}f_{(\epsilon _{v+1},\ldots , \epsilon _{m-1})}\big )(\sigma _{r_{v}})+g_{\lambda }(\sigma _{r_{v}})B\big (f_{(\epsilon _{v+1},\ldots , \epsilon _{m-1})}\big ) -f_{(\epsilon _{v},\ldots , \epsilon _{m-1})}(\sigma _{r_{v}}) \Big ) \Big ] = 0, \end{aligned}$$
where the inequality follows from the strong Markov property and Lemma 4.4 by the same argument as in (4.9), and the final expression vanishes by the definition \(f_{(\epsilon _{v},\ldots , \epsilon _{m-1})}=A_{<}f_{(\epsilon _{v+1},\ldots , \epsilon _{m-1})}=g_{\lambda }\big (U^{(\lambda )}+B\big )f_{(\epsilon _{v+1},\ldots , \epsilon _{m-1})}\).
We are left to bound \(\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\big [\sum _{n=0}^{\tilde{n}_{1}} f_{(\epsilon _{1},\ldots , \epsilon _{m-1})}(\sigma _{n}) \big ]\). The worst-case scenario is when all of the \(\epsilon _{j}\) are equal to \(<\), because mere multiplication by \(g_{\lambda }(p)\) introduces more decay for large \(|p|\). By our conditions on \(g_{\lambda }\) and \(m-1\) applications of Theorem 4.2,
$$\begin{aligned} \big ((U^{(\lambda )})^{m-1}g_{\lambda }\big )(x,p)\le c^{m-1}\frac{\big (1+\log (1+|p|)\big )^{m-1}}{1+|p|^{2}},\quad |p|\le \lambda ^{-1}, \end{aligned}$$
and we get another bound for \(|p|>\lambda ^{-1}\) that is smaller than a fixed multiple of \(\lambda ^{-2m-1}\) for all \(\lambda <1\). Applying the inequality (4.7), we obtain the bound. \(\square \)

4.2 Inequalities for the Momentum Drift

The first two parts of the lemma below follow from the conservation of energy and the quadratic formula and do not depend on the potential being periodic. The third part of Lemma 4.6 is a statement about mixing on the torus: if the particle begins with a high momentum \(|P_{0}|\gg 1\) and is stopped at a random exponential time \(\tau \), then its distribution on the torus \(\mathbb {T}=[0,1)\) at the stopping time will be roughly uniform, even in the presence of the bounded periodic potential \(V(x)\).

Lemma 4.6

Let \((X_{t},\,P_{t})\) evolve according to the Hamiltonian \(H(x,p)=\frac{1}{2}p^{2}+V(x)\), for a positive potential \(V(x)\) with \(\sup _{x}\big |\frac{dV}{dx}(x)\big |<\infty \). If the initial momentum has \(|P_{0}|^{2}>4\sup _{x}V(x)\), then the difference \(P_{t}-P_{0}=-\int _{0}^{t}dr\frac{dV}{dx}(X_{r})\) satisfies the inequalities
  1. \(\sup _{t\in {\mathbb R}^{+}} \big |\int \limits _{0}^{t}dr\frac{dV}{dx}(X_{r}) \big |\le 2\sup _{x}V(x) |P_{0}|^{-1} \), and
  2. \(\Big | -\int \limits _{0}^{t}dr\frac{dV}{dx}(X_{r})-\frac{ V(X_{t})-V(X_{0})}{ P_{0}}\Big |\le 2t\sup _{x}\big |\frac{dV}{dx}(x)\big | \sup _{x}V(x) |P_{0}|^{-2} \).
  3. Suppose further that \(V(x)\) has period one. If \(\tau \) is exponentially distributed with mean \(\mathbf {r}^{-1}\) and \(F:\mathbb {T}\rightarrow {\mathbb R}\) is a bounded function on the torus, then
     $$\begin{aligned} \Big |\mathbb {E}_{(X_{0},P_{0})}\big [F(X_{\tau })\big ]-\int \limits _{\mathbb {T}}dx F(x)\Big |\le \mathbf {r}\Vert F\Vert _{\infty }|P_{0}|^{-1}+ O (|P_{0}|^{-2}). \end{aligned}$$

Proof

Part (1): Since \(|P_{0}|^{2}>4\sup _{x}V(x)\), the momentum \(P_{t}\) will not change sign at any time. By the conservation of energy
$$\begin{aligned} \frac{1}{2}\big |P_{0}+(P_{t}-P_{0})\big |^{2}-\frac{1}{2}P_{0}^{2}=-V(X_{t})+V(X_{0}). \end{aligned}$$
Using the quadratic formula and that \(P_{t},\,P_{0}\) have the same sign,
$$\begin{aligned} |P_{t}-P_{0}|&=\Big ||P_{0}|-\big (P_{0}^{2}+2V(X_{0})-2V(X_{t})\big )^{\frac{1}{2}}\Big |\nonumber \\&\le \Big | \frac{1}{2}\int \limits _{0}^{2V(X_{0})-2V(X_{t})}dw \big (P_{0}^{2}+w\big )^{-\frac{1}{2}}\Big |<\frac{2\sup _{x}V(x)}{|P_{0}|}, \end{aligned}$$
since \(\big (P^{2}_{0}+w\big )^{-\frac{1}{2}}\le \sqrt{2}|P_{0}|^{-1}<2|P_{0}|^{-1}\) for \(|w|\le \frac{1}{2}\,P_{0}^{2}\).
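Part (1) reduces to the elementary bound \(\big ||P_{0}|-\sqrt{P_{0}^{2}+\Delta }\big |\le 2\sup _{x}V(x)|P_{0}|^{-1}\) for \(|\Delta |\le 2\sup _{x}V(x)\). As a quick numerical sanity check of this algebraic inequality (outside the proof; the value of \(\sup _{x}V(x)\) below is an arbitrary stand-in):

```python
import math

def part1_gap(p0, dv):
    # |P_t - P_0| computed from conservation of energy,
    # where dv = 2V(X_0) - 2V(X_t)
    return abs(p0 - math.sqrt(p0 ** 2 + dv))

sup_v = 1.0  # arbitrary stand-in value for sup_x V(x)
for p0 in [2.1, 5.0, 20.0, 100.0]:          # all satisfy p0^2 > 4 sup_v
    for k in range(-100, 101):
        dv = 2.0 * sup_v * k / 100.0        # ranges over [-2 sup_v, 2 sup_v]
        assert part1_gap(p0, dv) <= 2.0 * sup_v / p0 + 1e-12
```

The grid over \(dv\) covers the full range of energy exchange allowed by a potential bounded by \(\sup _{x}V(x)\).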
Part (2): Using the identity \(V(X_{t})-V(X_{0})=\int \limits _{0}^{t}dr \frac{dV}{dx}(X_{r})P_{r}\), we have
$$\begin{aligned} \Big | \int \limits _{0}^{t}dr \frac{dV}{dx}(X_{r})-\frac{ V(X_{t})-V(X_{0})}{ P_{0}}\Big |&\le \int \limits _{0}^{t}dr \Big |\frac{dV}{dx}(X_{r})\big (1-\frac{P_{r}}{P_{0}} \big )\Big | \\&\le t |P_{0}|^{-1}\sup _{x}\big |\frac{dV}{dx}(x)\big |\sup _{r}|P_{r}-P_{0}\big |\\&\le 2t\sup _{x}\big |\frac{dV}{dx}(x)\big | \sup _{x}V(x)\, |P_{0}|^{-2} , \end{aligned}$$
where we applied Part (1) for the last inequality.
Part (3): Let \(d_{s}:\mathbb {T}\rightarrow {\mathbb R}^{+}\) be the density of the particle on the torus at time \(\tau \) starting from the point \(s=(X_{0},P_{0})\in \Sigma \). We have that
$$\begin{aligned} \mathbb {E}_{s}\big [F(X_{\tau }) \big ]=\int \limits _{\mathbb {T}}dx d_{s}(x) F(x). \end{aligned}$$
This leads to the simple bound
$$\begin{aligned} \Big | \mathbb {E}_{s}\big [F(X_{\tau }) \big ]- \int \limits _{\mathbb {T}}dx F(x) \Big | \le \Vert F\Vert _{\infty }\Vert d_{s}-1\Vert _{1}. \end{aligned}$$
(4.12)
Thus it is sufficient to bound the \(1\)-norm of \(d_{s}-1\), and, in fact, our bounds can be made in the supremum norm. Our method for bounding (4.12) will be to analyze a closed form for \(d_{s}(x)\), which is available due to the periodic form of the particle’s trajectory \(X_{t}\), \(t\ge 0\).
Notice that \(d_{s}\) can be written as
$$\begin{aligned} d_{s}(a)=\sum _{n=1}^{\infty } \frac{ \mathbf {r}\,e^{-\mathbf {r}\,t_{n}(a)}}{ |P_{t_{n}(a)}|}= \frac{ \mathbf {r} e^{-\mathbf {r}\,t_{1}(a)}}{ |P_{t_{1}(a)}|}\sum _{n=0}^{\infty }\,e^{-\mathbf {r}\,n\Delta }=\frac{ \mathbf {r} e^{-\mathbf {r}\,t_{1}(a)}}{ |P_{t_{1}(a)}| \big (1- e^{-\mathbf {r}\Delta } \big ) }, \end{aligned}$$
(4.13)
where \(t_{1}(a)< t_{2}(a)<\cdots \) is the periodic sequence of times at which \(X_{t}\,\mathrm{mod}\,1=a\) and \(\Delta \) is the increment between successive times \(t_{n}(a)\). These times exist for every \(a\in \mathbb {T}\) as long as \(H(s)>\sup _{x}V(x) \). If \(4\sup _{x}V(x)\le P_{0}^{2}\), then \(|P_{t}-P_{0}|\le 2\big (\sup _{x}V(x)\big )|P_{0}|^{-1}\) by Part (1). Thus for large initial momentum, \(|P_{0}|\gg (\sup _{x}V(x))^{\frac{1}{2}}\), the momentum process \(P_{r}\) is nearly constant \(\approx P_{0}\) and the period \(\Delta \) is close to \(\frac{1}{|P_{0}|}\). To get a precise bound for the difference between \(\Delta \) and \(\frac{1}{|P_{0}|}\), notice that when \(|P_{0}|\) is large enough so that \(|P_{t}-P_{0}|\le 2\sup _{x}V(x)|P_{0}|^{-1}<\frac{1}{2}|P_{0}|\), then \(\frac{1}{2|P_{0}|}\le \Delta \le \frac{2}{|P_{0}|}\), since the particle always travels with speed \(|P_{t}|\in [\frac{1}{2}|P_{0}|,\frac{3}{2}|P_{0}|]\). Hence, the difference between \(\Delta \) and \(\frac{1}{|P_{0}|}\) satisfies
$$\begin{aligned} \Big |\Delta -\frac{1}{|P_{0}|}\Big |\le \frac{1}{|P_{0}|}\Big |\int \limits _{0}^{\Delta }dr\, P_{0}-S(P_{0})\Big | \le \frac{1}{|P_{0}|} \int \limits _{0}^{\Delta }dr |P_{r}- P_{0}| \le \frac{4\sup _{x}V(x)}{|P_{0}|^{3}},\quad \end{aligned}$$
(4.14)
where the second inequality uses that \(\int _{0}^{\Delta }drP_{r}=S(P_{0})\). Using the triangle inequality
$$\begin{aligned} \big | d_{s}(a)-1 \big |&\le \Big |d_{s}(a)-\frac{ \mathbf {r} e^{-\mathbf {r}\,t_{1}(a)}}{ |P_{0}| \big (1- e^{-\mathbf {r}\Delta } \big ) } \Big |+\Big |\frac{ \mathbf {r} e^{-\mathbf {r}\,t_{1}(a)}}{ |P_{0}| \big (1- e^{-\mathbf {r}\Delta } \big ) }-\frac{ \mathbf {r} e^{-\mathbf {r}\,t_{1}(a)}}{ |P_{0}| \big (1- e^{-\frac{\mathbf {r}}{ |P_{0}|}} \big ) }\Big |\nonumber \\&+ \Big |\frac{ \mathbf {r} e^{-\mathbf {r}\,t_{1}(a)}}{ |P_{0}| \big (1- e^{-\frac{\mathbf {r}}{|P_{0}|}} \big ) }-1\Big |\nonumber \\&\le \frac{2\mathbf {r}}{|P_{0}|}+ O (\frac{1}{|P_{0}|^{2}}), \end{aligned}$$
(4.15)
where the last inequality follows by further computations using the inequalities above. For instance, we can bound the first term on the first line of (4.15) by
$$\begin{aligned} \Big |d_{s}(a)-\frac{ \mathbf {r} e^{-\mathbf {r}\,t_{1}(a)}}{ |P_{0}| \big (1- e^{- \mathbf {r}\Delta } \big ) } \Big |\le \frac{\big | P_{t_{1}(a)}- P_{0}\big |}{|P_{t_{1}(a)}|\, |P_{0}|}\, \frac{ \mathbf {r}}{ \big (1- e^{- \mathbf {r}\Delta } \big ) } \le \frac{ 4 \sup _{x}V(x)}{ \Delta |P_{0}|^{3}} \le \frac{ 8 \sup _{x}V(x)}{ |P_{0}|^{2}} , \end{aligned}$$
where the inequalities hold for sufficiently large \(|P_{0}|\). The first inequality above follows from Part (1) and the second inequality uses that \(\Delta \ge \frac{1}{2|P_{0}|}\) by the remark above (4.14). \(\square \)
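In the free case \(V\equiv 0\), the closed form (4.13) can be evaluated explicitly (with \(\Delta =1/|P_{0}|\) exactly), which gives a simple numerical illustration of the \(\mathbf {r}\Vert F\Vert _{\infty }|P_{0}|^{-1}\) mixing rate in Part (3); the values of \(x_{0}\), \(\mathbf {r}\), and the momenta below are arbitrary stand-ins:

```python
import math

def l1_deviation(x0, p0, r, n=20000):
    """L1 norm of d_s - 1 on the torus for free motion X_t = x0 + p0*t
    (p0 > 0) stopped at an Exp(r) time; d_s is (4.13) with Delta = 1/p0."""
    total = 0.0
    norm = r / (p0 * (1.0 - math.exp(-r / p0)))
    for i in range(n):
        a = (i + 0.5) / n                 # midpoint grid on the torus
        t1 = ((a - x0) % 1.0) / p0        # first hitting time of a
        total += abs(norm * math.exp(-r * t1) - 1.0) / n
    return total

# Part (3) predicts ||d_s - 1||_1 <= r/|P_0| + O(|P_0|^{-2}); numerically the
# deviation is close to r/(4 p0) for large p0, comfortably inside the bound.
for p0 in [5.0, 20.0, 100.0]:
    assert l1_deviation(0.3, p0, 1.0) <= 1.0 / p0
```

The same computation with a nonzero bounded \(V\) would only perturb \(t_{1}(a)\) and \(\Delta \) at the orders controlled in (4.14) and (4.15).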
Define the functions \(\mathbf {C}^{(\lambda )}_{n}:\Sigma \rightarrow {\mathbb R}\),
$$\begin{aligned} \mathbf {C}^{(\lambda )}_{0}(s)&= \tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\chi (Z_{\tau _{1}}=0) \int \limits _{\tau _{1}}^{\tau _{2}}dr\frac{dV}{dx}(X_{r}) \Big ], \\ \mathbf {C}^{(\lambda )}_{n}(s)&= \tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\Big (\chi (Z_{\tau _{1}}=0) \int \limits _{\tau _{1}}^{\tau _{2}}dr\frac{dV}{dx}(X_{r}) - \mathbf {C}^{(\lambda )}_{0}(s) \Big )^{2n}\Big ], \quad n\ge 1, \end{aligned}$$
where \(\tau _{1},\tau _{2}\) are the first two partition times and \(\tilde{\delta }_{s}= \big (1-h(s)\big )\delta _{(s,0)}+h(s)\delta _{(s,1)} \), i.e., the splitting of the \(\delta \)-distribution at \(s\in \Sigma \). The presence of the factor \(\chi (Z_{\tau _{1}}=0)\) in the above definitions is a small technical precaution, and if \(\chi (Z_{\tau _{1}}=0)\) is removed in the formula for \(\mathbf {C}^{(\lambda )}_{0}(s) \), then we have
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\int \limits _{\tau _{1}}^{\tau _{2}}dr \frac{dV}{dx}(X_{r}) \Big ]=\mathbb {E}_{s}^{(\lambda )}\Big [\int \limits _{\tau _{1}}^{\tau _{2}}dr\frac{dV}{dx}(X_{r}) \Big ]=\mathbb {E}_{s}^{(\lambda )}\Big [\int \limits _{0}^{\infty }dt\,t\,e^{-t}\frac{dV}{dx}(S_{t}) \Big ]. \end{aligned}$$
Part (1) of Lemma 4.6 is the main tool in the proof of Part (1) of Lemma 4.7, and the proof for Part (2) of Lemma 4.7 makes use of Parts (2) and (3) of Lemma 4.6 with \(F(x):=V(x)\).

Lemma 4.7

For any \(n>1\), there exists a \(C>0\) such that for all \(\lambda <1\) and \((x,p)\in \Sigma \),
  1. \(\big | \mathbf {C}^{(\lambda )}_{n}(x,p) \big | \le C\max \big (\frac{1}{1+|p|^{2n}}, \lambda ^{2n} \big ) \),
  2. \(\big | \mathbf {C}^{(\lambda )}_{0}(x,p)\big |\le C\max \big (\frac{1}{1+|p|^{2}}, \lambda \big )\).

Proof

Part (1): For \(v=2n\), notice that \(\mathbf {C}^{(\lambda )}_{n}(s)\) is smaller than
$$\begin{aligned} \mathbf {C}^{(\lambda )}_{n}(s)\le \tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\Big |\int \limits _{\tau _{1}}^{\tau _{2}}dr \frac{dV}{dx}(X_{r}) \Big |^{v} \Big ]=\mathbb {E}_{s}^{(\lambda )}\Big [\Big |\int \limits _{\tau _{1}}^{\tau _{2}}dr \frac{dV}{dx}(X_{r}) \Big |^{v} \Big ], \end{aligned}$$
(4.16)
where the equality holds since the initial distribution \(\tilde{\delta }_{s}\) is the splitting of \(\delta _{s}\), and the argument of the expectations only depends on the original (pre-split) statistics. The quantity on the right side of (4.16) is closely related to Part (1) of Lemma 4.6 except that the momentum now makes random jumps and the limits of integration \(\tau _{1}\), \(\tau _{2}\) are also random. The randomness of the limits of integration is not very important here except that the integration interval should not be too long, so, for simplicity, we will bound \(\mathbb {E}_{s}^{(\lambda )}\big [\big |\int _{0}^{\tau _{1}}dr\frac{dV}{dx}(X_{r}) \big |^{v} \big ]\) rather than the expression on the right side of (4.16).
The analysis must be split into cases based on the size of the initial momentum \(p\), since a particle with high momentum \(|p|\gg \lambda ^{-1}\), \(\lambda \ll 1\), will tend to receive many collisions in a small amount of time, in contrast with the situation \(|p|\le \lambda ^{-1}\), where only a few collisions are likely to occur in the time interval \([0,\tau _{1}]\). In the high momentum situation \(|p|\gg \lambda ^{-1}\), the absolute value of the momentum is likely to drift downwards due to the higher frequency of collisions with oncoming particles. We will bound \( \mathbb {E}_{s}^{(\lambda )}\big [\big |\int _{0}^{\tau _{1}}dr\,\frac{dV}{dx}(X_{r}) \big |^{v} \big ]\) for \(s=(x,p)\) in the following three regimes:
  (i) arbitrary \(p\),
  (ii) \(1\ll |p|\le \lambda ^{-1}\),
  (iii) \( \lambda ^{-1}<|p|\).
We will use that the escape rate function \(\mathcal {E}_{\lambda }(p):=\int \limits _{{\mathbb R}}dp'\mathcal {J}_{\lambda }(p,p')\) has bounds of the form
$$\begin{aligned} \frac{1}{8(1+\lambda )} \le \mathcal {E}_{\lambda }(p)\le \frac{1}{8(1+\lambda )}\big (1+C\lambda |p|\big ) \end{aligned}$$
(4.17)
for some \(C>0\) and all \(\lambda <1\) and \(p\in {\mathbb R}\), which can be deduced easily from the form of the jump rates \(\mathcal {J}_{\lambda }(p,p')\). We will also use that for small \(\lambda <1\)
$$\begin{aligned} \sup _{|p^{\prime }|>\frac{1}{\lambda }}\frac{ \int _{[-\frac{1}{\lambda }, \frac{1}{\lambda }]}dp^{\prime \prime } \frac{1}{1+|p^{\prime \prime } |^{v}}\mathcal {J}_{\lambda }(p^{\prime } , p^{\prime \prime })}{ \int _{[-\frac{1}{\lambda },\frac{1}{\lambda }]}dp^{\prime \prime } \mathcal {J}_{\lambda }(p^{\prime } , p^{\prime \prime })} = \mathcal {O}(\lambda ^{v}). \end{aligned}$$
(4.18)
The order equality (4.18) holds because the conditional distribution for a momentum jump starting from a momentum \(|p'|>\frac{1}{\lambda }\) and conditioned to jump to a value \(|p''|\le \frac{1}{\lambda }\) will be concentrated in the vicinity of the border \(|p''|\approx \frac{1}{\lambda }\) where \(\frac{1}{1+|p^{\prime \prime } |^{v}}= \mathcal {O}(\lambda ^{v})\). This is a consequence of the exponential decay found in the form of the jump rates \(\mathcal {J}_{\lambda }(p,p')\).
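The concentration mechanism behind (4.18) can be illustrated numerically with a stand-in kernel that merely shares the Gaussian decay of \(\mathcal {J}_{\lambda }(p,p')\) (an assumption for illustration only; the true jump rates are not reproduced here):

```python
import math

def conditional_ratio(p_prime, lam, v, n=50000):
    # numerator / denominator of (4.18) for the stand-in kernel
    #   J(p', p'') = exp(-(p'' - p')^2 / 2)   (Gaussian tails only)
    lo, hi = -1.0 / lam, 1.0 / lam
    h = (hi - lo) / n
    num = den = 0.0
    for i in range(n + 1):                 # trapezoid rule on [-1/lam, 1/lam]
        pp = lo + i * h
        w = 0.5 if i in (0, n) else 1.0
        j = math.exp(-0.5 * (pp - p_prime) ** 2)
        num += w * j / (1.0 + abs(pp) ** v)
        den += w * j
    return num / den

# Conditioned to land in [-1/lam, 1/lam] from just above the border, the jump
# concentrates near |p''| = 1/lam, where 1/(1+|p''|^v) = O(lam^v).
v = 2
for lam in [0.2, 0.1, 0.05]:
    assert conditional_ratio(1.0 / lam + 0.5, lam, v) <= 3.0 * lam ** v
```

The fixed multiple \(3\) in the assertion plays the role of the implied constant in \(\mathcal {O}(\lambda ^{v})\).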
  (i) For arbitrary \(s\in \Sigma \), we have
    $$\begin{aligned} \mathbb {E}^{(\lambda )}_{s}\Big [\Big |\int \limits _{0}^{\tau _{1}}dr \frac{dV}{dx}(X_{r})\Big |^{v} \Big ] \le \sup _{x\in \mathbb {T}}\big |\frac{dV}{dx}(x)\big |^{v} \mathbb {E}\big [\tau _{1}^{v}\big ] \le v!\,\sup _{x}\big |\frac{dV}{dx}(x)\big |^{v}, \end{aligned}$$
    (4.19)
    since \(\tau _{1}\) is a mean one exponential.
  (ii) Next we consider \(s=(x,p)\) for the regime \(1\ll |p|<\lambda ^{-1}\). As long as the momentum stays below \(2\lambda ^{-1}\) over the time interval \([0,\tau _{1}]\), the collisions will occur with Poisson rate smaller than \(\mathcal {E}_{\lambda }(2\lambda ^{-1})\), which is uniformly finite by (4.17). Thus, in that case, the expected number of collisions up to time \(\tau _{1}\) is uniformly finite for \(\lambda <1\), and as a consequence the momentum of the particle will not fluctuate significantly from its initial value \(p\). To show that \(|p|\) typically stays well below \(2\lambda ^{-1}\), let us bound the probability of the event that \(|P_{r}|\notin \big [\frac{1}{2}|p|,\frac{3}{2}|p|]\) for some \(r\le \tau _{1}\):
    $$\begin{aligned}&\mathbb {P}_{s}^{(\lambda )}\Big [|P_{r}|\notin \big [\frac{1}{2}|p|,\frac{3}{2}|p|\big ] \text { for some } r\le \tau _{1} \Big ]\nonumber \\&\quad \le \left( \frac{2}{|p|}\right) ^{w}\mathbb {E}^{(\lambda )}_{s}\Big [\sup _{0\le r\le \varsigma \wedge \tau _{1}}\big | P_{r}-p\big |^{w} \Big ] \nonumber \\&\quad \le \left( \frac{4}{|p|}\right) ^{w}\sup _{\begin{array}{c} (x,p)\\ |p|\le \lambda ^{-1} \end{array}}\mathbb {E}^{(\lambda )}_{(x,p)}\Big [\sup _{0\le r\le \varsigma \wedge \tau _{1}}\big | J_{r}\big |^{w} \Big ] + w!\,\left( \frac{4}{|p|}\right) ^{w}\sup _{x}\left| \frac{dV}{dx}(x)\right| ^{w}, \end{aligned}$$
    (4.20)
    where \(w\ge 1\), \(\varsigma \) is the first jump time at which \(|P_{r}|\) leaves \( \big [\frac{1}{2}|p|,\frac{3}{2}|p|\big ]\), and \(J_{r}=P_{r}-p+\int _{0}^{r}dr' \frac{dV}{dx}(X_{r'})\) is the sum of the momentum jumps up to time \(r\). The first inequality in (4.20) is Chebyshev’s, and for the second inequality, we have used \((x+y)^{w}\le 2^{w}(x^{w}+y^{w})\) and (4.19) to bound the contribution of the potential drift. The probability densities of individual momentum jumps conditioned to jump from momentum \(\hat{p}\), \(d_{\hat{p}}(p')=\frac{ \mathcal {J}_{\lambda }(\hat{p},p')}{ \int _{\Sigma }dp'' \mathcal {J}_{\lambda }(\hat{p},p'')} \), have uniformly controlled Gaussian tails for \(|\hat{p}|\le 2\lambda ^{-1} \), and the jumps occur with Poisson rate \(\mathcal {E}_{\lambda }(P_{r})\le \mathcal {E}_{\lambda }(2\lambda ^{-1})\) for \(r\le \varsigma \). Thus the expectation of \(\sup _{0\le r\le \varsigma \wedge \tau _{1}}\big | J_{r}\big |^{w}\) above is uniformly finite. Since \(w\ge 1\) is arbitrary, it follows that the probability on the first line of (4.20) decays super-polynomially quickly for \(|p|\gg 1\). Now we bound \(\mathbb {E}_{s}^{(\lambda )}\big [\big |\int _{0}^{\tau _{1}}dr\,\frac{dV}{dx}(X_{r}) \big |^{v} \big ]\). Define the times \(t_{n}^{\prime }=t_{n}\wedge \tau _{1}\wedge \varsigma \), where \(t_{n}\) is the time of the \(n\)th momentum jump. By writing
    $$\begin{aligned} \int \limits _{0}^{\tau _{1}}dr\,\frac{dV}{dx}(X_{r}) = \sum _{n=0}^{\infty } \int \limits _{t_{n}^{\prime }}^{t_{n+1}^{\prime }}dr\frac{dV}{dx}(X_{r})+\chi (\varsigma \le \tau _{1}) \int \limits _{\varsigma }^{\tau _{1}}dr\,\frac{dV}{dx}(X_{r}), \end{aligned}$$
    we can apply the triangle inequality to get
    $$\begin{aligned}&\mathbb {E}_{s^{\prime }}^{(\lambda )}\Big [\Big |\int \limits _{0}^{\tau _{1}}dr \frac{dV}{dx}(X_{r}) \Big |^{v} \Big ]^{\frac{1}{v}} \nonumber \\&\le \mathbb {E}_{s^{\prime }}^{(\lambda )}\Big [\Big (\sum _{n=0}^{\infty } \Big |\int \limits _{t_{n}^{\prime }}^{t_{n+1}^{\prime }}dr \frac{dV}{dx}(X_{r}) \Big |\Big )^{v}\Big ]^{\frac{1}{v}}+\mathbb {E}_{s^{\prime }}^{(\lambda )}\Big [\chi (\varsigma \le \tau _{1}) \Big | \int \limits _{\varsigma }^{\tau _{1}}dr \frac{dV}{dx}(X_{r}) \Big |^{v} \Big ]^{\frac{1}{v}} \nonumber \\&\le \mathbb {E}_{s^{\prime }}^{(\lambda )}\Big [\Big (\sum _{n=0}^{\infty } \Big |\int \limits _{t_{n}^{\prime }}^{t_{n+1}^{\prime }}dr\frac{dV}{dx}(X_{r}) \Big |\Big )^{v}\Big ]^{\frac{1}{v}}+\Big ((2\,v)!\,\sup _{x\in \mathbb {T}}\big |\frac{dV}{dx}(x)\big |^{2v} \Big )^{\frac{1}{2v}}\mathbb {P}_{s^{\prime }}^{(\lambda )}\big [\varsigma \le \tau _{1} \big ]^{\frac{1}{2v}},\nonumber \\ \end{aligned}$$
    (4.21)
    where the second inequality follows by Cauchy-Schwarz and because \(\tau _{1}\) is a mean one exponential. The probability \(\mathbb {P}_{s^{\prime }}^{(\lambda )}\big [\varsigma \le \tau _{1} \big ]\) decays faster than any polynomial by (4.20). The first term on the right side of (4.21) has the bound
    $$\begin{aligned} \mathbb {E}_{s^{\prime }}^{(\lambda )}\Big [\Big (\sum _{n=0}^{\infty } \Big |\int \limits _{t_{n}^{\prime }}^{t_{n+1}^{\prime }}dr \frac{dV}{dx}(X_{r}) \Big |\Big )^{v}\Big ]\le \Big (\frac{4\sup _{x\in \mathbb {T}}V(x)}{|p|}\Big )^{v}\mathbb {E}_{s^{\prime }}^{(\lambda )}\big [\mathcal {N}_{\varsigma }^{v} \big ], \end{aligned}$$
    (4.22)
    where \(\mathcal {N}_{t}\) is the number of collisions up to time \(t\). The above inequality uses the definition of the \(t_{n}^{\prime }\)’s to conclude that for each \(n\), either \(t_{n}^{\prime }=t_{n+1}^{\prime }\) so that \(\int _{t_{n}^{\prime }}^{t_{n+1}^{\prime }}dr\frac{dV}{dx}(X_{r})=0 \), or \( |P_{t_{n}^{\prime }}| \ge \frac{1}{2}|p|\) so that we can apply Part (1) of Lemma 4.6 to bound \(\big |\int _{t_{n}^{\prime }}^{t_{n+1}^{\prime }}dr\frac{dV}{dx}(X_{r})\big |\). The counting process \(\mathcal {N}_{t}\) has Poisson rate \(\mathcal {E}_{\lambda }(P_{t})\) at time \(t\). For times \(t<\varsigma \), we have that \(\mathcal {E}_{\lambda }(P_{t})\le \sup _{\lambda <1} \mathcal {E}_{\lambda }(2\lambda ^{-1}):=\mathbf {r}\) and
    $$\begin{aligned} \mathbb {E}_{s^{\prime }}^{(\lambda )}\big [\mathcal {N}_{\varsigma }^{v} \big ]\le \mathbb {E}\big [(N^{\prime }_{\tau })^{v} \big ]=\frac{1}{1+\mathbf {r}}\sum _{n=0}^{\infty }n^{v}\left( \frac{\mathbf {r}}{1+\mathbf {r}} \right) ^{n}<\infty , \end{aligned}$$
    where \(N_{t}^{\prime }\) is a Poisson process with rate \(\mathbf {r}\) and the random variable \(\tau \) is mean one, exponentially distributed, and independent of \(N_{t}^{\prime }\). The first inequality can be seen by a construction \(N^{\prime }_{\tau }\approx \mathcal {N}_{\varsigma }+\mathcal {N}_{\tau }^{\prime }\) for a jump process \(\mathcal {N}^{\prime }_{r}\) with Poisson jump rate \( \mathbf {r}-\mathcal {E}_{\lambda }(P_{t})\) for \(t\le \varsigma \) and rate \(\mathbf {r}\) for \(t>\varsigma \) whose jumps are decided independently of the jumps of \(\mathcal {N}_{r}\).
     
  (iii) For the regime \(|p|>\lambda ^{-1}\), our analysis must treat the possibility that many collisions occur over the time interval \([\tau _{1},\tau _{2}]\) (specifically, when \(|p|\gg \lambda ^{-1}\)). Let \(\vartheta =\tau _{1}\wedge \vartheta ^{\prime } \), where \(\vartheta ^{\prime }\) is the first time at which the absolute value of the momentum \(|P_{t}|\) jumps below \(\lambda ^{-1}\). The hitting time \(\vartheta ^{\prime } \) is finite and, in fact, has an expectation that is bounded by a multiple of \(\lambda ^{-1}\) independently of the initial momentum \(|p|>\lambda ^{-1}\); however, these points do not matter for this proof. Let \(\varphi _{s}\) be the distribution on \(\mathbb {T}\times [-\lambda ^{-1},\lambda ^{-1}]\) of \((X_{\vartheta ^{\prime }},P_{\vartheta ^{\prime }})\) starting from \(s\in \Sigma \). By the triangle inequality and the strong Markov property
    $$\begin{aligned} \mathbb {E}_{s}^{(\lambda )}\Big [\Big |\int \limits _{0}^{\tau _{1}}dr \frac{dV}{dx}(X_{r})\Big |^{v}\Big ]^{\frac{1}{v}}&\le \mathbb {E}_{s}^{(\lambda )}\Big [\Big |\int \limits _{0}^{\vartheta }dr\frac{dV}{dx}(X_{r})\Big |^{v}\Big ]^{\frac{1}{v}}\nonumber \\&+ \mathbb {E}_{\varphi _{s}}^{(\lambda )}\Big [\Big |\int \limits _{0}^{\tau _{1}}dr \frac{dV}{dx}(X_{r})\Big |^{v}\Big ]^{\frac{1}{v}}. \end{aligned}$$
    (4.23)
     
For the first term on the right side on (4.23), we can write
$$\begin{aligned} \mathbb {E}_{s}^{(\lambda )}\Big [\Big |\int \limits _{0}^{\vartheta }dr\frac{dV}{dx}(X_{r})\Big |^{v}\Big ]&= \mathbb {E}_{s}^{(\lambda )}\Big [\Big |\sum _{n=1}^{\mathcal {N}_{\vartheta }}\int \limits _{t_{n-1}}^{t_{n}}dr\,\frac{dV}{dx}(X_{r})\Big |^{v}\Big ]\\&\le 2^{v}\big (\sup _{x}V(x)\big )^{v} \mathbb {E}_{s}^{(\lambda )}\Big [\Big |\sum _{n=1}^{\mathcal {N}_{\vartheta }}\frac{1}{|P_{t_{n}^{-}}|} \Big |^{v}\Big ]\\&\le \frac{1}{c}\lambda ^{v} \mathbb {E}_{s}^{(\lambda )}\Big [\Big |\sum _{n=1}^{\mathcal {N}_{\vartheta }}\frac{1}{\mathcal {E}_{\lambda }(P_{t_{n}^{-}})} \Big |^{v}\Big ]. \end{aligned}$$
The first inequality is Part (1) of Lemma 4.6, which is applied for the Hamiltonian evolution on each interval \([t_{n-1},t_{n})\). The second inequality holds since there is a \(c>0\) such that \( \mathcal {E}_{\lambda }(p)\le c\lambda |p|\) for all \(|p|\ge \lambda ^{-1}\) as a consequence of (4.17). Notice that the difference \(\sum _{n=1}^{\mathcal {N}_{r}}\frac{1}{\mathcal {E}_{\lambda }(P_{t_{n}^{-}})} -r\) is a martingale with predictable quadratic variation \(\int _{0}^{r}ds \frac{1}{\mathcal {E}_{\lambda }(P_{s})}\) since the counting process \(\mathcal {N}_{r}\) has jump rate \(\mathcal {E}_{\lambda }(P_{r})\). By the triangle inequality and the relation \(\vartheta \le \tau _{1}\),
$$\begin{aligned} \mathbb {E}_{s}^{(\lambda )}\Big [\Big |\sum _{n=1}^{\mathcal {N}_{\vartheta }}\frac{1}{\mathcal {E}_{\lambda }(P_{t_{n}^{-}})} \Big |^{v}\Big ]^{\frac{1}{v}}&\le \mathbb {E}_{s}^{(\lambda )}\Big [\Big |\sum _{n=1}^{\mathcal {N}_{\tau _{1}}}\frac{1}{\mathcal {E}_{\lambda }(P_{t_{n}^{-}})} -\tau _{1} \Big |^{v}\Big ]^{\frac{1}{v}}+ \mathbb {E}_{s}^{(\lambda )}\big [\tau _{1}^{v}\big ]^{\frac{1}{v}} \\&\le C'\mathbb {E}_{s}^{(\lambda )}\Big [\Big | \int \limits _{0}^{\tau _{1}}ds\frac{1}{ \mathcal {E}_{\lambda }(P_{s})} \Big |^{v} \Big ]^{\frac{1}{v}}+\mathbb {E}_{s}^{(\lambda )}\Big [\sup _{1\le n\le \mathcal {N}_{\tau _{1}}} \frac{1}{\big |\mathcal {E}_{\lambda }(P_{t_{n}^{-}}) \big |^{v}} \Big ]^{\frac{1}{v}}\\&+\,\mathbb {E}_{s}^{(\lambda )}\big [\tau _{1}^{v}\big ]^{\frac{1}{v}}\\&\le 8(C'+1)(1+\lambda )(v!)^{\frac{1}{v}}+(v!)^{\frac{1}{v}}, \end{aligned}$$
where the constant \(C'\) arises from an application of Rosenthal’s inequality, and the third inequality holds since \(\mathcal {E}_{\lambda }(p) \ge \frac{1}{8(1+\lambda )}\) by (4.17) and because the random variable \(\tau _{1}\) is exponential with mean one.
For the second term on the right side of (4.23), we can apply our results (i) and (ii) above to guarantee the existence of a \(C>0\) such that
$$\begin{aligned} \mathbb {E}_{\varphi _{s}}^{(\lambda )}\Big [\Big |\int \limits _{0}^{\tau _{1}}dr\frac{dV}{dx}(X_{r})\Big |^{v}\Big ]&\le C\int \limits _{\Sigma }d\varphi _{s}(x^{\prime },p^{\prime })\frac{1}{1+|p^{\prime } |^{v}}\nonumber \\&\le C\sup _{|p^{\prime }|>\frac{1}{\lambda }}\frac{\int _{[-\frac{1}{\lambda }, \frac{1}{\lambda }]}dp^{\prime \prime } \frac{1}{1+|p^{\prime \prime } |^{v}}\mathcal {J}_{\lambda }(p^{\prime }, p^{\prime \prime })}{ \int _{[-\frac{1}{\lambda },\frac{1}{\lambda }]}dp^{\prime \prime } \mathcal {J}_{\lambda }(p^{\prime } , p^{\prime \prime })} , \end{aligned}$$
(4.24)
where the third expression should be understood as the supremum over all \(|p^{\prime }|>\lambda ^{-1}\) for the expectation of \(\frac{1}{1+|P_{\tau }|^{v}}\) conditioned on \(p^{\prime }=P_{\tau ^{-}} \). The final term in (4.24) is \(\mathcal {O}(\lambda ^{v})\) by (4.18).
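The moment bound in regime (ii) rests on the elementary fact that a rate-\(\mathbf {r}\) Poisson count stopped at an independent mean one exponential time is geometric, \(\mathbb {P}\big [N^{\prime }_{\tau }=n\big ]=\frac{1}{1+\mathbf {r}}\big (\frac{\mathbf {r}}{1+\mathbf {r}}\big )^{n}\), so all of its moments are finite. A direct numerical check of this identity (with an arbitrary stand-in rate):

```python
import math

def pmf_by_quadrature(n, r, tmax=60.0, steps=200000):
    # P[N'_tau = n] = ∫_0^∞ e^{-t} e^{-r t} (r t)^n / n!  dt,  tau ~ Exp(1),
    # computed by the trapezoid rule on a truncated range
    h = tmax / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * math.exp(-(1.0 + r) * t) * (r * t) ** n / math.factorial(n)
    return total * h

def pmf_geometric(n, r):
    # closed form appearing in the display above
    return (1.0 / (1.0 + r)) * (r / (1.0 + r)) ** n

r = 2.0
for n in range(6):
    assert abs(pmf_by_quadrature(n, r) - pmf_geometric(n, r)) < 1e-5
```

The geometric tail makes the series \(\sum _{n}n^{v}\big (\frac{\mathbf {r}}{1+\mathbf {r}}\big )^{n}\) convergent for every \(v\), which is the only property the proof uses.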

Part (2): We now seek to take full advantage of the averaging that results from integrating \( \frac{dV}{dx}(X_{r}) \) between two random times \(r\in [\tau _{1},\tau _{2}]\). If only the upper limit of integration were random, such as for the expression \( \big | \mathbb {E}_{s}^{(\lambda )}\big [\int _{0}^{\tau _{1}}dr\frac{dV}{dx}(X_{r}) \big ] \big |\), then we would only have an upper bound proportional to \(\max \big (\frac{1}{1+|p|}, \lambda \big )\). The bound for \( \mathbf {C}^{(\lambda )}_{0}(s) \) in the region \(|p|\ge \lambda ^{-1}\) follows from Part (1), so we will focus our analysis on the regime \(1 \ll | p| < \lambda ^{-1}\). We will proceed by approximating the quantity \( \mathbf {C}^{(\lambda )}_{0}(s) \) by expressions that are progressively easier to analyze.

By (4.19), we have a uniform upper bound for \(\sup _{s}|\mathbf {C}_{0}^{(\lambda )}(s)|\). The difference between \(\mathbf {C}^{(\lambda )}_{0}(s)\) and \(\mathbb {E}_{s}^{(\lambda )}\big [\int _{\tau _{1}}^{\tau _{2}}dr\frac{dV}{dx}(X_{r})\big ]\) is small when \(|p|\gg 1\) since
$$\begin{aligned} \Big |&\mathbb {E}_{s}^{(\lambda )}\Big [\int \limits _{\tau _{1}}^{\tau _{2}}dr\frac{dV}{dx}(X_{r})\Big ]-\mathbf {C}^{(\lambda )}_{0}(s) \Big | =\Big |\tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\int \limits _{\tau _{1}}^{\tau _{2}}dr\frac{dV}{dx}(X_{r})\Big ]-\mathbf {C}^{(\lambda )}_{0}(s) \Big |\nonumber \\&=\Big |\tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\chi (z_{\tau _{1}}=1)\int \limits _{\tau _{1}}^{\tau _{2}}dr\frac{dV}{dx}(X_{r})\Big ] \Big | \le \tilde{\mathbb {E}}_{\tilde{\delta }_{s}}\Big [\Big |\int \limits _{\tau _{1}}^{\tau _{2}}dr\frac{dV}{dx}(X_{r})\Big |^{2}\Big ]^{\frac{1}{2}}\tilde{\mathbb {P}}_{\tilde{\delta }_{s}}^{(\lambda )}\big [z_{\tau _{1}}=1 \big ]^{\frac{1}{2}} \nonumber \\&=\mathbb {E}_{s}\Big [\Big |\int \limits _{\tau _{1}}^{\tau _{2}}dr\frac{dV}{dx}(X_{r})\Big |^{2}\Big ]^{\frac{1}{2}}\mathbb {E}_{s}^{(\lambda )}\big [h(S_{\tau _{1}}) \big ]^{\frac{1}{2}} \le 2^{\frac{1}{2}}\sup _{x}\big |\frac{dV}{dx}(x)\big |^{\frac{1}{2}}\mathbb {E}_{s}^{(\lambda )}\big [h(S_{\tau _{1}}) \big ]^{\frac{1}{2}}. \end{aligned}$$
(4.25)
The first and third equalities use that \( \mathbb {E}_{s}^{(\lambda )}=\tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\), and the identity \(\tilde{\mathbb {P}}_{\tilde{\delta }_{s}}^{(\lambda )}\big [z_{\tau _{1}}=1 \big ]= \mathbb {E}_{s}^{(\lambda )}\big [h(S_{\tau _{1}}) \big ] \) used in the third equality can be shown using Part (3) of Proposition 2.3. The first inequality is Cauchy-Schwarz, and the second inequality uses that \(\tau _{2}-\tau _{1}\) is a mean one exponential. The function \(h(s)\le 1\) has compact support, and there is a \(c>0\) such that \(\mathbb {E}_{(x,p)}^{(\lambda )}\big [h(S_{\tau _{1}}) \big ]\le ce^{-\lambda ^{-1}}\vee e^{-|p|} \). In fact, the bound can be given a Gaussian form as a consequence of the Gaussian tails found in the jump rates (1.2). It follows that the difference of \(\mathbf {C}^{(\lambda )}_{0}(s)\) and \(\mathbb {E}_{s}^{(\lambda )}\big [\int _{\tau _{1}}^{\tau _{2}}dr\frac{dV}{dx}(X_{r})\big ]\) is negligible for our purpose.
By the above remarks, we may work with \(\mathbb {E}_{s}^{(\lambda )}\big [\int _{\tau _{1}}^{\tau _{2}}dr\frac{dV}{dx}(X_{r})\big ]\). Now we will show that the difference of this term with the expression \(\frac{1}{p}\mathbb {E}_{s}^{(\lambda )}\big [V(X_{\tau _{2}})-V(X_{\tau _{1}}) \big ]\) is \( O (p^{-2})\). For \(p\) in the regime \(1\ll |p|\le \lambda ^{-1}\), define \(\varsigma \) as in Part (1) and define \(t_{n}\) as the sequence of collision times starting after \(\tau _{1}\) with \(t_{0}=\tau _{1}\), and \(t_{n}^{\prime }=t_{n}\wedge \varsigma \wedge \tau _{2} \). Similarly to Part (1), we can write
$$\begin{aligned} \int \limits _{\tau _{1}}^{\tau _{2}}dr\frac{dV}{dx}(X_{r}) = \sum _{n=0}^{\infty } \int \limits _{t_{n}^{\prime }}^{t_{n+1}^{\prime }}dr\frac{dV}{dx}(X_{r})+\chi (\varsigma \le \tau _{2}) \int \limits _{\varsigma \vee \tau _{1}}^{\tau _{2}}dr\frac{dV}{dx}(X_{r}). \end{aligned}$$
The difference between the expressions is bounded by
$$\begin{aligned} \Big | \mathbb {E}_{s}^{(\lambda )}\Big [\int \limits _{\tau _{1}}^{\tau _{2}}dr\frac{dV}{dx}(X_{r})\Big ]&- \frac{1}{p}\mathbb {E}_{s}^{(\lambda )}\big [V(X_{\tau _{2}})-V(X_{\tau _{1}}) \big ] \Big |\nonumber \\&\le \Big (\frac{\sup _{x}V(x)}{|p|} +\sup _{x}\big |\frac{dV}{dx}(x)\big | \Big )\mathbb {P}_{s}\big [\varsigma \le \tau _{2} \big ] \nonumber \\&+ \mathbb {E}_{s}^{(\lambda )}\Big [\sum _{n=0}^{\mathcal {N}_{\varsigma }-1}\Big | \int \limits _{t_{n}^{\prime }}^{t_{n+1}^{\prime }}dr\frac{dV}{dx}(X_{r})-\frac{V(X_{t_{n+1}^{\prime }})-V(X_{t_{n}^{\prime }})}{P_{t_{n}^{\prime }}} \Big | \Big ] \nonumber \\&+ \mathbb {E}_{s}^{(\lambda )}\Big [\sum _{n=0}^{\mathcal {N}_{\varsigma } -1}\big |V(X_{t_{n+1}^{\prime }})-V(X_{t_{n}^{\prime }})\big |\,\Big | \frac{ 1}{P_{t_{n}^{\prime }}} -\frac{1}{p}\Big | \Big ], \end{aligned}$$
(4.26)
where \(\mathcal {N}_{r}\), \(r\ge \tau _{1}\) is the number of collision times \(t_{n}\) in the interval \((\tau _{1},r]\), and the first term on the right side bounds the expectations of \(\frac{V(X_{\tau _{2}})-V(X_{\varsigma })}{p}\) and \( \chi (\varsigma \le \tau _{2}) \int _{\varsigma \vee \tau _{1}}^{\tau _{2}}dr\frac{dV}{dx}(X_{r})\). The inequality (4.26) follows by adding and subtracting terms \(\frac{V(X_{t_{n+1}^{\prime }})-V(X_{t_{n}^{\prime }})}{P_{t_{n}^{\prime }}}\) for \(n\in [0,\mathcal {N}_{\varsigma })\) and applying the triangle inequality. By the same analysis as in (4.20), \( \mathbb {P}_{s}\big [\varsigma \le \tau _{2} \big ]\) decays super-polynomially with \(|p|\gg 1\). We will bound the second and third lines of (4.26) below.
The second line of (4.26) has the bound
$$\begin{aligned}&\mathbb {E}_{s}^{(\lambda )}\Big [\sum _{n=0}^{\mathcal {N}_{\varsigma }-1}\Big | \int \limits _{t_{n}^{\prime }}^{t_{n+1}^{\prime }}dr\frac{dV}{dx}(X_{r})-\frac{V(X_{t_{n+1}^{\prime }})-V(X_{t_{n}^{\prime }})}{P_{t_{n}^{\prime }}} \Big | \Big ]\\&\le 2\sup _{x}\big |\frac{dV}{dx}(x)\big | \sup _{x}V(x)\, \mathbb {E}_{s}^{(\lambda )}\Big [\sum _{n=0}^{\mathcal {N}_{\varsigma }-1}\frac{ t_{n+1}^{\prime }- t_{n}^{\prime }}{ |P_{t_{n}^{\prime }}|^{2}} \Big ]\\&\le \frac{8}{|p|^{2}}\sup _{x}\big |\frac{dV}{dx}(x)\big | \sup _{x}V(x), \end{aligned}$$
where the second inequality uses that \(|P_{t_{n}^{\prime }}|\ge \frac{1}{2}|p|\), by definition, for \(n\le \mathcal {N}_{\varsigma }\), and also uses that
$$\begin{aligned} \sum _{n=0}^{\mathcal {N}_{\varsigma }}t_{n+1}^{\prime }-t_{n}^{\prime } =\varsigma -\tau _{1}\le \tau _{2}-\tau _{1}\quad \text {and hence}\quad \mathbb {E}_{s}^{(\lambda )}\Big [\sum _{n=0}^{\mathcal {N}_{\varsigma }}t_{n+1}^{\prime }-t_{n}^{\prime }\Big ]\le \mathbb {E}_{s}^{(\lambda )}\big [\tau _{2}-\tau _{1} \big ] =1. \end{aligned}$$
For the third line on the right side of (4.26),
$$\begin{aligned}&\mathbb {E}_{s}^{(\lambda )}\Big [\sum _{n=0}^{\mathcal {N}_{\varsigma }-1}\big |V(X_{t_{n+1}^{\prime }})-V(X_{t_{n}^{\prime }})\big |\,\Big | \frac{ 1}{P_{t_{n}^{\prime }}} -\frac{1}{p}\Big | \Big ] \\&\le \sup _{x}V(x) \,\mathbb {E}_{s}^{(\lambda )}\Big [\sum _{n=0}^{\mathcal {N}_{\varsigma }-1}\frac{\big | p-P_{t_{n}^{\prime }} |}{|pP_{t_{n}^{\prime }}| } \Big ] \le \frac{2}{|p|^{2}}\big (\sup _{x}V(x)\big ) \mathbb {E}_{s}^{(\lambda )}\Big [\sum _{n=0}^{\mathcal {N}_{\varsigma }-1}\big | p-P_{t_{n}^{\prime }} | \Big ] \\&\le \frac{2}{|p|^{2}}\big (\sup _{x}V(x)\big )\, \mathbb {E}_{s}^{(\lambda )}\Big [\sup _{0\le r\le \varsigma }\big | P_{r}-p\big |^{2}\Big ]^{\frac{1}{2}} \mathbb {E}_{s}^{(\lambda )}\big [\mathcal {N}_{\varsigma }^{2}\big ]^{\frac{1}{2}} = O (|p|^{-2}). \end{aligned}$$
The second inequality uses the definition of \(\varsigma \) to conclude that \(\frac{1}{2}|p|\le |P_{t_{n}^{\prime }}|\) for \(n\le \mathcal {N}_{\varsigma }\), and the third inequality is Cauchy-Schwarz. Arbitrary moments of \(\mathcal {N}_{\varsigma }\) are finite by (4.22).
Our final task is to bound the expression \(\frac{1}{p}\mathbb {E}_{s}^{(\lambda )}\big [V(X_{\tau _{2}})-V(X_{\tau _{1}}) \big ]\), and we only need to show that \(\big |\mathbb {E}_{s}^{(\lambda )}\big [V(X_{\tau _{2}})-V(X_{\tau _{1}}) \big ]\big |= O (|p|^{-1})\) for \(|p|\gg 1\). By the triangle inequality,
$$\begin{aligned} \big |\mathbb {E}^{(\lambda )}_{s}\big [V(X_{\tau _{2}})-V(X_{\tau _{1}}) \big ]\big |\le \Big |\mathbb {E}^{(\lambda )}_{s}\big [V(X_{\tau _{2}})\big ]-\!\!\int _{\mathbb {T}}dx\,V(x)\Big |\!+\!\Big |\mathbb {E}^{(\lambda )}_{s}\big [V(X_{\tau _{1}})\big ]-\!\!\int _{\mathbb {T}}dx\,V(x)\Big |. \end{aligned}$$
The terms on the right side are similar, so we will study the second. Bounding the difference between \(\mathbb {E}^{(\lambda )}_{s}\big [V(X_{\tau _{1}})\big ]\) and \(\int _{\mathbb {T}}dx V(x)\) is very close in spirit to Part (3) of Lemma 4.6, except that we must now treat a Hamiltonian evolution perturbed by random momentum kicks. As in Part (1), the assumption that \(|p|\le \lambda ^{-1}\) ensures that not many momentum kicks are likely to occur.

We can reconstruct the counting process \(\mathcal {N}_{t}\) for the number of collisions up to time \(\varsigma \) as follows. Let \(N^{\prime }\) be a Poisson clock with rate \(\mathbf {r}=\mathcal {E}_{\lambda }(2\lambda ^{-1})\) as in Part (1). The Poisson rate of jumps \(\mathcal {E}_{\lambda }(P_{t})\) for the process \(\mathcal {N}_{t}\) satisfies \(\mathcal {E}_{\lambda }(P_{t})\le \mathbf {r}\) for times \(t\le \varsigma \). At each jump time \(r_{n}\le \varsigma \) for the Poisson process \(N^{\prime }\), we then flip an independent coin with weight \(\mathbf {r}^{-1} \mathcal {E}_{\lambda }(P_{r_{n}})\) to determine if a jump for \(\mathcal {N}_{t}\) (i.e. a collision) occurred at time \(r_{n}\). This construction recovers the statistics for \(\mathcal {N}_{t}\). We then define \(r_{n}^{\prime }=r_{n}\wedge \tau _{1}\) for \(n\le N^{\prime }_{\varsigma \wedge \tau _{1}}\). Conditioned on the past \(\mathcal {F}_{r^{\prime }_{n}}\) and the event \(\tau _{1}>r_{n}^{\prime }\), the increment \(r_{n+1}^{\prime }-r_{n}^{\prime }\) is exponentially distributed with mean \((1+\mathbf {r})^{-1}\). When conditioned on the event \(\tau _{1}=r_{n+1}^{\prime }\), the increment \(r_{n+1}^{\prime }-r_{n}^{\prime }\) is exponential with mean \(1\).
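The construction above is the standard thinning (acceptance–rejection) sampler for an inhomogeneous counting process dominated by a constant-rate Poisson clock. A minimal sketch, purely illustrative: the names `rate_bound`, `intensity`, and the particular bounded intensity chosen in the example are stand-ins for \(\mathbf {r}\) and \(\mathcal {E}_{\lambda }(P_{t})\), which are not specified numerically in the text.

```python
import math
import random

def thinned_collision_times(rate_bound, intensity, horizon, rng):
    """Sample jump times on [0, horizon] by thinning a dominating Poisson
    clock of constant rate `rate_bound`: each tick at time t is accepted
    as a collision with probability intensity(t) / rate_bound, which
    requires intensity(t) <= rate_bound for all t."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate_bound)  # waiting time of the dominating clock
        if t > horizon:
            return times
        if rng.random() < intensity(t) / rate_bound:  # independent coin flip
            times.append(t)

# Example with a hypothetical bounded intensity (max 1.5 <= rate bound 2.0).
rng = random.Random(0)
collisions = thinned_collision_times(2.0, lambda t: 1.0 + 0.5 * math.sin(t), 10.0, rng)
```

By construction the accepted times form a subset of the dominating clock's ticks, so they are increasing and confined to the sampling window.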

We can rewrite the expectation \(\mathbb {E}^{(\lambda )}_{s}\big [V(X_{\tau _{1}})\big ]\) in terms of the complementary events \(\tau _{1}>\max _{n}\,r_{n}^{\prime }\) and \(\tau _{1}=r_{n}^{\prime }\) for some \(n\) as follows:
$$\begin{aligned} \mathbb {E}^{(\lambda )}_{s}\big [V(X_{\tau _{1}})\big ]&= \mathbb {E}^{(\lambda )}_{s}\big [V(X_{\tau _{1}})\chi (\tau _{1}>\max _{n}\,r_{n}^{\prime }) \big ]\nonumber \\&+ \mathbb {E}^{(\lambda )}_{s}\Big [\sum _{n=0}^{\infty }\chi (\tau _{1}=r_{n+1}^{\prime })\, \mathbb {E}^{(\lambda )}_{s}\big [V(X_{\tau _{1}})\,\big |\,\mathcal {F}_{r^{\prime }_{n}},\,\tau _{1}=r^{\prime }_{n+1} \big ] \Big ]. \end{aligned}$$
(4.27)
The first term on the right is smaller than \(\sup _{x}V(x)\) times the probability of the event \(\max _{n}r_{n}^{\prime }\ne \tau _{1}\), which can also be phrased as the event that \(\varsigma <\tau _{1}\). By the analysis in Part (1), \(\mathbb {P}_{s}^{(\lambda )}[\varsigma <\tau _{1}]\) is super-polynomially small in \(|p|\gg 1\). Thus \(\sum _{n=0}^{\infty }\mathbb {P}^{(\lambda )}_{s}[\tau _{1}=r_{n+1}^{\prime }]\) is super-polynomially close to \(1\). Since \(r_{n+1}^{\prime }-r_{n}^{\prime }\) is exponentially distributed, by Part (3) of Lemma 4.6 we have that
$$\begin{aligned} \Big |\mathbb {E}^{(\lambda )}_{s}\big [V(X_{\tau _{1}})\,\big |\,\mathcal {F}_{r^{\prime }_{n}},\,\tau _{1}=r^{\prime }_{n+1} \big ]-\int \limits _{\mathbb {T}}dx V(x)\Big |\le (\text {const})|P_{r_{n}^{\prime }}|^{-1}\le 2(\text {const})|p|^{-1}. \end{aligned}$$
The second inequality above follows since \(|P_{r_{n}^{\prime }}|\ge \frac{1}{2}|p|\) by the definition of the times \(r_{n}^{\prime }\), which are less than \(\varsigma \). \(\square \)

4.3 Proof of Proposition 4.1

Proof of Proposition 4.1

Part (1): Recall that \(\sigma _{n}=S_{\tau _{n}}\) and that \( \mathbf {N}_{t}\) is defined as the number of partition times \(\tau _{n}\), \(n\ge 1\), to have occurred up to time \(t\). For \(0\le t< R_{1}\) we can write \(\int _{0}^{t}dr\frac{dV}{dx}(X_{r})\) as
$$\begin{aligned} \int \limits _{0}^{t}dr\frac{dV}{dx}(X_{r})&= \int \limits _{0}^{\tau _{1}} dr\frac{dV}{dx}(X_{r}) - \chi (\zeta _{\mathbf {N}_{t}}=0)\int \limits _{t}^{\tau _{\mathbf {N}_{t}+1}} dr\frac{dV}{dx}(X_{r})\nonumber \\&- \chi (\zeta _{\mathbf {N}_{t}+1}=0)\int \limits _{\tau _{\mathbf {N}_{t}+1}}^{\tau _{\mathbf {N}_{t}+2}} dr\frac{dV}{dx}(X_{r}) +\sum _{n=0}^{ \mathbf {N}_{t}} \mathbf {C}^{(\lambda )}_{0}(\sigma _{n})+\mathbf {m}_{t}+\mathbf {m}_{t}', \end{aligned}$$
(4.28)
where \(\mathbf {m}_{t}\) and \(\mathbf {m}_{t}'\) correspond to odd and even contributions of the form
$$\begin{aligned} \mathbf {m}_{t}&:= \sum _{n=1}^{\lfloor \frac{1}{2} \mathbf {N}_{t}-\frac{1}{2}\rfloor +1}\Big (\chi (\zeta _{2n}=0) \int \limits _{\tau _{2n}}^{\tau _{2n+1}} dr\frac{dV}{dx}(X_{r})- \mathbf {C}^{(\lambda )}_{0}(\sigma _{2n-1}) \Big ), \\ \mathbf {m}_{t}'&:= \sum _{n=0}^{\lfloor \frac{1}{2}\mathbf {N}_{t}\rfloor }\Big (\chi (\zeta _{2n+1}=0) \int \limits _{\tau _{2n+1}}^{\tau _{2n+2}} dr\frac{dV}{dx}(X_{r})- \mathbf {C}^{(\lambda )}_{0}(\sigma _{2n}) \Big ). \end{aligned}$$
The processes \(\mathbf {m}_{t},\mathbf {m}_{t}'\) are not adapted to \( \tilde{\mathcal {F}}_{t}\) since, for instance, \(\mathbf {m}_{t}'\) makes the jump
$$\begin{aligned} \chi (\zeta _{2n+1}=0) \int \limits _{\tau _{2n+1}}^{\tau _{2n+2}} dr\frac{dV}{dx}(X_{r})- \mathbf {C}^{(\lambda )}_{0}(\sigma _{2n}) \end{aligned}$$
at time \(\tau _{2n}\), and the size of the jump depends on \(X_{t}\) up to time \(\tau _{2n+2}\). Let \(\tilde{\mathcal {F}}_{t}''\) be the \(\sigma \)-algebra of all information before time \(\tau _{n+2}\), where \(\tau _{n}\le t<\tau _{n+1}\), plus knowledge of the time \(\tau _{n+2}\). The process \(\mathbf {m}_{t}'\) is a martingale with respect to \(\tilde{\mathcal {F}}_{t}''\). To see this, consider a time \(t<\tau _{2n-1}\); then the following equalities hold:
$$\begin{aligned} \tilde{\mathbb {E}}\Big [\chi (\zeta _{2n+1}=0) \int \limits _{\tau _{2n+1}}^{\tau _{2n+2}} dr\frac{dV}{dx}(X_{r})\,\Big |\,\tilde{\mathcal {F}}_{t}'' \Big ]&= \tilde{\mathbb {E}}\Big [\tilde{\mathbb {E}}\Big [\chi (\zeta _{2n+1}=0) \int \limits _{\tau _{2n+1}}^{\tau _{2n+2}} dr\frac{dV}{dx}(X_{r})\,\Big |\,\tilde{\mathcal {F}}_{\tau _{2n}^{-}}\Big ]\,\Big |\,\tilde{\mathcal {F}}_{t}'' \Big ] \nonumber \\&= \tilde{\mathbb {E}}\Big [\tilde{\mathbb {E}}_{\tilde{\delta }_{\sigma _{2n}}}\Big [\chi (\zeta _{2n+1}=0) \int \limits _{\tau _{2n+1}}^{\tau _{2n+2}} dr\frac{dV}{dx}(X_{r})\Big ]\,\Big |\,\tilde{\mathcal {F}}_{t}'' \Big ] \nonumber \\&= \tilde{\mathbb {E}}\Big [\mathbf {C}^{(\lambda )}_{0}(\sigma _{2n})\,\Big |\,\tilde{\mathcal {F}}_{t}'' \Big ]. \end{aligned}$$
(4.29)
The nested conditional expectation on the first line uses that \(\tilde{\mathcal {F}}_{t}''\subseteq \tilde{\mathcal {F}}_{\tau _{2n}^{-}}\), and the third equality is by the definition of \(\mathbf {C}^{(\lambda )}_{0}\). The second equality applies Part 3 of Proposition 2.3 and the strong Markov property starting from the time \(\tau _{2n}\); recall that \(\tilde{\delta }_{s}\) is the splitting of the \(\delta \)-distribution at \(s\in \Sigma \). The predictable quadratic variation \(\langle \mathbf {m}_{t}' \rangle \) for the martingale \(\mathbf {m}_{t}'\) with respect to \(\tilde{\mathcal {F}}_{t}''\) has the form
$$\begin{aligned} \langle \mathbf {m}' \rangle _{t} =\sum _{n=1}^{\lfloor \frac{1}{2}\mathbf {N}_{t}\rfloor }\mathbf {C}^{(\lambda )}_{1}(\sigma _{2n}). \end{aligned}$$
(4.30)
The analogous statements hold for \(\mathbf {m}_{t}\).
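The defining property of the predictable quadratic variation used here, namely that \(\langle \mathbf {m}'\rangle _{t}\) is a sum of conditional variances of the jumps and that \(\mathbb {E}\big [(\mathbf {m}_{t}')^{2}\big ]=\mathbb {E}\big [\langle \mathbf {m}'\rangle _{t}\big ]\), can be checked exactly in a toy discrete setting. The following sketch uses a fair coin-flip martingale, not the process of the paper:

```python
from itertools import product

def second_moment_vs_qv(k):
    """For the martingale M_k = xi_1 + ... + xi_k with i.i.d. fair signs
    xi_i = +/-1, each jump has conditional variance 1, so <M>_k = k.
    Enumerate all 2**k paths and return E[M_k^2] exactly."""
    total = sum(sum(signs) ** 2 for signs in product((-1, 1), repeat=k))
    return total / 2 ** k
```

The exhaustive enumeration confirms \(\mathbb {E}[M_{k}^{2}]=\langle M\rangle _{k}=k\), the identity exploited for \(\mathbf {m}_{t}'\) in the bounds that follow.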
By the triangle inequality for (4.28),
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sup _{0\le t\le R_{1}}\Big |\int \limits _{0}^{t}dr\frac{dV}{dx}(X_{r}) \Big |^{2m} \Big ]^{\frac{1}{2m}}&\le 6\sup _{x}\big |\frac{dV}{dx}(x)\big |\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\big [\sup _{0\le t< R_{1}}(\tau _{\mathbf {N}_{t}+1}-\tau _{\mathbf {N}_{t}})^{2m} \big ]^{\frac{1}{2m}}\nonumber \\&+\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sup _{0\le t< R_{1}}\Big (\sum _{n=0}^{ \mathbf {N}_{t}}| \mathbf {C}^{(\lambda )}_{0}(\sigma _{n})| \Big )^{2m} \Big ]^{\frac{1}{2m}}\nonumber \\&+ \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\big [\sup _{0\le t< R_{1}}\big |\mathbf {m}_{t}\big |^{2m} \big ]^{\frac{1}{2m}}+\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\big [\sup _{0\le t< R_{1}}\big |\mathbf {m}_{t}'\big |^{2m} \big ]^{\frac{1}{2m}},\nonumber \\ \end{aligned}$$
(4.31)
where we have bounded each of the first three terms on the right side of (4.28) by the supremum of \(|\frac{dV}{dx}(x)|\) multiplied by the longest interval \(\tau _{n+1}-\tau _{n}\) for \(n\le \tilde{n}_{1}\). We use a factor of \(6\) rather than \(3\) because for one of the terms the interval \([\tau _{2n+1},\tau _{2n+2}]\) may have \(\tau _{2n+1}\in [R_{1},R_{2}]\) rather than \(\tau _{2n+1}< R_{1}\); we then double the bound by applying the strong Markov property starting from time \(R_{1}\). We now treat the terms on the right side one by one.
For the first term on the right side of (4.31),
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\big [\sup _{0\le t< R_{1}}(\tau _{\mathbf {N}_{t}+1}-\tau _{\mathbf {N}_{t}})^{2m} \big ]&= \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\big [\sup _{0\le n\le \tilde{n}_{1}}(\tau _{n+1}-\tau _{n})^{2m} \big ]\nonumber \\&\le c^{2m}\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\mathbb {E}\big [\sup _{0\le n\le \tilde{n}_{1}}\mathbf {e}_{n}^{2m}\,\big |\,\tilde{n}_{1} \big ] \Big ]\nonumber \\&\le c'+c'\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\big [\big (\log (1+\tilde{n}_{1}) \big )^{2m} \big ], \end{aligned}$$
(4.32)
where \(\mathbf {e}_{n}\) are i.i.d. mean one exponential random variables independent of everything else. The \(c>0\) in the first inequality is from Lemma 3.5 and replacing the \((\tau _{n+1}-\tau _{n})\)’s with the \(\mathbf {e}_{n}\)’s. The \(c'>0\) for the second inequality exists by an elementary analysis of \(\tilde{\mathbb {E}}^{(\lambda )}\big [\sup _{0\le n\le N}\mathbf {e}_{n}^{2m}\big ] \) for \(m>0\) and independent exponential random variables \(\mathbf {e}_{n}\) with mean one. The value \(\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\big [\big (\log (1+\tilde{n}_{1}) \big )^{2m} \big ]\) is finite by Proposition 3.4 since the fractional moments \(\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\big [\tilde{n}_{1}^{\alpha } \big ]\) are finite for \(0<\alpha <\frac{1}{2}\).
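The elementary analysis mentioned for the second inequality rests on the classical fact that the maximum of \(N\) i.i.d. mean-one exponentials has expectation equal to the harmonic number \(H_{N}=\sum _{k=1}^{N}k^{-1}\approx \log N\), since the spacings of the order statistics are independent exponentials with rates \(N,N-1,\dots ,1\). A quick numerical check, included only as an illustration:

```python
import math

def expected_max_exponentials(n):
    """E[max of n i.i.d. Exp(1) variables] = H_n, the n-th harmonic number:
    the max is a sum of independent exponential spacings with rates
    n, n-1, ..., 1, hence has mean sum_{k=1}^{n} 1/k."""
    return sum(1.0 / k for k in range(1, n + 1))

# H_n = log(n) + gamma + O(1/n), with gamma ~ 0.5772 (Euler-Mascheroni),
# which is the source of the logarithmic growth in the bound above.
gap = expected_max_exponentials(1000) - math.log(1000)
```

The gap between \(H_{n}\) and \(\log n\) stays bounded, consistent with the \(\big (\log (1+\tilde{n}_{1})\big )^{2m}\) term in (4.32).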
For the second term on the right side of (4.31), obviously
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sup _{0\le t< R_{1}}\Big (\sum _{n=0}^{ \mathbf {N}_{t}}| \mathbf {C}^{(\lambda )}_{0}(\sigma _{n})| \Big )^{2m} \Big ] = \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\Big (\sum _{n=0}^{ \tilde{n}_{1}}| \mathbf {C}^{(\lambda )}_{0}(\sigma _{n})| \Big )^{2m} \Big ] \end{aligned}$$
since \(\mathbf {N}_{t}\) increases to \(\tilde{n}_{1}\) as \(t\nearrow R_{1}\) and the summands are nonnegative. By Lemma 4.7, \(g_{\lambda }= \mathbf {C}^{(\lambda )}_{0}\) satisfies \(|g_{\lambda }(x,p)|\le C\max \big (\frac{1}{1+p^{2}}, \lambda \big )\) for \(\lambda <1\). Hence, by Lemma 4.5, the above expectation is bounded uniformly in \(\lambda <1\).
The last two terms on the right side of (4.31) are similar, so we will only treat the last. By Rosenthal’s inequality there is a \(C''\) such that
$$\begin{aligned}&\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\big [\sup _{0\le t< R_{1}}\big |\mathbf {m}_{t}' \big |^{2m} \big ] \\&\le C''\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\big [\langle \mathbf {m}'\rangle _{R_{1}}^{m} \big ] + C''\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sup _{0\le n\le \lfloor \frac{ \tilde{n}_{1}}{2}\rfloor }\Big | \chi (\zeta _{2n+1}=0) \int \limits _{\tau _{2n+1}}^{\tau _{2n+2}} dr\frac{dV}{dx}(X_{r})- \mathbf {C}^{(\lambda )}_{0}(\sigma _{2n}) \Big |^{2m}\Big ] \\&\le C''\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\big [\langle \mathbf {m}'\rangle _{R_{1}}^{m} \big ] + C''\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sum _{n=0}^{\tilde{n}_{1}}\Big | \chi (\zeta _{n+1}=0) \int \limits _{\tau _{n+1}}^{\tau _{n+2}} dr\frac{dV}{dx}(X_{r})- \mathbf {C}^{(\lambda )}_{0}(\sigma _{n}) \Big |^{4m}\Big ]^{\frac{1}{2}} \\&\le C''\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\Big (\sum _{n=0}^{\tilde{n}_{1}+1}\mathbf {C}^{(\lambda )}_{1}(\sigma _{n}) \Big )^{m}\Big ]+C''\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sum _{n=0}^{\tilde{n}_{1}}\mathbf {C}^{(\lambda )}_{2m}(\sigma _{n}) \Big ]^{\frac{1}{2}}. \end{aligned}$$
For the second inequality, we bound the supremum in the second term by the standard device \(\big (\sup _{n}a_{n}\big )^{2}\le \sum _{n}a_{n}^{2}\) together with Jensen's inequality, and we also include the odd-numbered terms. The first term in the third inequality is bounded using the equality (4.30) and by including the odd-numbered terms. The second term in the third inequality is bounded by inserting a nested conditional expectation with respect to \(\tilde{\mathcal {F}}_{\tau _{n}^{-}}\) for the \(n\)th term in the sum and applying the argument in (4.29). To bound both terms on the right side, we can apply Lemmas 4.7 and 4.5 as above.
Part (2): Similarly to Part (1), we begin by writing \(\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r})\) as in (4.28) and using the triangle inequality to get
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\Big [\Big |\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r})\Big |\Big ]&\le \tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\Big [\Big | \int \limits _{0}^{\tau _{1}} dr\frac{dV}{dx}(X_{r}) \Big |\Big ]+\tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\Big [\Big | \int \limits _{t}^{\tau _{\tilde{n}_{1}+1}} dr\frac{dV}{dx}(X_{r}) \Big |\Big ]\nonumber \\&+\tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\Big [\Big | \int \limits _{\tau _{\tilde{n}_{1}+1}}^{\tau _{\tilde{n}_{1}+2}} dr\frac{dV}{dx}(X_{r}) \Big |\Big ]+\tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\Big [\sum _{n=0}^{ \tilde{n}_{1}}\big |\mathbf {C}^{(\lambda )}_{0}(\sigma _{n}) \big | \Big ]\nonumber \\&+ \tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\big [\big | \mathbf {m}_{R_{1}}\big | ^{2} \big ]^{\frac{1}{2}} + \tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\big [\big | \mathbf {m}_{R_{1}}'\big |^{2} \big ]^{\frac{1}{2}}, \end{aligned}$$
(4.33)
where we have also applied Jensen’s inequality to the last two terms. The first three terms on the right side and \(\tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\big [|\mathbf {C}^{(\lambda )}_{0}(\sigma _{0})| \big ]=|\mathbf {C}^{(\lambda )}_{0}(s)|\) (the \(n=0\) summand from the fourth term on the right) are bounded by \(c\sup _{x}|\frac{dV}{dx}(x)|\), where \(c>0\) is from Lemma 3.5. This follows for the first term, for instance, since
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\Big [\Big | \int \limits _{0}^{\tau _{1}} dr\frac{dV}{dx}(X_{r}) \Big |\Big ]\le \sup _{x}\big |\frac{dV}{dx}(x)\big | \tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\big [\tau _{1} \big ] \le c \sup _{x}\big |\frac{dV}{dx}(x)\big |, \end{aligned}$$
where the second inequality is by Lemma 3.5.
Since \(\mathbf {m}_{t}'\) is a martingale, we have the first equality below
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\big [\big | \mathbf {m}_{R_{1}}' \big | ^{2} \big ]= \tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\big [\langle \mathbf {m}'\rangle _{R_{1}} \big ] = \tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\Big [\sum _{n=1}^{\lfloor \frac{1}{2} \tilde{n}_{1} \rfloor }\mathbf {C}^{(\lambda )}_{1}(\sigma _{2n}) \Big ]\le \tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\Big [\sum _{n=1}^{ \tilde{n}_{1}}\mathbf {C}^{(\lambda )}_{1}(\sigma _{n}) \Big ]. \end{aligned}$$
A similar calculation holds for the term \(\tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\big [\big | \mathbf {m}_{R_{1}}\big | ^{2} \big ]\). With the above remarks,
$$\begin{aligned} \tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\Big [\Big |\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r})\Big |\Big ]&\le 4c\sup _{x}\big |\frac{dV}{dx}(x)\big | +\tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\Big [\sum _{n=1}^{\tilde{n}_{1}}\big | \mathbf {C}^{(\lambda )}_{0}(\sigma _{n})\big | \Big ]\\&+\, 2\tilde{\mathbb {E}}_{\tilde{s}}^{(\lambda )}\Big [\sum _{n=1}^{\tilde{n}_{1}}\mathbf {C}^{(\lambda )}_{1}(\sigma _{n}) \Big ]^{\frac{1}{2}}\\&\le 4c \sup _{x}\big |\frac{dV}{dx}(x)\big | +c\tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\sum _{n=1}^{\tilde{n}_{1}}\big | \mathbf {C}^{(\lambda )}_{0}(\sigma _{n})\big | \Big ]\\&+\, 2c^{\frac{1}{2}} \tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\sum _{n=1}^{\tilde{n}_{1}}\mathbf {C}^{(\lambda )}_{1}(\sigma _{n}) \Big ]^{\frac{1}{2}} \\&\le 4c \sup _{x}\big |\frac{dV}{dx}(x)\big | +c\big (U^{(\lambda )}\mathbf {C}^{(\lambda )}_{0}\big )(s)+c\sup _{s\in {{\mathrm{supp}}}(h)} \big (U^{(\lambda )}\mathbf {C}^{(\lambda )}_{0}\big )(s)\\&+\, 2c^{\frac{1}{2}}\Big (\big (U^{(\lambda )}\mathbf {C}^{(\lambda )}_{1}\big )(s)+\sup _{s\in {{\mathrm{supp}}}(h)} \big (U^{(\lambda )}\mathbf {C}^{(\lambda )}_{1}\big )(s)\Big )^{\frac{1}{2}}, \end{aligned}$$
where the map \(U^{(\lambda )} :L^{\infty }(\Sigma )\rightarrow L^{\infty }(\Sigma )\) was defined in (4.1). The second inequality uses that \(\tilde{\mathbb {E}}_{\tilde{\delta }_{s}}\) is defined to be \((1-h(s))\tilde{\mathbb {E}}_{(s,0)} +h(s)\tilde{\mathbb {E}}_{(s,1)} \). The third inequality follows by Lemma 4.4. Applying Part (1) from Lemma 4.7 for \(g_{\lambda }(s)= \mathbf {C}^{(\lambda )}_{0}(s)\) and \(g_{\lambda }(s)= \mathbf {C}^{(\lambda )}_{1}(s)\) in combination with Lemma 4.2, we obtain a logarithmic bound in \(|p|\) for the right side. \(\square \)

5 Bounds for the Cumulative Potential Forcing

In this section we prove Theorem 1.2.

5.1 The Martingale Approximating the Potential Drift Process

As discussed in Sect. 1.2.3, we will approximate the potential drift \(D_{t}\) by the process
$$\begin{aligned} \tilde{M}_{t}:= \sum _{n=1}^{ \tilde{N}_{t}}\Big (\int \limits _{R_{n}}^{R_{n+1}}dr\frac{dV}{dx}(X_{r})-\big (\mathfrak {R}^{(\lambda )}\frac{dV}{dx}\big )(S_{R_{n}}) +\big (\mathfrak {R}^{(\lambda )}\frac{dV}{dx}\big )(S_{R_{n+1}}) \Big ). \end{aligned}$$
Part (1) of Proposition 4.1 and the strong Markov property imply that the increments of \(\tilde{M}_{t}\) have finite moments. The lemma below states that \(\tilde{M}_{t}\) is a martingale and gives a closed form for its predictable quadratic variation.

Lemma 5.1

The process \(\tilde{M}_{t}\) is a martingale with respect to the filtration \(\tilde{\mathcal {F}}_{t}'\). Moreover, the predictable quadratic variation \(\langle \tilde{M}\rangle _{t}\) has the form
$$\begin{aligned} \langle \tilde{M}\rangle _{t}= \sum _{n=1}^{\tilde{N}_{t}}\check{\upsilon }_{\lambda }\big (S_{R_{n}} \big ), \end{aligned}$$
where \(\check{\upsilon }_{\lambda }:\Sigma \rightarrow {\mathbb R}^{+}\) is defined as
$$\begin{aligned} \check{\upsilon }_{\lambda }\big (s \big )&= 2\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\delta }_{s}}\Big [\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r})\big (\mathfrak {R}^{(\lambda )}\frac{dV}{dx}\big )(S_{r})\Big ]\\&+\int \limits _{\Sigma }d\nu (s')\Big (\big (\mathfrak {R}^{(\lambda )}\frac{dV}{dx}\big )(s')\Big )^{2} -\Big (\big (\mathfrak {R}^{(\lambda )}\frac{dV}{dx}\big )(s) \Big )^{2}. \end{aligned}$$
In the above, \(\tilde{\delta }_{s}\) is the splitting of the \(\delta \)-distribution at \(s\in \Sigma \).

Proof

Recall that for a partition time \(\mathbf {t}\), \(\tilde{\mathcal {F}}_{\mathbf {t}^{-}}\) refers to the \(\sigma \)-algebra containing all information before time \(\mathbf {t}\) along with the additional information that \(\mathbf {t}\) is a partition time. The jump times \(R_{n}'\) for the process \(\tilde{M}_{t}\) are predictable with respect to the filtration \(\tilde{\mathcal {F}}_{t}' \); nevertheless, we will show that the values of the jumps have mean zero with respect to the information known before the time of the jump. We can rewrite the martingale as
$$\begin{aligned} \tilde{M}_{t}&= \sum _{n=1}^{ \tilde{N}_{t}}\int \limits _{R_{n}}^{R_{n+1}}dr\frac{dV}{dx}(X_{r})-\tilde{\mathbb {E}}^{(\lambda )}\Big [\int \limits _{R_{n}}^{R_{n+1}}dr\frac{dV}{dx}(X_{r}) \,\Big |\,\tilde{\mathcal {F}}_{R_{n}^{-}}\Big ]\\&+\,\tilde{\mathbb {E}}^{(\lambda )}\Big [\int \limits _{R_{n+1}}^{R_{n+2}}dr\frac{dV}{dx}(X_{r}) \,\Big |\,\tilde{\mathcal {F}}_{R_{n+1}^{-}}\Big ] \end{aligned}$$
since, by applying Part (3) of Proposition 2.3 and the strong Markov property at time \(R_{n}\), we have the first equality below
$$\begin{aligned} \tilde{\mathbb {E}}^{(\lambda )}\Big [\int \limits _{R_{n}}^{R_{n+1}}dr\frac{dV}{dx}(X_{r})\,\Big |\,\tilde{\mathcal {F}}_{R_{n}^{-}} \Big ]=\tilde{\mathbb {E}}_{ \tilde{\delta }_{S_{R_{n}}}}^{(\lambda )}\Big [\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r}) \Big ]=\big (\mathfrak {R}^{(\lambda )}\frac{dV}{dx}\big )(S_{R_{n}})+c, \end{aligned}$$
(5.1)
and the second equality in (5.1) is for some \(c\in {\mathbb R}\) by Part (3) of Proposition 2.4.
For fixed \(t\in {\mathbb R}^{+}\) knowledge of whether or not the event \(t<R_{n}'\) occurred will be contained in the \(\sigma \)-algebra \(\tilde{\mathcal {F}}_{t}' \) for each \(n\in \mathbb {N}\). The jumps of \(\tilde{M}_{t}\) have mean zero since for any \(n\in \mathbb {N}\) such that \(t<R_{n}'\) the conditional expectation of the \(n\)th jump given \(\tilde{\mathcal {F}}_{t}'\) is
$$\begin{aligned} \tilde{\mathbb {E}}^{(\lambda )}\Bigg [\int \limits _{R_{n}}^{R_{n+1}}dr\frac{dV}{dx}(X_{r})&- \tilde{\mathbb {E}}^{(\lambda )}\Big [\int \limits _{R_{n}}^{R_{n+1}}dr\frac{dV}{dx}(X_{r}) \,\Big |\,\tilde{\mathcal {F}}_{R_{n}^{-}}\Big ]\\&+\,\tilde{\mathbb {E}}^{(\lambda )}\Big [\int \limits _{R_{n+1}}^{R_{n+2}}dr\frac{dV}{dx}(X_{r}) \,\Big |\,\tilde{\mathcal {F}}_{R_{n+1}^{-}} \Big ]\,\Bigg |\,\tilde{\mathcal {F}}_{t}' \Bigg ]\\&= \tilde{\mathbb {E}}_{ \tilde{\nu }}^{(\lambda )}\Big [\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r}) \Big ]=0. \end{aligned}$$
To get the first equality above, we can insert a nested conditional expectation with respect to \(\tilde{\mathcal {F}}_{R_{n}^{-}}\) since \(\tilde{\mathcal {F}}_{t}'\subset \tilde{\mathcal {F}}_{R_{n}^{-}}\), and the first two terms in the expectation then cancel. For the third term in the first equality, we use the strong Markov property at the time \(R_{n+1}\) and that \(\tilde{S}_{R_{n+1}}\) has distribution \(\tilde{\nu }\) when conditioned on \(\tilde{\mathcal {F}}_{R_{n}}\) (and thus when conditioned on \(\tilde{\mathcal {F}}_{t}'\)) by Part (1) of Proposition 2.1. The second equality is by Part (2) of Proposition 2.4 with \(g(x,p)=\frac{dV}{dx}(x)\) since \(\Psi _{\infty ,\lambda }(\frac{dV}{dx})=0\). Thus \(\tilde{M}_{t}\) is a martingale.
For \( t\in [R_{n-1}',R_{n}') \) the \(\sigma \)-algebra \(\tilde{\mathcal {F}}_{t}'\) contains all information before time \(R_{n}\), i.e., \(\tilde{\mathcal {F}}_{t}'=\tilde{\mathcal {F}}_{R_{n}^{-}}\), along with knowledge of the value \(R_{n}\). The predictable quadratic variation \(\langle \tilde{M}\rangle _{t}\) must have the form of a discrete sum over \(\sum _{n=1}^{\tilde{N}_{t}}\) because the jump times \(R_{n}'\) are predictable. As a consequence of Part (3) of Proposition 2.3, the conditional distribution for \(\tilde{S}_{R_{n}}\) given \(\tilde{\mathcal {F}}_{t}'=\tilde{\mathcal {F}}_{R_{n}^{-}}\) is \(\tilde{\delta }_{S_{R_{n}}} \). The variance of the \(n\)th jump for \(\tilde{M}_{t}\) conditioned on \(\tilde{\mathcal {F}}_{t}'\) is \(\check{\upsilon }_{\lambda }(S_{R_{n}}) \), where
$$\begin{aligned} \check{\upsilon }_{\lambda }(s)&= \tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\Big (\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r})-\tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r}) \Big ] +\tilde{\mathbb {E}}_{ \tilde{\delta }_{S_{R_{1}}}}^{(\lambda )}\Big [\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r}) \Big ]\Big )^{2}\Big ]\\&= 2\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\delta }_{s}}\Big [\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r})\int \limits _{r}^{R_{2}}dr'\frac{dV}{dx}(X_{r'})\Big ]+\int \limits _{\Sigma }d\nu (s')\Big (\tilde{\mathbb {E}}_{\tilde{\delta }_{s'}}^{(\lambda )}\Big [\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r}) \Big ]\Big )^{2}\\&-\Big (\tilde{\mathbb {E}}_{\tilde{\delta }_{s}}^{(\lambda )}\Big [\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r}) \Big ]\Big )^{2}. \end{aligned}$$
This expression for \(\check{\upsilon }_{\lambda }(s)\) can be written in terms of \( \big (\mathfrak {R}^{(\lambda )}\,\frac{dV}{dx}\big )(s)\), as in the statement of the lemma, by using that \( \big (\mathfrak {R}^{(\lambda )}\,\frac{dV}{dx}\big )(s)=\tilde{\mathbb {E}}_{ \tilde{\delta }_{s}}^{(\lambda )}\big [\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r}) \big ]-c\) and the same reasoning as in the proof of Part (4) of Proposition 2.4. \(\square \)

Define \(\upsilon _{\lambda }:=\int _{\Sigma }d\nu (s) \check{\upsilon }_{\lambda }(s)\).

Lemma 5.2

The value \(\upsilon _{\lambda }\in {\mathbb R}^{+}\) is uniformly bounded for \(\lambda <1\), and \(\upsilon _{\lambda }\) depends continuously on the parameter \(\lambda \).

Proof

By the proof of Proposition 2.4, the value \(\upsilon _{\lambda }\in {\mathbb R}^{+}\) can be written as
$$\begin{aligned} \upsilon _{\lambda }&= \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\int \limits _{0}^{R_{1}}dr \frac{dV}{dx}(X_{r}) \int \limits _{r}^{R_{2}}dr'\frac{dV}{dx}(X_{r'}) \Big ] \nonumber \\&= \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\int \limits _{0}^{R_{1}}dr \frac{dV}{dx}(X_{r}) \int \limits _{r}^{R_{1}}dr'\frac{dV}{dx}(X_{r'}) \Big ]+ \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\int \limits _{0}^{R_{1}}dr \frac{dV}{dx}(X_{r}) \int \limits _{R_{1}}^{R_{2}}dr'\frac{dV}{dx}(X_{r'}) \Big ] \nonumber \\&\le \frac{3}{2}\tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\sup _{0\le t \le R_{1}}\Big (\int \limits _{0}^{t}dr \frac{dV}{dx}(X_{r}) \Big )^{2} \Big ] +\frac{1}{2} \tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\Big (\int \limits _{R_{1}}^{R_{2}}dr'\frac{dV}{dx}(X_{r'}) \Big )^{2} \Big ] \nonumber \\&= 2\tilde{\mathbb {E}}_{\tilde{\nu }}^{ (\lambda )} \Big [\sup _{0\le t \le R_{1}}\Big (\int \limits _{0}^{t}dr \frac{dV}{dx}(X_{r}) \Big )^{2} \Big ] , \end{aligned}$$
(5.2)
where the inequality applies the relation \(2ab\le a^{2}+b^{2}\) for \(a= \int _{0}^{R_{1}}dr \frac{dV}{dx}(X_{r}) \) and \(b=\int _{R_{1}}^{R_{2}}dr'\frac{dV}{dx}(X_{r'})\), and the third equality uses the strong Markov property and the fact that \(\tilde{S}_{R_{1}}\) has distribution \(\tilde{\nu }\) by Part (1) of Proposition 2.1. The right side of (5.2) is uniformly bounded for \(\lambda <1\) by Part (1) of Proposition 4.1, and hence the values \(\upsilon _{\lambda }\) are uniformly bounded for small \(\lambda \). By the same reasoning as above, the second moment of the random variable \(Y^{(\lambda )}:=\int _{0}^{R_{1}}dr \frac{dV}{dx}(X_{r}) \int _{r}^{R_{2}}dr'\frac{dV}{dx}(X_{r'})|_{\tilde{\mathbb {P}}_{\tilde{\nu }}^{ (\lambda )}}\) is uniformly bounded for the parameter values \(\lambda <1\). To clarify, “\(Y^{(\lambda )}\)” refers to the stated random variable with respect to the parameter value \(\lambda \in {\mathbb R}^{+}\) for the underlying statistics. With the uniform bound on the second moment, it is sufficient that the distribution for the random variable \(Y^{(\lambda )}\) is weakly continuous as a function of \(\lambda \in {\mathbb R}^{+}\) to guarantee that \(\upsilon _{\lambda }\) is continuous. If the random times \(R_{1},R_{2}\) had deterministic upper bounds, then the weak convergence would be clear. By introducing a time cut-off \(t>0\), the random variable \(Y_{t}^{(\lambda )}:=\int _{0}^{R_{1}\wedge t}dr \frac{dV}{dx}(X_{r}) \int _{r}^{R_{2}\wedge t}dr'\frac{dV}{dx}(X_{r'})|_{\tilde{\mathbb {P}}_{\tilde{\nu }}^{ (\lambda )}}\) is weakly continuous as a function of \(\lambda \). The random variables \(Y_{t}^{(\lambda )}\) converge in probability uniformly to \(Y^{(\lambda )}\) as \(t\rightarrow \infty \) for \(\lambda <1\) since \(R_{2}\) has uniform fractional moments \(\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\nu }}[R_{2}^{\frac{1}{4}}]\) for \(\lambda <1\) by Lemma 3.4. 
For \(\lambda _{n}\rightarrow \lambda <1\), the convergences involved are characterized by the diagram
$$\begin{aligned} \begin{array}{ccc} Y_{t}^{(\lambda _{n})} &{} \xrightarrow {\;n\rightarrow \infty \;} &{} Y_{t}^{(\lambda )} \\ \downarrow &{} &{} \downarrow \\ Y^{(\lambda _{n})} &{} \xrightarrow {\;n\rightarrow \infty \;} &{} Y^{(\lambda )} \end{array} \end{aligned}$$
where the horizontal arrows denote weak convergence and the downward arrows signify convergence in probability as \(t\rightarrow \infty \), uniformly in \(\lambda \). The above convergences imply that \(Y^{(\lambda _{n})} \) converges weakly to \( Y^{(\lambda )}\) as \(n\rightarrow \infty \). \(\square \)

The following lemma relates the predictable quadratic variation of \( \tilde{M}_{t}\) to the counting process \(\tilde{N}_{t}\) and is somewhat stronger than we require.

Lemma 5.3

As \(\lambda \searrow 0\) the following order equality holds:
$$\begin{aligned} \tilde{\mathbb {E}}^{(\lambda )}\Big [\sup _{0\le t\le T}\Big |\lambda ^{\frac{1}{2}} \langle \tilde{M}\rangle _{\frac{t}{\lambda }} -\lambda ^{\frac{1}{2}}\upsilon _{\lambda } \tilde{N}_{\frac{t}{\lambda }}\Big | \Big ]= O (\lambda ^{\frac{1}{4}}). \end{aligned}$$
Also, for any \(t\in {\mathbb R}^{+}\) the following expectations are equal: \(\tilde{\mathbb {E}}^{(\lambda )}\big [\langle \tilde{M}\rangle _{t} \big ] =\upsilon _{\lambda }\tilde{\mathbb {E}}^{(\lambda )}\big [\tilde{N}_{t}\big ]\).

Proof

We only prove the equality \(\tilde{\mathbb {E}}^{(\lambda )}\big [\langle \tilde{M}\rangle _{t} \big ] =\upsilon _{\lambda }\tilde{\mathbb {E}}^{(\lambda )}\big [\tilde{N}_{t}\big ]\) here, which is what we use in the proof of Theorem 1.2. The remainder of the proof is placed in Sect. 7.3. Lemma 5.1 yields the first equality below:
$$\begin{aligned} \tilde{\mathbb {E}}^{(\lambda )}\big [\langle \tilde{M}\rangle _{t}\big ]=\tilde{\mathbb {E}}^{(\lambda )}\Big [\sum _{n=1}^{\tilde{N}_{t}} \check{\upsilon }_{\lambda }(S_{R_{n}}) \Big ]=\tilde{\mathbb {E}}^{(\lambda )}\Big [\sum _{n=1}^{\tilde{N}_{t}} \tilde{\mathbb {E}}^{(\lambda )}\big [\check{\upsilon }_{\lambda }(S_{R_{n}})\,\big |\,\tilde{\mathcal {F}}_{R_{n}'} \big ]\Big ]=\upsilon _{\lambda }\tilde{\mathbb {E}}^{(\lambda )}\big [\tilde{N}_{t}\big ]. \end{aligned}$$
The third equality holds since \(S_{R_{n}}\) has distribution \(\nu \) when conditioned on \(\tilde{\mathcal {F}}_{R_{n}'}\) by Part (1) of Proposition 2.1. \(\square \)
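The last step is a Wald-type identity: when each summand, conditioned on the information available just before it, has the same mean, the expectation of the random sum factorizes into that mean times the expected number of summands. A minimal Monte Carlo illustration under simple stand-in distributions (uniform summands, a count independent of them; none of this data comes from the paper):

```python
import random

def mean_random_sum(num_trials, rng):
    """Estimate E[ sum_{n=1}^{N} X_n ] for N uniform on {0, ..., 10},
    independent of i.i.d. X_n ~ Uniform(0, 1). Wald's identity predicts
    E[N] * E[X] = 5 * 0.5 = 2.5."""
    total = 0.0
    for _ in range(num_trials):
        count = rng.randint(0, 10)  # the random number of summands
        total += sum(rng.random() for _ in range(count))
    return total / num_trials

rng = random.Random(0)
estimate = mean_random_sum(20000, rng)  # should be close to 2.5
```

Here the independence of the count makes the conditional means constant, mirroring the role of \(S_{R_{n}}\) having the fixed distribution \(\nu \) given \(\tilde{\mathcal {F}}_{R_{n}'}\).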

5.2 Proof of Theorem 1.2

Proof of Theorem 1.2

We will work with the split dynamics to show that
$$\begin{aligned} \mathbb {E}^{(\lambda )}\Big [\sup _{t\in [0,\frac{T}{\lambda }]}|D_{t}|\Big ]=\tilde{\mathbb {E}}^{(\lambda )}\Big [\sup _{t\in [0,\frac{T}{\lambda }]}|D_{t}|\Big ]= O (\lambda ^{-\frac{1}{4}}). \end{aligned}$$
Let \(\tilde{M}_{t}\) be the martingale from Lemma 5.1. We can write \(D_{t}\) as
$$\begin{aligned} D_{t}&= \int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r})-\int \limits _{t}^{R_{\tilde{N}_{t}+1}}dr\frac{dV}{dx}(X_{r}) \nonumber \\&+\,\tilde{\mathbb {E}}_{ \tilde{S}_{R_{1}}}\Big [\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r}) \Big ] -\tilde{\mathbb {E}}_{ \tilde{S}_{R_{\tilde{N}_{t}+1}}}\Big [\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r}) \Big ]+ \tilde{M}_{t} , \end{aligned}$$
(5.3)
where we have used Part (3) of Proposition 2.4. The triangle inequality gives
$$\begin{aligned} \mathbb {E}^{(\lambda )}\Big [\sup _{t\in [0,\frac{T}{\lambda }]}|D_{t}|\Big ]&\le 2\tilde{\mathbb {E}}^{(\lambda )}\Big [\Big |\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r})\big |\Big ]\nonumber \\&+\,3\tilde{\mathbb {E}}^{(\lambda )}\Big [\sup _{R_{1}\le t\le \frac{T}{\lambda }}\Big |\int \limits _{t}^{R_{\tilde{N}_{t}+1}}dr\frac{dV}{dx}(X_{r})\Big | \Big ] +\mathbb {E}^{(\lambda )}\Big [\sup _{t\in [0,\frac{T}{\lambda }]}|\tilde{M}_{t}|\Big ].\qquad \end{aligned}$$
(5.4)
For the first term on the right side of (5.4), we can apply Part (2) of Proposition 4.1 to get
$$\begin{aligned} \tilde{\mathbb {E}}^{(\lambda )}\Big [\Big |\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r})\Big |\Big ]&\le \int d\tilde{\mu }(x,p,z) \tilde{\mathbb {E}}_{(x,p,z)}^{(\lambda )}\Big [\Big |\int \limits _{0}^{R_{1}}dr\frac{dV}{dx}(X_{r})\Big |\Big ]\nonumber \\&\le C\int \limits _{\Sigma }d\mu (x,p)\big (1+\log (1+|p|)\big )< C\int \limits _{\Sigma }d\mu (x,p)\big (1+|p|\big ),\nonumber \\ \end{aligned}$$
(5.5)
where \(\mu \) is the initial measure on \(\Sigma \) and \(\tilde{\mu }\) is its splitting. By our assumption on \(\mu \), the first moment is finite, and hence (5.5) is finite.
The second term on the right side of (5.4) is less simple since it depends on \(t\). Writing
$$\begin{aligned} \int \limits _{t}^{R_{\tilde{N}_{t}+1}}dr\frac{dV}{dx}(X_{r})=\int \limits _{R_{\tilde{N}_{t}}}^{R_{\tilde{N}_{t} +1}}dr\frac{dV}{dx}(X_{r}) -\int \limits _{R_{\tilde{N}_{t}}}^{t}dr\frac{dV}{dx}(X_{r}), \end{aligned}$$
then with the triangle inequality
$$\begin{aligned} \tilde{\mathbb {E}}^{(\lambda )}\Big [\sup _{R_{1}\le t\le \frac{T}{\lambda }}\Big |\int \limits _{t}^{R_{\tilde{N}_{t}+1}}dr\frac{dV}{dx}(X_{r})\Big | \Big ]&\le \tilde{\mathbb {E}}^{(\lambda )}\Big [\sup _{R_{1}\le t\le \frac{T}{\lambda }}\Big (\Big |\int \limits _{R_{\tilde{N}_{t}}}^{t}dr\frac{dV}{dx}(X_{r})\Big |+\Big |\int \limits _{R_{\tilde{N}_{t}}}^{R_{\tilde{N}_{t}+1}}dr\frac{dV}{dx}(X_{r})\Big | \Big )\Big ] \nonumber \\&\le 2 \tilde{\mathbb {E}}^{(\lambda )}\Big [\sup _{1\le m\le \tilde{N}_{\frac{T}{\lambda }}} \sup _{t\in [R_{m},R_{m+1}]} \Big |\int \limits _{R_{m}}^{t}dr\frac{dV}{dx}(X_{r})\Big | \Big ]. \end{aligned}$$
(5.6)
We can get rid of the first supremum with the inequality
$$\begin{aligned}&\tilde{\mathbb {E}}^{(\lambda )}\Big [\sup _{1\le m\le \tilde{N}_{\frac{T}{\lambda }}} \sup _{t\in [R_{m},R_{m+1}]} \Big |\int \limits _{R_{m}}^{t}dr\frac{dV}{dx}(X_{r})\Big | \Big ]\nonumber \\&\quad \le \tilde{\mathbb {E}}^{(\lambda )}\Big [\sum _{m=1}^{\tilde{N}_{\frac{T}{\lambda }}} \sup _{t\in [R_{m},R_{m+1}]} \Big |\int \limits _{R_{m}}^{t}dr\frac{dV}{dx}(X_{r})\Big |^{2} \Big ]^{\frac{1}{2}}\nonumber \\&\quad = \tilde{\mathbb {E}}^{(\lambda )}\Big [\sum _{m=1}^{\tilde{N}_{\frac{T}{\lambda }}}\tilde{\mathbb {E}}^{(\lambda )}\Big [\sup _{t\in [R_{m},R_{m+1}]} \Big |\int \limits _{R_{m}}^{t}dr\frac{dV}{dx}(X_{r})\Big |^{2}\,\Big |\,\tilde{\mathcal {F}}_{R_{m}'} \Big ] \Big ]^{\frac{1}{2}}\nonumber \\&\quad = \tilde{\mathbb {E}}^{(\lambda )}[\tilde{N}_{\frac{T}{\lambda }}]^{\frac{1}{2}} \tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\Big [\sup _{t\in [0,R_{1}]} \Big |\int \limits _{0}^{t}dr\frac{dV}{dx}(X_{r})\Big |^{2} \Big ]^{\frac{1}{2}} = O (\lambda ^{-\frac{1}{4}}). \end{aligned}$$
In the second equality, we have applied the strong Markov property at the times \(R_{m}'\) and have used that \(\tilde{S}_{R_{m}}\) is distributed as \(\tilde{\nu }\) given \(\tilde{\mathcal {F}}_{R_{m}'}\) by Part (1) of Proposition 2.1. By Part (1) of Proposition 4.1, the expectation \(\tilde{\mathbb {E}}_{\tilde{\nu }}^{(\lambda )}\) above is bounded uniformly for \(\lambda <1\). The expectation \(\tilde{\mathbb {E}}^{(\lambda )}[\tilde{N}_{\frac{T}{\lambda }}]\) is \( O (\lambda ^{-\frac{1}{2}})\) by Lemma 3.3.
Now for the last term in (5.4). Applying Jensen’s and then Doob’s maximal inequality, we get the first two relations below:
$$\begin{aligned} \tilde{\mathbb {E}}^{(\lambda )}\Big [\sup _{0\le t\le \frac{T}{\lambda }}\big |\tilde{M}_{t}\big | \Big ]&\le \tilde{\mathbb {E}}^{(\lambda )}\Big [\sup _{0\le t\le \frac{T}{\lambda }}\big |\tilde{M}_{t}\big |^{2} \Big ]^{\frac{1}{2}}\le 2\tilde{\mathbb {E}}^{(\lambda )}\big [\big |\tilde{M}_{\frac{T}{\lambda }}\big |^{2} \big ]^{\frac{1}{2}}\nonumber \\&= 2\tilde{\mathbb {E}}^{(\lambda )}\big [\langle \tilde{M}\rangle _{\frac{T}{\lambda }}\big ]^{\frac{1}{2}} =2\upsilon _{\lambda }^{\frac{1}{2}}\tilde{\mathbb {E}}^{(\lambda )}\big [\tilde{N}_{\frac{T}{\lambda }} \big ]^{\frac{1}{2}}= O (\lambda ^{-\frac{1}{4}}). \end{aligned}$$
(5.7)
The second equality is by Lemma 5.3, and we again apply Lemma 3.3 to get \(\tilde{\mathbb {E}}^{(\lambda )}\big [\tilde{N}_{\frac{T}{\lambda }} \big ]= O (\lambda ^{-\frac{1}{2}})\). Also, \(\upsilon _\lambda \) is uniformly bounded for \(\lambda <1\) by Lemma 5.2. \(\square \)

6 Convergence to the Ornstein–Uhlenbeck Process

In this section we prove Theorem 1.3. As a preliminary, Sect. 6.1 characterizes the martingale and drift components in the semi-martingale decomposition of the jump process \(J_{t}\) defined in (1.3). The main ingredient for the study of \(J_{t}\) is the bound in Lemma 3.2 on the typical energy of the particle over the time interval \([0,\frac{T}{\lambda }]\) for small \(\lambda \).

6.1 Limiting Behavior for the Jump Process

The jump process \(J_{t}\) can be written as the sum of a martingale \(M_{t}\) and a predictable part:
$$\begin{aligned} J_{t}=M_{t}+\int \limits _{0}^{t}dr\mathcal {D}_{\lambda }(P_{r}) \quad \text {for}\quad \mathcal {D}_{\lambda }(p)=\int \limits _{{\mathbb R}}dp^{\prime }(p^{\prime }-p) {\mathcal {J}}_{\lambda }(p,p^{\prime }). \end{aligned}$$
In order to write an expression for the predictable quadratic variation \(\langle M\rangle _{t}\) in terms of the jump rates \(\mathcal {J}_{\lambda }\), let us define for \(m\in {\mathbb N}\)
$$\begin{aligned} \Pi _{\lambda ,m}(p)=\int \limits _{{\mathbb R}}dp^{\prime }(p^{\prime }-p)^{m} {\mathcal {J}}_{\lambda }(p,p^{\prime }). \end{aligned}$$
(6.1)
Note that \(\mathcal {E}_{\lambda }(p):=\Pi _{\lambda ,0}(p)\) is the escape rate for the jump process when \(P_{t}=p\), and \(\mathcal {D}_{\lambda }=\Pi _{\lambda ,1}\). We also define \(\mathcal {Q}_{\lambda }(p)=\Pi _{\lambda ,2}(p)\). The predictable quadratic variation for \(M_{t}\) can then be written in the closed form
$$\begin{aligned} \langle M\rangle _{t}= \int \limits _{0}^{t}dr\mathcal {Q}_\lambda (P_{r}). \end{aligned}$$
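For orientation (a standard compensator computation for pure-jump Markov processes, recorded here since it motivates the definitions above), the functions \(\Pi _{\lambda ,m}\) are the conditional infinitesimal moments of the jump process:
$$\begin{aligned} \mathbb {E}^{(\lambda )}\big [(J_{t+\varepsilon }-J_{t})^{m}\,\big |\,P_{t}=p\big ]=\varepsilon \Pi _{\lambda ,m}(p)+o(\varepsilon ),\qquad m\ge 1. \end{aligned}$$
In particular, compensating \(J_{t}\) by its \(m=1\) moment yields the martingale \(M_{t}\), and compensating \(M_{t}^{2}\) by the \(m=2\) moment yields the closed form for \(\langle M\rangle _{t}\) above.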
The proposition below collects some facts regarding the functions \(\Pi _{\lambda ,m}(p)\).

Proposition 6.1

There are constants \(C, C_m >0\) such that for all \(\lambda \le 1\) and all \(p\in {\mathbb R}\),
  1. \(\frac{1}{8(\lambda +1)} \le \mathcal {E}_\lambda (p)\le \frac{1}{8(\lambda +1)}\left( 1+C\lambda |p|\right) \) and \(\lambda |p| \le C\mathcal {E}_\lambda (p),\)

  2. \( \big |\mathcal {D}_{\lambda }(p) +\frac{\lambda p}{2}\big |\le C\lambda ^2 (|p|+p^{2})\),

  3. \( \left| \mathcal {Q}_{\lambda }(p)-1 \right| \le C\lambda +C\lambda |p|+C\lambda ^{3}|p|^3\),

  4. \( \Pi _{\lambda ,2m}(p)\le C_m(1+\lambda |p|)^{2m+1} \).

Next we show that the drift rate \(\mathcal {D}_{\lambda }(p)\) can be effectively replaced by the linear form \(-\frac{\lambda p}{2}\). Let \(P_{t}^{\prime }\) be the solution to the integral equation
$$\begin{aligned} P_{t}^{\prime }=P_{0}-\int \limits _{0}^{t}dr\frac{dV}{dx}(X_{r})-\frac{1}{2}\lambda \int \limits _{0}^{t}drP_{r}^{\prime } +M_{t}. \end{aligned}$$
(6.2)
Also let us denote \(P_{t}^{(\lambda )}:=\lambda ^{\frac{1}{2}}P_{\frac{t}{\lambda }}\) and \(P_{t}^{(\lambda ),\prime }:=\lambda ^{\frac{1}{2}}P_{\frac{t}{\lambda }}^{\prime }\).

Lemma 6.2

Fix \(T>0\). The difference between \(P_{t}^{(\lambda ),\prime }\) and \(P_{t}^{(\lambda )}\) for small \(\lambda >0 \) satisfies
$$\begin{aligned} \mathbb {E}^{(\lambda )}\big [\sup _{0\le t\le T}\big |P_{t}^{(\lambda ),\prime }- P_{t}^{(\lambda )}\big |\big ]= O (\lambda ^{\frac{1}{2}}). \end{aligned}$$

Proof

Since \(P_{t}\) satisfies the integral equation
$$\begin{aligned} P_{t}=P_{0}-\int \limits _{0}^{t}dr\frac{dV}{dx}(X_{r})+ \int \limits _{0}^{t}dr \mathcal {D}_{\lambda }(P_{r}) +M_{t}, \end{aligned}$$
and \(P_{t}'\) satisfies (6.2), we have the identity
$$\begin{aligned} P_{t}^{(\lambda ),\prime }\!-\! P_{t}^{(\lambda )}\!=\! \int \limits _{0}^{t}dr R_{\lambda }(P_{r}^{(\lambda )})\!-\!\frac{1}{2}\int \limits _{0}^{t}dr e^{-\frac{1}{2}(t-r)}\int \limits _{0}^{r}ds R_{\lambda }(P_{s}^{(\lambda )})\!=\!\int \limits _{0}^{t} dre^{-\frac{1}{2}(t-r)} R_{\lambda }(P_{r}^{(\lambda )}) , \end{aligned}$$
where \(R_{\lambda }(p)= -\lambda ^{-\frac{1}{2}}\mathcal {D}_{\lambda }(\frac{p}{\lambda ^{\frac{1}{2}}})-\frac{1}{2} p\). Thus, we have the first inequality below:
$$\begin{aligned} \mathbb {E}^{(\lambda )}\Big [\sup _{0\le t\le T}\big |P_{t}^{(\lambda ),\prime }- P_{t}^{(\lambda )}\big |\Big ]&\le 2(1-e^{-\frac{1}{2}T}) \mathbb {E}^{(\lambda )}\Big [\sup _{0\le t\le T} \big |R_{\lambda }(P_{t}^{(\lambda )}) \big | \Big ]\\&\le 2C\lambda ^{\frac{3}{2}}+2C\lambda ^{\frac{1}{2}} \mathbb {E}^{(\lambda )}\Big [\sup _{0\le t\le T} \big | \lambda ^{\frac{1}{2}}P_{\frac{t}{\lambda }} \big |^{2} \Big ]\\&= O (\lambda ^{\frac{1}{2}}). \end{aligned}$$
The second inequality follows by Part (2) of Proposition 6.1. By bounding \(|P_{t}|\le (2H_{t})^{\frac{1}{2}}\) and applying Lemma 3.2, the expectation on the second line is uniformly bounded for \(\lambda <1\). Thus, the above is \( O (\lambda ^{\frac{1}{2}})\). \(\square \)
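For completeness, the second equality in the first display of the proof above follows by an integration by parts: with \(F(t):=\int \limits _{0}^{t}dr R_{\lambda }(P_{r}^{(\lambda )})\) and \(F(0)=0\),
$$\begin{aligned} \frac{1}{2}\int \limits _{0}^{t}dr e^{-\frac{1}{2}(t-r)}F(r)= F(t)-\int \limits _{0}^{t}dr e^{-\frac{1}{2}(t-r)}R_{\lambda }(P_{r}^{(\lambda )}), \end{aligned}$$
since \(\frac{d}{dr}e^{-\frac{1}{2}(t-r)}=\frac{1}{2}e^{-\frac{1}{2}(t-r)}\) and the boundary term at \(r=0\) vanishes.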

The following lemma gives a central limit theorem for the martingale \(M_{t}^{(\lambda )}=\lambda ^{\frac{1}{2}}M_{\frac{t}{\lambda }}\).

Lemma 6.3

As \(\lambda \searrow 0\) the martingale \(M^{(\lambda )}_{t}=\lambda ^{\frac{1}{2}} M_{\frac{t}{\lambda }}\) converges in law with respect to the uniform metric to a standard Brownian motion \(\mathbf {B}\) over the interval \(t\in [0,T]\).

Proof

To prove the central limit theorem, we verify the following two conditions:
  (i) For each \(t\in {\mathbb R}^{+}\), the predictable quadratic variation process \(\langle M^{(\lambda )}\rangle _{t}\) converges in probability to \(t\) as \(\lambda \searrow 0\).

  (ii) For any \(\epsilon >0\), as \(\lambda \searrow 0\),
    $$\begin{aligned} \mathbb {P}^{(\lambda )}\Big [\sup _{0\le r\le \frac{T}{\lambda }}\big (M_{r}-M_{r^-}\big )^{2}> \frac{\epsilon }{\lambda } \Big ]\longrightarrow 0. \end{aligned}$$
By [29, Theorem VIII.2.13], conditions (i) and (ii) are sufficient to prove that \(M^{(\lambda )}_{t}\) converges in law to a Brownian motion.
  (i) We prove a somewhat stronger statement. Note that
    $$\begin{aligned} \langle M^{(\lambda )}\rangle _{t}-t= \lambda \int \limits _{0}^{\frac{t}{\lambda }}dr\big (\mathcal {Q}_{\lambda }(P_{r})-1\big ). \end{aligned}$$
    For the expectation of the supremum of the difference between \(\langle M^{(\lambda )}\rangle _{t}\) and \(t\) over the interval \([0,T]\), we have
    $$\begin{aligned} \mathbb {E}^{(\lambda )}\Big [\sup _{t\in [0,T]} \Big | \langle M^{(\lambda )}\rangle _{t}-t \Big |\Big ]&\le \lambda \mathbb {E}^{(\lambda )}\Big [\int \limits _{0}^{\frac{T}{\lambda }}dr\big |\mathcal {Q}_{\lambda }(P_{r})-1\big |\Big ]\\&\le CT\lambda ^{\frac{1}{2}}\Big (\lambda ^{\frac{1}{2}}+ \mathbb {E}^{(\lambda )}\Big [\sup _{0\le r\le \frac{T}{\lambda }} \lambda ^{\frac{1}{2}} |P_{r}|\Big ]\\&+\,\lambda \mathbb {E}^{(\lambda )}\Big [\sup _{0\le r\le \frac{T}{\lambda }} \lambda ^{\frac{3}{2}} |P_{r}|^{3}\Big ]\Big )\\&= O (\lambda ^{\frac{1}{2}}), \end{aligned}$$
    where the second inequality is for some \(C>0\) by Part (3) of Proposition 6.1. The expectations in the second line above are bounded uniformly for \(\lambda <1\) by Lemma 3.2 since \(|P_{r}|\le 2^{\frac{1}{2}}H_{r}^{\frac{1}{2}}\). The above implies that \(\langle M^{(\lambda )}\rangle _{t}\) converges in probability to \(t\) as \(\lambda \searrow 0\).
     
  (ii) Recall that \({\mathcal {N}}_{t}\) is the number of collisions over the time interval \([0,t]\) and that \(t_{1},\ldots , t_{{\mathcal {N}}_{t}}\) are the corresponding jump times. The probability has the following bounds:
    $$\begin{aligned}&\mathbb {P}^{(\lambda )}\Big [\sup _{0\le r\le \frac{T}{\lambda }}\big (M_{r}-M_{r^-}\big )^{2}> \frac{\epsilon }{\lambda } \Big ]\nonumber \\&\quad \le \frac{\lambda }{\epsilon }\mathbb {E}^{(\lambda )}\Big [\Big (\sum _{n=1}^{{\mathcal {N}}_{\frac{T}{\lambda }}} \big (M_{t_{n}}-M_{t_{n}^-}\big )^{4}\Big )^{\frac{1}{2}}\Big ] \le \frac{\lambda }{\epsilon }\mathbb {E}^{(\lambda )}\Big [\sum _{n=1}^{{\mathcal {N}}_{\frac{T}{\lambda }}} \big (M_{t_{n}}-M_{t_{n}^-}\big )^{4}\Big ]^{\frac{1}{2}} \nonumber \\&\quad = \frac{\lambda }{\epsilon }\mathbb {E}^{(\lambda )}\Big [\sum _{n=1}^{{\mathcal {N}}_{\frac{T}{\lambda }}} \mathbb {E}^{(\lambda )}\Big [\big (M_{r}-M_{r^-}\big )^{4}\,\big |\,P_{r^-},\,{\mathcal {N}}_{r}={\mathcal {N}}_{r^{-}}+1 \Big ]\Big |_{r=t_{n}}\Big ]^{\frac{1}{2}}\nonumber \\&\quad = \frac{\lambda }{\epsilon }\mathbb {E}^{(\lambda )}\Big [\sum _{n=1}^{{\mathcal {N}}_{\frac{T}{\lambda }}} \frac{\Pi _{\lambda ,4}(P_{t_{n}^-})}{\mathcal {E}_{\lambda }(P_{t_{n}^-})} \Big ]^{\frac{1}{2}}= \frac{\lambda }{\epsilon }\mathbb {E}^{(\lambda )}\Big [\int \limits _{0}^{\frac{T}{\lambda }}dr \Pi _{\lambda , 4}(P_{r}) \Big ]^{\frac{1}{2}}. \end{aligned}$$
    (6.3)
    The second inequality is Jensen’s, and the first inequality is Chebyshev’s followed by the elementary relation
    $$\begin{aligned} \sup _{1\le m\le n} a_{m}\le \Big (\sum _{m=1}^{n} a_{m}^{2}\Big )^{\frac{1}{2}}, a_{n}\ge 0. \end{aligned}$$
    The first equality in (6.3) holds since the process
    $$\begin{aligned} \sum _{n=1}^{{\mathcal {N}}_{t}}\Big (\big (M_{t_{n}}-M_{t_{n}^-}\big )^{4}- \mathbb {E}^{(\lambda )}\Big [\big (M_{r}-M_{r^-}\big )^{4}\,\big |\,P_{r^-},\,{\mathcal {N}}_{r}={\mathcal {N}}_{r^{-}}+1 \Big ]\Big |_{r=t_{n}} \Big ) \end{aligned}$$
    is a mean zero martingale. The second equality uses that a jump for \(M_{r}\) is a jump for \(P_{r}\) (since they differ by a continuous process) and that the conditional expectation for \(\big (P_{r}-P_{r^{-}}\big )^{4}\) given the value \(P_{r^{-}}\) and the information that \(r\in {\mathbb R}^{+}\) is a jump time is given by the ratio of \(\Pi _{\lambda ,4}(P_{r^-})\) by \(\mathcal {E}_{\lambda }(P_{r^{-}})\):
    $$\begin{aligned}&\mathbb {E}^{(\lambda )}\Big [\big (M_{r}-M_{r^{-}}\big )^{4}\,\big |\,P_{r^{-}},\,{\mathcal {N}}_{r}={\mathcal {N}}_{r^{-}}+1 \Big ]\\&= \mathbb {E}^{(\lambda )}\Big [\big (P_{r}-P_{r^-}\big )^{4}\,\big |\,P_{r^-},\,{\mathcal {N}}_{r}={\mathcal {N}}_{r^{-}}+1 \Big ] =\frac{\Pi _{\lambda ,4}(P_{r^-})}{\mathcal {E}_{\lambda }(P_{r^-})}. \end{aligned}$$
    The last equality follows because the jump times \(t_{n}\) occur with Poisson rate \(\mathcal {E}_{\lambda }(P_{r})\).
     
Squaring the right side of (6.3),
$$\begin{aligned} \frac{\lambda ^{2}}{\epsilon ^{2}}\mathbb {E}^{(\lambda )}\Big [\int \limits _{0}^{\frac{T}{\lambda }}dr \Pi _{\lambda ,4}(P_{r}) \Big ]&\le C\frac{\lambda ^2}{\epsilon ^2}\mathbb {E}^{(\lambda )}\Big [\int \limits _{0}^{\frac{T}{\lambda }}dr\big (1+\lambda |P_{r}|\big )^{5} \Big ]\\&\le C\frac{\lambda }{\epsilon ^{2}}\mathbb {E}^{(\lambda )}\Big [\lambda \int \limits _{0}^{\frac{T}{\lambda }}dr\big (1+\lambda 2^{\frac{1}{2}} H_{r}^{\frac{1}{2}} \big )^{5} \Big ], \end{aligned}$$
where we have applied Part (4) of Proposition 6.1 in the first inequality, and the bound \(|P_{r}|^{2}\le 2H_{r}\) for the second. By Lemma 3.2 the expectation on the right side is uniformly bounded for \(\lambda <1\). Thus, the above is \( O (\lambda )\) and the left side of (6.3) is \( O (\lambda ^{\frac{1}{2}})\), which proves the Lindeberg condition. \(\square \)

6.2 Proof of Theorem 1.3

Proof of Theorem 1.3

Let \(P_{t}^{(\lambda ),\prime }\) be defined as in Lemma 6.2. By Lemma 6.2 the difference \(P_{t}^{(\lambda )}-P_{t}^{(\lambda ),\prime }\) converges to zero with respect to the uniform metric over the interval \(t\in [0,T]\) as \(\lambda \searrow 0\). Thus we can work with \(P_{t}^{(\lambda ),\prime }\) rather than \(P_{t}^{(\lambda )}\). Define the map \(\mathcal {G}:L^{\infty } ([0,T])\rightarrow L^{\infty }([0,T])\) given by
$$\begin{aligned} \mathcal {G}(h)_{t}=h_{t}-\frac{1}{2}\int \limits _{0}^{t}dr e^{-\frac{1}{2}(t-r)}h_{r},\quad h\in L^{\infty }([0,T]). \end{aligned}$$
Notice that the solution \(\mathfrak {p}_{t}\) to the Langevin equation (1.4) admits the explicit form
$$\begin{aligned} \mathfrak {p}_{t}=\mathcal {G}(\mathbf {B})_{t}, \end{aligned}$$
(6.4)
where we have assumed \(\mathfrak {p}_{0}=0\). Moreover, the integral equation (6.2) for \(P_{t}^{(\lambda ),\prime }\) admits the closed form
$$\begin{aligned} P_{t}^{(\lambda ),\prime }= e^{-\frac{1}{2}t}\lambda ^{\frac{1}{2}} P_{0}+\mathcal {G}(\lambda ^{\frac{1}{2}}D_{\frac{\cdot }{\lambda }})_{t}+ \mathcal {G}(M^{(\lambda )})_{t}. \end{aligned}$$
(6.5)
By our assumption (2) of List 1.1, the moment \(\mathbb {E}^{(\lambda )}[|P_{0}|]\) is finite, and thus the first term on the right side of (6.5) converges in probability to zero as \(\lambda \searrow 0\). The random variable \(\sup _{0\le t\le T}\big |\mathcal {G}(\lambda ^{\frac{1}{2}}D_{\frac{\cdot }{\lambda }})_{t}\big |\) converges in probability to zero also because
$$\begin{aligned} \mathbb {E}^{(\lambda )}\Big [\sup _{0\le t\le T}\big |\mathcal {G}(\lambda ^{\frac{1}{2}}D_{\frac{\cdot }{\lambda }})_{t}\big |\Big ]\le 2\mathbb {E}^{(\lambda )}\Big [\sup _{0\le t\le T}\big |\lambda ^{\frac{1}{2}}D_{\frac{t}{\lambda }}\big |\Big ]= O (\lambda ^{\frac{1}{4}}), \end{aligned}$$
where the order equality follows by Theorem 1.2. By Lemma 6.3, \(M_{t}^{(\lambda )}\) converges in law to a standard Brownian motion \( \mathbf {B}\) with respect to the uniform metric. Since the map \(\mathcal {G}\) is continuous with respect to the supremum norm, \(\mathcal {G}(M^{(\lambda )})_{t}\) converges in law to the process \(\mathcal {G}(\mathbf {B})_{t}\) with respect to the uniform metric. By integration by parts, \(\mathcal {G}(\mathbf {B})_{t}=\int _{0}^{t}e^{-\frac{1}{2}(t-r)}d\mathbf {B}_{r}\) is an Ornstein–Uhlenbeck process, and thus \(P_{t}^{(\lambda ),\prime }\) converges in law to the Ornstein–Uhlenbeck process, and so does \(P_{t}^{(\lambda )}\). \(\square \)
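As a numerical aside (not part of the argument; the discretization, grid, and test path are ours, and we take (1.4) in the integrated form \(\mathfrak {p}_{t}=\mathbf {B}_{t}-\frac{1}{2}\int _{0}^{t}dr\mathfrak {p}_{r}\)), one can check that \(p=\mathcal {G}(h)\) solves the Volterra equation \(p_{t}=h_{t}-\frac{1}{2}\int _{0}^{t}dr p_{r}\) for an arbitrary continuous driving path \(h\), which is the property behind (6.4):

```python
import math

def apply_G(h, dt):
    # G(h)_t = h_t - (1/2) * int_0^t e^{-(t-r)/2} h_r dr, trapezoidal rule on a uniform grid
    n = len(h)
    p = [0.0] * n
    for i in range(n):
        integ = 0.0
        for j in range(i):
            f0 = math.exp(-0.5 * (i - j) * dt) * h[j]
            f1 = math.exp(-0.5 * (i - j - 1) * dt) * h[j + 1]
            integ += 0.5 * (f0 + f1) * dt
        p[i] = h[i] - 0.5 * integ
    return p

dt, n = 0.01, 201                          # uniform grid on [0, 2]
h = [math.sin(i * dt) for i in range(n)]   # any continuous test path in place of a Brownian one
p = apply_G(h, dt)

# defect of the Volterra equation p_t = h_t - (1/2) * int_0^t p_r dr
max_err, integ = 0.0, 0.0
for i in range(1, n):
    integ += 0.5 * (p[i - 1] + p[i]) * dt
    max_err = max(max_err, abs(p[i] - (h[i] - 0.5 * integ)))
print(max_err)  # small discretization defect, of order dt^2
```

The same check applies with a sampled Brownian path in place of \(h\), since the identity is deterministic and pathwise.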

7 Miscellaneous Proofs

7.1 Proofs for Sect. 2

Proof of Proposition 2.3

Part (1): For \(t\in {\mathbb R}^{+}\) let \(\big (\mathbf {x}_{t}(x,p),\mathbf {p}_{t}(x,p)\big )\) be the trajectory starting from the phase-space point \((x,p)\) and evolving according to the Hamiltonian \(H(x,p)=\frac{1}{2}p^{2}+V(x)\). The kernel \(\mathcal {T}_{\lambda }\) satisfies the following closed integral equation:
$$\begin{aligned}&\mathcal {T}_{\lambda }\big (x,p;dx',dp'\big )=\int \limits _{0}^{\infty }dt\delta \big (\mathbf {x}_{t}(x,p)-x',\mathbf {p}_{t}(x,p)-p'\big )e^{-\int \limits _{0}^{t}ds(1+\mathcal {E}_{\lambda }(\mathbf {p}_{s}(x,p)))}\\&+\int \limits _{0}^{\infty }dt \int \limits _{{\mathbb R}}dp''\mathcal {J}_{\lambda }\big (\mathbf {p}_{t}(x,p), p'' \big )\mathcal {T}_{\lambda }\big (\mathbf {x}_{t}(x,p),p'';dx',dp'\big )e^{-\int \limits _{0}^{t}ds(1+\mathcal {E}_{\lambda }(\mathbf {p}_{s}(x,p)))}. \end{aligned}$$
We obtain a series expansion for \(\mathcal {T}_{\lambda }\) by iterating the above integral equation such that the \(n\)th term corresponds to the event that \(n-1\) collisions occur over the time interval \([0,t]\). By considering the contribution to the transition kernel \(\mathcal {T}_{\lambda }\) resulting from two collisions over the mean one exponential time interval, we have the following lower bound:
$$\begin{aligned}&\mathcal {T}_{\lambda }\big (x,p;dx',dp'\big ) \ge dx'dp'\int \limits _{{\mathbb R}}dp''\int \limits _{0}^{\infty }dt \int \limits _{0}^{t}dt_{2}\int \limits _{0}^{t_{2}}dt_{1} \mathcal {J}_{\lambda }\big (\mathbf {p}_{t_{1}}(x,p), p''\big )\nonumber \\&\mathcal {J}_{\lambda }\big (\mathbf {p}_{t_{2}-t_{1}}(\mathbf {x}_{t_{1}}(x,p),p''), \mathbf {p}_{-(t-t_{2})}(x',p') \big )\nonumber \\&\times e^{-\int \limits _{0}^{t-t_{2}}ds(1+\mathcal {E}_{\lambda }(\mathbf {p}_{-s}(x',p')))} e^{-\int \limits _{0}^{t_{2}-t_{1}}ds(1+\mathcal {E}_{\lambda }(\mathbf {p}_{s}(\mathbf {x}_{t_{1}}(x,p),p'')))}e^{-\int \limits _{0}^{t_{1}}ds(1+\mathcal {E}_{\lambda }(\mathbf {p}_{s}(x,p)))}. \end{aligned}$$
(7.1)
Let \(A\subset {\mathbb R}\) be the set of \(p\) with \( 2 l \le \frac{1}{2}p^{2} \le 5l\). Let \(c>0\) be the minimum of the values
$$\begin{aligned} \inf _{\begin{array}{c} \lambda < 1\\ p^{2} \le l,\,p'\in A \end{array}} \mathcal {J}_{\lambda }(p,p'),\quad \inf _{\begin{array}{c} \lambda < 1\\ p^{2} \le l,\,p'\in A \end{array}} \mathcal {J}_{\lambda }(p',p),\quad \text {and}\quad \inf _{\lambda < 1} \int \limits _{0}^{\infty }dt \int \limits _{0}^{t}dt_{2}\int \limits _{0}^{t_{2}}dt_{1}e^{-t(1+\mathcal {E}_{\lambda }(\sqrt{10 l}))}. \end{aligned}$$
We have defined the set \(A\) to exclude the arguments \(p,p'\) of \(\mathcal {J}_{\lambda }\) from being close since the rates \(\mathcal {J}_{\lambda }(p,p')\) have a zero along the line \(p'=p\). If \(H(x,p),H(x',p')\le l\), the conservation of energy guarantees that \(\frac{1}{2}\mathbf {p}_{t_{1}}^{2}(x,p)\le l\) and \(\frac{1}{2}\mathbf {p}_{-(t-t_{2})}^{2}(x',p') \le l\). Also, if \( 3l\le \frac{1}{2}(p'')^{2}\le 4l \), then \(\mathbf {p}_{t_{2}-t_{1}}(\mathbf {x}_{t_{1}}(x,p),p'')\in A\) since the kinetic energy cannot fluctuate by more than \(l\) through the Hamiltonian evolution. For all \((x,p),(x',p')\) with \(H(x,p),H(x',p')\le l\) and all \(\lambda <1\),
$$\begin{aligned} \mathcal {T}_{\lambda }\big (x,p;dx',dp'\big )\ge c^{3} dx'dp' \Big (\int \limits _{{\mathbb R}}dp''\chi (3l\le \frac{1}{2}(p'')^{2}\le 4l) \Big ). \end{aligned}$$
The above follows by restricting the integration over \(p''\in {\mathbb R}\) to \(3l\le \frac{1}{2}(p'')^{2}\le 4l\). Thus we can take \( \mathbf {u}:= c^{3}U^{2} \int \limits _{{\mathbb R}}dp''\chi (3l\le \frac{1}{2}(p'')^{2}\le 4l) \).
Now we prove the upper bound for \(\mathcal {T}_{\lambda }(s,ds')\). Notice that \( \Vert \mathcal {T}_{\lambda }(\Psi )\Vert _{\infty } \le \Vert \Psi \Vert _{\infty } \) for \(\Psi \in L^{1}(\Sigma )\cap L^{\infty }(\Sigma )\); in other words, \(\mathcal {T}_{\lambda }\) is a contraction in the supremum norm. This is evident from the resolvent form \(\mathcal {T}_{\lambda }=\int _{0}^{\infty }dt e^{-t}\Phi _{t,\lambda }\), where \(\Phi _{t,\lambda }\) are the dynamical maps for the master equation (1.7), and from the inequalities
$$\begin{aligned} \big \Vert \mathcal {T}_{\lambda }\Psi \big \Vert _{\infty }\le \int \limits _{0}^{\infty }dt e^{-t}\Vert \Phi _{t,\lambda }(\Psi ) \big \Vert _{\infty }\le \Vert \Psi \Vert _{\infty }. \end{aligned}$$
The maps \(\Phi _{t,\lambda }\) are contractive in the supremum norm since the dynamics is driven by a Hamiltonian flow, which preserves the supremum norm, and a noise satisfying the detailed balance condition
$$\begin{aligned} e^{-\frac{1}{2}p_{1}^{2}}\mathcal {J}_{\lambda }(p_{1},p_{2})= e^{-\frac{1}{2}p_{2}^{2}}\mathcal {J}_{\lambda }(p_{2},p_{1}). \end{aligned}$$
When \(H(s')\ne H(s)\), a collision must occur over the time interval \([0,\tau _{1}]\) in order to have \(S_{0}=s\) and \(S_{\tau _{1}}=s'\). By conditioning on the event that the first collision occurs before time \(\tau _{1}\) and applying the strong Markov property, we obtain the first equality below:
$$\begin{aligned} \mathcal {T}_{\lambda }(s,ds')=\mathbb {E}^{(\lambda )}_{s}\Big [\chi (t_{1}\le \tau _{1})\mathcal {T}_{\lambda }(S_{t_{1}},ds') \Big ]\le \Vert D_{s}^{(\lambda )}\Vert _{\infty }, \end{aligned}$$
(7.2)
where \(D_{s}^{(\lambda )}\) is the probability density of the first collision when starting from \(s\in \Sigma \). The density has the closed form
$$\begin{aligned} D_{s}^{(\lambda )}(x',p')=\int \limits _{0}^{\infty }dt \delta \big (\mathbf {x}_{t}(s)-x'\big ) \mathcal {J}_{\lambda }\big (\mathbf {p}_{t}(s),p' \big ) e^{-\int \limits _{0}^{t}dr\mathcal {E}_{\lambda }(\mathbf {p}_{r}(s))}. \end{aligned}$$
When \(H(s)\ge l=1+2\sup _{x}V(x)\), the particle will revolve around the torus freely with speed \(|p|\ge \big (2+2\sup _{x}V(x) \big )^{\frac{1}{2}}\). Using the above form for \(D_{s}^{(\lambda )}\):
$$\begin{aligned} D_{s}^{(\lambda )}(x',p')&\le \left( \sup _{p,p'\in {\mathbb R}} \frac{ \mathcal {J}_{\lambda }\big (p , p'\big )}{ \mathcal {E}_{\lambda }(p)} \right) \int \limits _{0}^{\infty }dt\delta \big (\mathbf {x}_{t}(s)-x'\big ) e^{-\int \limits _{0}^{t}dr\mathcal {E}_{\lambda }(\mathbf {p}_{r}(s))} \mathcal {E}_{\lambda }(\mathbf {p}_{t}(s))\\&\le \left( \sup _{p,p'\in {\mathbb R}} \frac{ \mathcal {J}_{\lambda }\big (p , p'\big )}{ \mathcal {E}_{\lambda }(p)} \right) \left( \sum _{n=1}^{\infty } \frac{\mathcal {E}_{\lambda }(q(s,x')) }{q(s,x')} e^{-n\int \limits _{\mathbb {T}}da\frac{\mathcal {E}_{\lambda }(q(s,a)) }{q(s,a)} } \right) \\&\le \left( \sup _{p,p'\in {\mathbb R}} \frac{ \mathcal {J}_{\lambda }\big (p, p' \big )}{ \mathcal {E}_{\lambda }(p)} \right) \left( \frac{ \frac{\mathcal {E}_{\lambda }(q(s,x')) }{q(s,x')}}{1- e^{-\int \limits _{\mathbb {T}}da\frac{\mathcal {E}_{\lambda }(q(s,a)) }{q(s,a)} }} \right) , \end{aligned}$$
where \(q(s,a):=2^{\frac{1}{2}}(H(s)-V(a))^{\frac{1}{2}}\). The two terms in the product on the right are bounded uniformly for all \(\lambda <1\) and \(s\in \Sigma \) with \(H(s)> l\).

Part (2): This follows easily from the construction of the split process.

Part (3): It is almost surely true that a collision does not occur at the partition time \(\mathbf {t}\). Since the trajectory \(S_{t}\) is continuous between collision times, \(\lim _{t\nearrow \mathbf {t}}S_{t}=S_{\mathbf {t}}\). Thus, information about the state \(S_{\mathbf {t}}\) is contained in the \(\sigma \)-algebra \(\tilde{\mathcal {F}}_{\mathbf {t}^{-}}\). We claim that the conditional probability that the binary component \( Z_{\mathbf {t}}\) equals \(1\) given \(\tilde{\mathcal {F}}_{\mathbf {t}^{-}}\) is \(h(S_{\mathbf {t}})\), which would imply that \(\tilde{S}_{\mathbf {t}}\) has distribution \(\tilde{\delta }_{S_{\mathbf {t}}}\) given \(\tilde{\mathcal {F}}_{\mathbf {t}^{-}}\).

To verify that \(\tilde{\mathbb {P}}^{(\lambda )}[Z_{\mathbf {t}}=1|\tilde{\mathcal {F}}_{\mathbf {t}^{-}}]=h(S_{\mathbf {t}})\) as claimed above, let \( \mathbf {t}' \) be the partition time preceding \(\mathbf {t}\). By the strong Markov property at the time \( \mathbf {t}' \) and since \(Z_{t}\) is constant over the interval \(t\in [\mathbf {t}',\mathbf {t})\), we have the first equality below:
$$\begin{aligned} \tilde{\mathbb {P}}^{(\lambda )}[Z_{\mathbf {t}}=1\,|\,\tilde{\mathcal {F}}_{\mathbf {t}^{-}}]&= \tilde{\mathbb {P}}^{(\lambda )}\big [Z_{\mathbf {t}}=1\,|\,(S_{r}; r\in [\mathbf {t}',\mathbf {t}]),\,Z_{\mathbf {t}'} \big ]\nonumber \\&=\tilde{\mathbb {P}}^{(\lambda )}\big [Z_{\mathbf {t}}=1\,|\, S_{\mathbf {t}'},\, S_{\mathbf {t}},\,Z_{\mathbf {t}'} \big ]= h(S_{\mathbf {t}}). \end{aligned}$$
(7.3)
The second equality holds since the distribution of the bridge \((S_{r}; r\in (\mathbf {t}',\mathbf {t}))\) is independent of \(Z_{\mathbf {t}'}\) and \(Z_{\mathbf {t}}\) given the endpoints \( S_{\mathbf {t}'}\), \(S_{\mathbf {t}}\). The probability \(\tilde{\mathbb {P}}^{(\lambda )}\big [Z_{\mathbf {t}}=1\,|\, S_{\mathbf {t}'},\, S_{\mathbf {t}},\,Z_{\mathbf {t}'} \big ]\) is well-defined within the context of the split resolvent chain, and the last equality in (7.3) follows from the form of the transition kernel \(\tilde{\mathcal {T}}_{\lambda }(s_{1}, z_{1}; ds_{2},z_{2})\), defined above (2.2), for \(s_{1}=S_{\mathbf {t}'}\), \( z_{1}=Z_{\mathbf {t}'}\), \(s_{2}=S_{\mathbf {t}}\), and \( z_{2}=Z_{\mathbf {t}}\). \(\square \)

7.2 Proofs for Propositions 3.1 and 6.1

We will begin with the proof of Proposition 6.1 since the proof of Part (1) of Proposition 3.1 depends on it.

Proof of Proposition 6.1

To ease the notation, we denote \(g(q)=(2\pi )^{-\frac{1}{2}}e^{-\frac{q^2}{2}}\). Since \(\mathcal {E}_\lambda (p) \), \(\mathcal {D}_\lambda (p) \), and \(\Pi _{\lambda ,2m}(p)\) are even functions of \(p\), we will assume without loss of generality that \(p>0\).

Part (1): After the change of variables \(q=\frac{\lambda +1}{2}p'+\frac{\lambda -1}{2}p=\frac{\lambda +1}{2}(p'-{p})+\lambda p\), we have
$$\begin{aligned} \mathcal {E}_\lambda (p) = \frac{2\eta }{\lambda +1}\int \limits _{\mathbb R}dq|q-\lambda p|g(q) =\frac{2\eta }{\lambda +1}\left( 2\int \limits _{\lambda p}^\infty qg(q)dq+\lambda p\int \limits _{-\lambda p}^{+\lambda p}dqg(q)\right) .\quad \end{aligned}$$
(7.4)
We have \(\int \limits _{\lambda p}^\infty qg(q) dq = g(\lambda p)\), and \(\int \limits _{-\lambda p}^{+ \lambda p}dqg(q) \le \min (1,\lambda |p|)\). By calculus we also see that \(\alpha \mapsto g(\alpha ) + \alpha \int _0^\alpha dq g(q)\) has a minimum over \({\mathbb R}\) at \(0\), since its derivative is \(\int _0^\alpha dq g(q)\), which has the sign of \(\alpha \). With the above remarks and \(\eta =\frac{(2\pi )^{\frac{1}{2}}}{32}\),
$$\begin{aligned} \frac{1}{8(\lambda +1)}=\mathcal {E}_\lambda (0) \le \mathcal {E}_\lambda (p) \le \frac{4\eta g(0)+2\eta \min (\lambda |p|,\lambda ^2p^2)}{(\lambda +1)} =\frac{1+C\min (\lambda |p|,\lambda ^2|p|^2)}{8(\lambda +1)}.\nonumber \\ \end{aligned}$$
(7.5)
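As a numerical sanity check of (7.4) and (7.5) (not part of the proof; the script, grid of test points, and tolerances are ours), one can evaluate \(\mathcal {E}_\lambda (p)\) exactly through \(\int \limits _{-a}^{+a}dqg(q)=\mathrm {erf}(a/\sqrt{2})\) and \(\int \limits _{a}^\infty qg(q)dq=g(a)\):

```python
import math

ETA = math.sqrt(2 * math.pi) / 32  # the value of eta fixed above

def g(q):
    # standard Gaussian density g(q) = (2*pi)^(-1/2) * exp(-q^2/2)
    return math.exp(-q * q / 2) / math.sqrt(2 * math.pi)

def escape_rate(lam, p):
    # closed form (7.4): E_lam(p) = (2*eta/(lam+1)) * (2*g(a) + a*erf(a/sqrt(2))), a = lam*|p|
    a = lam * abs(p)
    return 2 * ETA / (lam + 1) * (2 * g(a) + a * math.erf(a / math.sqrt(2)))

for lam in (0.01, 0.1, 0.5):
    low = 1 / (8 * (lam + 1))
    # E_lam(0) = 1/(8*(lam+1)), the left side of (7.5)
    assert abs(escape_rate(lam, 0.0) - low) < 1e-14
    for p in (0.5, 1.0, 3.0, 10.0):
        a = lam * p
        up = (1 + 16 * ETA * min(a, a * a)) / (8 * (lam + 1))  # right side of (7.5)
        assert low - 1e-14 <= escape_rate(lam, p) <= up + 1e-14
print("(7.5) verified on the sample grid")
```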
Part (2): We use the same technique to estimate \(\mathcal {D}_\lambda (p)\). Here we find
$$\begin{aligned} \mathcal {D}_\lambda (p)&= \frac{4\eta }{(\lambda +1)^2}\Big (-\int \limits _{-\lambda p}^{+\lambda p} q^2 g(q) dq -4 \lambda p\int \limits _{\lambda p}^\infty qg(q)dq - \lambda ^2 p^2\int \limits _{-\lambda p}^{+\lambda p}g(q)dq\Big )\nonumber \\&= -\frac{8\eta \lambda p}{(\lambda +1)^{2}}g(\lambda p) -\frac{4\eta }{(\lambda +1)^{2}}\int \limits _{-\lambda p}^{+\lambda p}g(q)dq-\frac{4\eta \lambda ^2 p^2}{(\lambda +1)^2} \int \limits _{-\lambda p}^{+\lambda p}g(q)dq. \end{aligned}$$
(7.6)
It follows that for \(\lambda p\ll 1\)
$$\begin{aligned} \mathcal {D}_\lambda (p) = -\frac{ \lambda p}{2(\lambda +1)^2} + O(\lambda ^2 p^2), \end{aligned}$$
so \(|\mathcal {D}_\lambda (p)+\frac{\lambda p}{2}|\) is bounded by a constant multiple of \(\lambda ^{2}(|p|+|p|^{2})\) in that regime. When \(\lambda p\) is not small, we can also find a bound for \(|\mathcal {D}_\lambda (p)+\frac{\lambda p}{2}|\) of the same form through (7.6) using that \(\int _{{\mathbb R}}g(q)dq=1\) and \(g(q)\le 1\).
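The expansion above can also be checked numerically from the first line of (7.6), using \(\int \limits _{-a}^{+a}q^{2}g(q)dq=-2ag(a)+\mathrm {erf}(a/\sqrt{2})\) and \(\int \limits _{a}^{\infty }qg(q)dq=g(a)\) (a sketch, not part of the proof; the grid and tolerances are ours):

```python
import math

ETA = math.sqrt(2 * math.pi) / 32

def g(q):
    return math.exp(-q * q / 2) / math.sqrt(2 * math.pi)

def drift(lam, p):
    # first line of (7.6):
    # D_lam(p) = (4*eta/(lam+1)^2) * ( -int_{-a}^{a} q^2 g - 4a * int_a^inf q g - a^2 * int_{-a}^{a} g )
    a = lam * p
    i0 = math.erf(a / math.sqrt(2))   # int_{-a}^{a} g(q) dq
    i2 = -2 * a * g(a) + i0           # int_{-a}^{a} q^2 g(q) dq, by integration by parts
    return 4 * ETA / (lam + 1) ** 2 * (-i2 - 4 * a * g(a) - a * a * i0)

# check D_lam(p) = -lam*p/2 + O(lam^2 (|p| + p^2)) in the regime lam*p << 1
for lam in (1e-2, 1e-3):
    for p in (0.5, 1.0, 2.0):
        assert abs(drift(lam, p) + lam * p / 2) < 5 * lam ** 2 * (p + p * p)
print("drift expansion verified on the sample grid")
```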
Part (3): Repeating the computation for \(\mathcal {Q}_\lambda (p)\), we get
$$\begin{aligned} \mathcal {Q}_\lambda (p)&= \frac{8\eta }{(\lambda +1)^3}\bigg ((4+2\lambda ^{2}p^{2}) g(\lambda p) + \lambda p (3+\lambda ^2 p^2)\int \limits _{-\lambda p}^{+\lambda p} g(q) dq\bigg )\\&= \frac{8}{(\lambda +1)^2} \mathcal {E}_\lambda (p) +\frac{16\eta \lambda ^{2}p^{2}}{(\lambda +1)^3} g(\lambda p)+ \frac{8\eta \lambda p(1+ \lambda ^{2} p^{2})}{(\lambda +1)^3} \int \limits _{-\lambda p}^{+\lambda p} g(q) dq.\nonumber \end{aligned}$$
(7.7)
With the above
$$\begin{aligned}&\Big |\mathcal {Q}_\lambda (p) -\frac{1}{(\lambda +1)^3} \Big |\le \frac{1}{(\lambda +1)^3} \Big |1- 8(\lambda +1) \mathcal {E}_\lambda (p) \Big |\\&\quad +\frac{16\eta \lambda ^{2}p^{2}}{(\lambda +1)^3} g(\lambda p)+ \frac{8\eta \lambda p(1+ \lambda ^{2} p^{2})}{(\lambda +1)^3} \int \limits _{-\lambda p}^{+\lambda p} g(q) dq. \end{aligned}$$
With the bounds for \(\mathcal {E}_\lambda (p) \) from Part (1), the right side above is bounded by a constant multiple of \(\lambda +\lambda |p|+(\lambda |p|)^{3}\); since \(|1-(\lambda +1)^{-3}|\le 3\lambda \), the bound of Part (3) follows.

Part (4): Finally, reasoning as above, it is straightforward to produce an upper bound for \(\Pi _{\lambda ,2m}(p)\) which is a polynomial of degree \(2m+1\) in \(\lambda |p|\). \(\square \)

Proof of Proposition 3.1

Our first observation is that \(\mathcal {A}_\lambda (x,p)\), \(\mathcal {K}_{\lambda ,n}(x,p)\), and \(\mathcal {V}_\lambda (x,p)\) are even functions of \(p\in {\mathbb R}\). Hence we can assume without loss of generality that \(p>0\).

Part (1): We note that
$$\begin{aligned} 2^{\frac{1}{2}} H^{\frac{1}{2}}(x,p') -2^{\frac{1}{2}} H^{\frac{1}{2}}(x,p)&= \frac{ p'^2- p^2}{2^{\frac{1}{2}}\left( \frac{p'^2}{2}+V(x)\right) ^{\frac{1}{2}}+2^{\frac{1}{2}}\left( \frac{p^2}{2}+V(x)\right) ^{\frac{1}{2}}}\nonumber \\&= (p'-p)\Gamma (p',p), \end{aligned}$$
(7.8)
where \(\Gamma (p',p):=2^{-\frac{1}{2}}(p'+p)\big (H^{\frac{1}{2}}(x,p')+ H^{\frac{1}{2}}(x,p)\big )^{-1}\). Thus one can write
$$\begin{aligned} \mathcal {A}_\lambda (x,p) = \mathcal {D}_\lambda (p) - \int _{\mathbb R}dp'(p'-p)(1-\Gamma (p',p))\mathcal {J}_\lambda (p,p'). \end{aligned}$$
(7.9)
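Note also that \(|\Gamma (p',p)|\le 1\), a bound used repeatedly below: assuming \(V(x)\ge 0\) (as the expressions in (7.11) and (7.13) presume), we have \(2^{\frac{1}{2}}H^{\frac{1}{2}}(x,p)=(p^{2}+2V(x))^{\frac{1}{2}}\ge |p|\), and hence
$$\begin{aligned} \big |\Gamma (p',p)\big | = \frac{|p'+p|}{2^{\frac{1}{2}}\big (H^{\frac{1}{2}}(x,p')+H^{\frac{1}{2}}(x,p)\big )} \le \frac{|p'|+|p|}{|p'|+|p|}=1. \end{aligned}$$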
For the proof of \(\mathcal {A}_\lambda ^-(x,p)\le |\mathcal {D}_\lambda (p)|\), we must show that the integral in (7.9) is non-positive. For this we use the monotonicity of \(\Gamma (p,p')\) in \(p'\) for fixed \(p\) together with the following straightforward inequality, valid for all \(p,r\ge 0\):
$$\begin{aligned} \mathcal {J}_\lambda (p,p+r)\le \mathcal {J}_\lambda (p,p-r). \end{aligned}$$
Then we have
$$\begin{aligned}&\int \limits _{\mathbb R}dp'(p'-p)(1-\Gamma (p,p'))\mathcal {J}_\lambda (p,p') \\&= \int \limits _0^\infty rdr(1-\Gamma (p,p+r))\mathcal {J}_\lambda (p,p+r) + \int \limits _0^\infty rdr(\Gamma (p,p-r)-1)\mathcal {J}_\lambda (p,p-r) \\&\le \int \limits _0^\infty rdr(\Gamma (p,p-r)-\Gamma (p,p+r))\mathcal {J}_\lambda (p,p-r) \le 0. \end{aligned}$$
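The inequality \(\mathcal {J}_\lambda (p,p+r)\le \mathcal {J}_\lambda (p,p-r)\) can also be checked numerically. The sketch below assumes the explicit Gaussian form of the jump rates implicit in the computation following (7.10), namely \(\mathcal {J}_\lambda (p,p')=\frac{\eta (\lambda +1)}{2}|p'-p|\, g\big (\frac{\lambda +1}{2}(p'-\tilde{p})\big )\) with \(\tilde{p}=\frac{1-\lambda }{1+\lambda }p\); the constant \(\eta \) plays no role in the comparison.

```python
import math

def g(q):
    # standard Gaussian density g(q) = (2*pi)^(-1/2) * exp(-q^2/2)
    return math.exp(-q * q / 2.0) / math.sqrt(2.0 * math.pi)

def J(lam, p, pp, eta=1.0):
    # assumed form of the jump kernel J_lambda(p, p'); eta is a rate
    # constant that cancels from the inequality being tested
    ptil = (1.0 - lam) / (1.0 + lam) * p
    return 0.5 * eta * (lam + 1.0) * abs(pp - p) * g(0.5 * (lam + 1.0) * (pp - ptil))

# check J_lambda(p, p + r) <= J_lambda(p, p - r) for p, r >= 0 on a grid
ok = all(
    J(lam, p, p + r) <= J(lam, p, p - r) + 1e-15
    for lam in (0.01, 0.1, 0.5, 0.9)
    for p in (0.0, 0.5, 1.0, 5.0)
    for r in (0.0, 0.1, 1.0, 3.0, 8.0)
)
print(ok)
```

Both sides share the factor \(r\), so the comparison reduces to \(g\big (\lambda p+\frac{\lambda +1}{2}r\big )\le g\big (\lambda p-\frac{\lambda +1}{2}r\big )\), which holds because \(g\) is even and decreasing in \(|q|\).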
Finally, we know as a consequence of Part (2) of Proposition 6.1 that there is a \(C'>0\) such that
$$\begin{aligned} \left| \mathcal {D}_\lambda (p)\right| \le C'\lambda | p|+ C'\lambda ^2 p^2. \end{aligned}$$
Part (2): Now we prove the bound for \(\mathcal {A}_\lambda ^+(x,p)\), which will follow by finding an upper bound for \(\mathcal {A}_\lambda (x,p)\). For \(\tilde{p}=\frac{1-\lambda }{1+\lambda }p\) we can write \(p-\tilde{p}=\frac{2\lambda p}{\lambda +1}\) and
$$\begin{aligned} \mathcal {A}_\lambda (x,p) = \int \limits _{\mathbb R}dp'(p'-\tilde{p})\Gamma (p,p')\mathcal {J}_\lambda (p,p') - \frac{2\lambda p}{\lambda +1} \int \limits _{\mathbb R}dp'\Gamma (p,p')\mathcal {J}_\lambda (p,p'). \end{aligned}$$
(7.10)
For the second term on the right side of (7.10), we have
$$\begin{aligned} - \frac{2\lambda p}{\lambda +1} \int \limits _{\mathbb R}dp'\Gamma (p,p')\mathcal {J}_\lambda (p,p')&\le - \frac{2\lambda p}{\lambda +1} \int \limits _{-\infty }^{-p} dp'\Gamma (p,p')\mathcal {J}_\lambda (p,p')\\&\le \frac{2\lambda p}{\lambda +1} \int \limits _{-\infty }^{-p} dp' \mathcal {J}_\lambda (p,p'), \end{aligned}$$
where the first inequality follows from discarding the (negative) contribution of the integration over \(p'>-p\), and the second uses \(\Gamma (p,p')\ge -1\). The resulting bound is exponentially decreasing in \(p\) (uniformly in \(\lambda \)).
To bound the first term on the right side of (7.10), we begin with a bound for \( \frac{\partial \Gamma (p,p')}{\partial p'}\). There is \(C>0\) such that for all \(p>0\) and \(\frac{p}{2}\le p'\le \frac{3p}{2}\)
$$\begin{aligned} 0 \le \frac{\partial \Gamma (p,p')}{\partial p'}=\frac{(p^2+2V(x))^\frac{1}{2}(p'^2+2V(x))^\frac{1}{2} - pp' +2V(x)}{(p'^2+2V(x))^{\frac{1}{2}}\left( (p^2+2V(x))^{\frac{1}{2}}+(p'^2+2V(x))^{\frac{1}{2}}\right) ^2} \le \frac{C}{1+p^3}.\nonumber \\ \end{aligned}$$
(7.11)
By writing \(r=p'-\tilde{p}\) and \(g(q)=(2\pi )^{-\frac{1}{2}}e^{-\frac{q^2}{2}}\), the first term on the right side of (7.10) is bounded by
$$\begin{aligned}&\int \limits _{\mathbb R}dp'(p'-\tilde{p})\Gamma (p,p')\mathcal {J}_\lambda (p,p')\\&\quad = \frac{\eta (\lambda \!+\!1)}{2}\int \limits _0^\infty rdr \left( \Gamma (p,\tilde{p}\!+\!r)\left| r \!-\!\frac{2\lambda p}{\lambda \!+\!1}\right| \!-\! \Gamma (p,\tilde{p}-r)\left| r\!+\!\frac{2\lambda p}{\lambda \!+\!1}\right| \right) g\left( \frac{\lambda \!+\!1}{2} r\right) \\&\quad \le \frac{\eta (\lambda +1)}{2} \int \limits _0^{\frac{2\lambda p}{\lambda +1}} rdr \frac{2\lambda p}{\lambda +1}\left( \Gamma (p,\tilde{p}+r)-\Gamma (p,\tilde{p}-r)\right) g\left( \frac{\lambda +1}{2}r\right) \\&\quad \quad + \frac{\eta (\lambda +1)}{2} \int \limits _{\frac{2\lambda p}{\lambda +1}}^\infty r^2dr\left( \Gamma (p,\tilde{p}+r)-\Gamma (p,\tilde{p}-r)\right) g\left( \frac{\lambda +1}{2}r\right) . \end{aligned}$$
For the inequality above, we used the fact that \(\Gamma (p,\tilde{p}+r)+\Gamma (p,\tilde{p}-r)\ge 0\), which is true because \(\Gamma (p,\tilde{p}-r)\ge \Gamma (p,-\tilde{p}-r)\) and \(|\Gamma (p,-\tilde{p}-r)|\le \Gamma (p,\tilde{p}+r)\) for all \(r\ge 0\). For the first integral, we can use (7.11) to find that it is bounded by \(\frac{C \lambda }{1+p^2}\). For the second integral, we first observe that the part over \(r\ge \frac{\tilde{p}}{2}\) decays exponentially, and for the part over \(0\le r\le \frac{\tilde{p}}{2}\), we again use (7.11) and get a \(\frac{C}{1+p^3}\) bound.
Part (3): For \(\lambda =0\) we have that \(\mathcal {J}_0(p,p') =\frac{1}{64}|p'-p|e^{-\frac{(p'-p)^{2}}{8}}\). Since \(H^{\frac{1}{2}}(x,p)\) is a convex function in \(p\in {\mathbb R}\) for each \(x \in \mathbb {T}\), we have
$$\begin{aligned} \mathcal {A}_0(x,p)=\frac{1}{64} \int \limits _{{\mathbb R}}dp'\Big (2^{\frac{1}{2}} H^{\frac{1}{2}}(x,p')-2^{\frac{1}{2}} H^{\frac{1}{2}}(x,p) \Big ) |p'-p|e^{-\frac{(p'-p)^{2}}{8}}\ge 0. \end{aligned}$$
In other words, \(\mathcal {A}_0^+(x,p)=\mathcal {A}_0 (x,p)\). Our first task is to establish that \(\int \limits _{{\mathbb R}}dp \mathcal {A}_0^+(x,p)=1\) for each \(x\). By Part (2) we also know that \(\mathcal {A}_0^+(x,p)\) admits a bound of the form \(\frac{C}{1+p^{2}}\), so \(\int \limits _{|p|\ge L}dp\mathcal {A}_0^+(x,p)= O (L^{-1}) \) for large \(L\). Hence, \(\int \limits _{{\mathbb R}}dp\mathcal {A}_0^+(x,p)\) can be approximated by
$$\begin{aligned} \int \limits _{|p|\le L}dp\mathcal {A}_0^+(x,p)&= \frac{1}{64} \int \limits _{|p|\le L}dp \int \limits _{{\mathbb R}}dp'\Big (2^{\frac{1}{2}} H^{\frac{1}{2}}(x,p')-2^{\frac{1}{2}} H^{\frac{1}{2}}(x,p) \Big ) |p'-p|e^{-\frac{(p'-p)^{2}}{8}} \nonumber \\&= \frac{1}{64} \int \limits _{|p|\le L}dp \int \limits _{|p'|\ge L}dp' \Gamma (p,p') (p'-p) |p'-p|e^{-\frac{(p'-p)^{2}}{8}} \nonumber \\&= \frac{1}{64} \int \limits _{|p|\le L}dp \int \limits _{|p'|\ge L}dp' (p'-p)^2e^{-\frac{(p'-p)^{2}}{8}} \nonumber \\&+ \frac{1}{64} \int \limits _{|p|\le L}dp \int \limits _{|p'|\ge L}dp'\big (1-\Gamma (p,p')\big ) (p'-p)^2e^{-\frac{(p'-p)^{2}}{8}}. \end{aligned}$$
(7.12)
The first term on the right side of (7.12) satisfies
$$\begin{aligned} \frac{1}{64} \int \limits _{|p|\le L}dp \int \limits _{|p'|\ge L}dp' (p'-p)^2e^{-\frac{(p'-p)^{2}}{8}}\approx \frac{1}{32} \int \limits _{{\mathbb R}^{+}}dp p^{3} e^{-\frac{p^{2}}{8}}=1, \end{aligned}$$
where the error of the approximation is exponentially small for \(L\gg 1\). For the second term on the right side of (7.12),
$$\begin{aligned}&\int \limits _{|p|\le L}dp \int \limits _{|p'|\ge L}dp'\big (1-\Gamma (p,p')\big ) (p'-p)^2e^{-\frac{(p'-p)^{2}}{8}}\\&\approx 2\int \limits _{{\mathbb R}^{+}}dr r^2e^{-\frac{r^{2}}{8}} \int \limits _{ 0}^{r}dv\big (1-\Gamma (L+v-r,L+v)\big ), \end{aligned}$$
where this approximation also has an exponentially small error. We can see that the above expression is \( O (L^{-2})\) by observing that for \(\frac{ p}{2}\le p'\)
$$\begin{aligned} 0\le 1- \Gamma (p',p) \le 1-\Gamma \left( \frac{p}{4},\frac{p}{4}\right) = 1- \frac{p}{\left( p^2+32V(x)\right) ^{\frac{1}{2}}}= O (p^{-2}). \end{aligned}$$
(7.13)
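Two of the estimates above can be confirmed numerically: the normalization \(\frac{1}{32}\int _{{\mathbb R}^{+}}dp\, p^{3}e^{-\frac{p^{2}}{8}}=1\) behind the first term of (7.12), and the \(O(p^{-2})\) rate in (7.13), for which the expansion \(1-p(p^{2}+32V)^{-\frac{1}{2}}=\frac{16V}{p^{2}}+O(p^{-4})\) makes the constant explicit. A standalone sketch (the value of \(V\) below is an arbitrary placeholder):

```python
import math

# normalization for the first term of (7.12):
# (1/32) * int_0^inf p^3 exp(-p^2/8) dp = 1  (substitute u = p^2/8)
n, L = 500_000, 60.0
h = L / n
moment = sum(((i + 0.5) * h) ** 3 * math.exp(-((i + 0.5) * h) ** 2 / 8.0)
             for i in range(n)) * h  # midpoint rule on [0, L]
print(abs(moment / 32.0 - 1.0) < 1e-5)

# rate in (7.13): p^2 * (1 - p / sqrt(p^2 + 32 V)) -> 16 V as p grows
V = 0.7  # placeholder value of V(x); any V >= 0 illustrates the rate
errs = [abs(p * p * (1.0 - p / math.sqrt(p * p + 32.0 * V)) - 16.0 * V)
        for p in (10.0, 100.0, 1000.0)]
print(errs[0] > errs[1] > errs[2] and errs[2] < 1e-3)
```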
Next we focus on bounding the difference between \(\int _{{\mathbb R}}dp\mathcal {A}_\lambda ^{+}(x,p) \) and \(\int _{{\mathbb R}}dp\mathcal {A}_0^{+}(x,p) =1\) in the small \(\lambda \) limit. By the analysis at the beginning of the proof of Part (2), we have the first equality below:
$$\begin{aligned} \int \limits _{{\mathbb R}}dp\mathcal {A}_\lambda ^{+}(x,p)&= \int \limits _{{\mathbb R}} dp \Big [\int \limits _{\mathbb R}dp'(p'-\tilde{p})\Gamma (p,p')\mathcal {J}_\lambda (p,p') \Big ]_{+}+ O (\lambda ),\nonumber \\&= \int \limits _{-\lambda ^{-\frac{1}{2}}}^{\lambda ^{-\frac{1}{2}}} dp \Big [\int \limits _{\mathbb R}dp'(p'-\tilde{p})\Gamma (p,p')\mathcal {J}_\lambda (p,p') \Big ]_{+}+ O (\lambda ^{\frac{1}{2}}) \nonumber \\&= \int \limits _{-\lambda ^{-\frac{1}{2}}}^{\lambda ^{-\frac{1}{2}}} dp \mathcal {A}_0^{+}(x,p) + O (\lambda ^{\frac{1}{2}})\nonumber \\&= \int \limits _{{\mathbb R}} dp\mathcal {A}_0^{+}(x,p)+ O (\lambda ^{\frac{1}{2}}). \end{aligned}$$
(7.14)
In the above \([\cdot ]_{+}\) refers to the positive part of a function. The second equality in (7.14) uses that the integrand for the integration in \(p\) is bounded by a multiple of \(\frac{1}{1+p^2}\) by the analysis in Part (2). For the third equality above, we have used that \(\mathcal {A}_0^{+}(x,p)= \int _{\mathbb R}dp'(p'-\tilde{p})\Gamma (p,p')\mathcal {J}_0(p,p') \) is positive, \(|\Gamma (p,p')|\le 1\), and
$$\begin{aligned}&\Big |\int \limits _{-\lambda ^{-\frac{1}{2}}}^{\lambda ^{-\frac{1}{2}}} dp\Big (\Big [\int \limits _{\mathbb R}dp'(p'-\tilde{p})\Gamma (p,p')\mathcal {J}_\lambda (p,p') \Big ]_{+} -\mathcal {A}_0^{+}(x,p)\Big )\Big |\\&\le \int \limits _{-\lambda ^{-\frac{1}{2}}}^{\lambda ^{-\frac{1}{2}}} dp \int \limits _{\mathbb R}dp'|p'-\tilde{p}| \Big |\mathcal {J}_\lambda (p,p') -\mathcal {J}_0(p,p') \Big |\\&= O (\lambda ^{\frac{1}{2}}). \end{aligned}$$
Part (4): The bounds for \(\mathcal {K}_{\lambda ,n}\) are straightforward since by (7.8) and \(|\Gamma (p',p)|\le 1\) we have
$$\begin{aligned} \mathcal {K}_{\lambda ,n}(x,p) \le 2^{-\frac{n}{2}} \int \limits _{\mathbb R}dp' |p-p'|^n \mathcal {J}_\lambda (p,p'). \end{aligned}$$
Writing \(r=p'-p\) there is a constant \(C_{n}'\) such that for all \(\lambda <1\) and \((x,p)\in \Sigma \)
$$\begin{aligned} \mathcal {K}_{\lambda ,n}(x,p)&\le C_n' \left( \int \limits _0^\infty r^{n+1} dr g\Big (\frac{\lambda +1}{2} r +\lambda p\Big ) +\int \limits ^{\infty }_{2p}r^{n+1} dr g\Big (\frac{\lambda +1}{2} r -\lambda p\Big )\right) \\&\le C_n' \left( \int \limits _0^\infty r^{n+1} dr g\Big (\frac{\lambda +1}{2} r\Big ) + \int \limits _{2p}^\infty r^{n+1}dr g\Big (\frac{r}{2}\Big )\right) . \end{aligned}$$
Therefore, \(\mathcal {K}_{\lambda ,n}(x,p)\) is bounded by a constant. \(\square \)
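The inequality between the two lines of the last display rests on two pointwise Gaussian estimates: \(g\big (\frac{\lambda +1}{2}r+\lambda p\big )\le g\big (\frac{\lambda +1}{2}r\big )\) for \(r\ge 0\), and \(g\big (\frac{\lambda +1}{2}r-\lambda p\big )\le g\big (\frac{r}{2}\big )\) for \(r\ge 2p\), both because \(g\) is even and decreasing in \(|q|\) and \(\frac{\lambda +1}{2}r-\lambda p\ge \frac{r}{2}\) when \(r\ge 2p\). A quick grid check (a standalone sketch):

```python
import math

def g(q):
    # standard Gaussian density
    return math.exp(-q * q / 2.0) / math.sqrt(2.0 * math.pi)

lams = [0.0, 0.1, 0.5, 0.99]   # lambda < 1, as in the statement
ps = [0.0, 0.3, 1.0, 4.0]
rs = [0.0, 0.2, 1.0, 2.5, 10.0]

# g((l+1)r/2 + l*p) <= g((l+1)r/2), valid for all r, p >= 0
ok1 = all(g(0.5 * (l + 1) * r + l * p) <= g(0.5 * (l + 1) * r) + 1e-15
          for l in lams for p in ps for r in rs)
# g((l+1)r/2 - l*p) <= g(r/2), valid once r >= 2p
ok2 = all(g(0.5 * (l + 1) * r - l * p) <= g(r / 2.0) + 1e-15
          for l in lams for p in ps for r in rs if r >= 2 * p)
print(ok1 and ok2)
```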

7.3 Proofs for Sect. 5

Proof of Lemma 5.3

By Lemma 5.1, \(\langle \tilde{M}\rangle _{t}\) is a sum of terms \( \check{\upsilon }_{\lambda }(S_{R_{n}})\), and so the difference between \(\langle \tilde{M}\rangle _{t}\) and \(\upsilon _{\lambda }\tilde{N}_{t}\) can be written as
$$\begin{aligned} \langle \tilde{M}\rangle _{t}-\upsilon _{\lambda }\tilde{N}_{t}&= \sum _{n=1}^{\tilde{N}_{t}} \big (\check{\upsilon }_{\lambda }(S_{R_{n}}) -\upsilon _{\lambda }\big ) = \check{\upsilon }_{\lambda }(S_{R_{1}})-\check{\upsilon }_{\lambda }(S_{R_{\tilde{N}_{t}+1}})-\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\delta }_{S_{R_{\tilde{N}_{t}+1}}}} \big [\check{\upsilon }_{\lambda }(S_{R_{1}}) \big ]\\&+\,\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\delta }_{S_{R_{1}}}} \big [\check{\upsilon }_{\lambda }(S_{R_{1}}) \big ] \\&+\sum _{n=1}^{\tilde{N}_{t}} \big (\check{\upsilon }_{\lambda }(S_{R_{n+1}}) -\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\delta }_{S_{R_{n}}}} \big [\check{\upsilon }_{\lambda }(S_{R_{1}}) \big ]+\tilde{\mathbb {E}}^{(\lambda )}_{ \tilde{\delta }_{S_{R_{n+1}}}} \big [\check{\upsilon }_{\lambda }(S_{R_{1}}) \big ] -\upsilon _{\lambda } \big ), \end{aligned}$$
where \(\tilde{\delta }_{s}\) is the splitting of the \(\delta \)-distribution at \(s\in \Sigma \). Notice that \(\upsilon _{\lambda }=\int _{\Sigma }d\nu (s)\check{\upsilon }_{\lambda }(s) \). The sum on the right is a martingale with respect to \(\tilde{\mathcal {F}}_{t}'\) by the same reasoning that shows \(\tilde{M}_{t}\) is a martingale. Since \(S_{R_{n}}\in Supp (\nu )\) for \(n\ge 1\), we have the standard inequalities:
$$\begin{aligned}&\lambda ^{\frac{1}{2}}\tilde{\mathbb {E}}^{(\lambda )}\Big [\sup _{0\le t\le \frac{T}{\lambda }}\Big | \langle \tilde{M}\rangle _{t}-\upsilon _{\lambda }\tilde{N}_{t}\Big |\Big ] \le 4 \lambda ^{\frac{1}{2}}\sup _{s\in Supp (\nu )}\check{\upsilon }_{\lambda }(s)\\&\qquad +\,2\lambda ^{\frac{1}{2}}\tilde{\mathbb {E}}^{(\lambda )}\Big [\sum _{n=1}^{\tilde{N}_{\frac{T}{\lambda }}} \Big (\check{\upsilon }_{\lambda }(S_{R_{n+1}}) -\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\delta }_{S_{R_{n}}}} \big [\check{\upsilon }_{\lambda }(S_{R_{1}}) \big ]+\tilde{\mathbb {E}}^{(\lambda )}_{\tilde{\delta }_{S_{R_{n+1}}}} \big [\check{\upsilon }_{\lambda }(S_{R_{1}}) \big ] -\upsilon _{\lambda } \Big )^{2}\Big ]^{\frac{1}{2}}\\&\quad \le 4 \lambda ^{\frac{1}{2}}\sup _{s\in Supp (\nu )}\check{\upsilon }_{\lambda }(s)+2^{\frac{3}{2}}\lambda ^{\frac{1}{2}}\tilde{\mathbb {E}}^{(\lambda )}\big [\tilde{N}_{\frac{T}{\lambda }}\big ]^{\frac{1}{2}}\Big (\int \limits _{\Sigma }ds\check{\upsilon }^{2}_{\lambda }(s)-\Big (\int \limits _{\Sigma }ds\check{\upsilon }_{\lambda }(s)\Big )^{2} \Big )^{\frac{1}{2}}= O (\lambda ^{\frac{1}{4}}). \end{aligned}$$
\(\square \)

Footnotes

  1. These velocities refer to the original length scale, before stretching by a factor of \(\lambda ^{-1}\).

Acknowledgments

We are grateful to Professor Höpfner for sending a copy of Touati’s unpublished paper and giving helpful comments. We also thank an anonymous referee for offering many useful suggestions towards improving the presentation of this article. This work is supported by the European Research Council Grant No. 227772 and NSF Grant DMS-08446325.

References

  1. Adams, C.S., Sigel, M., Mlynek, J.: Atom optics. Phys. Rep. 240, 143–210 (1994)
  2. Athreya, K.B., Ney, P.: A new approach to the limit theory of recurrent Markov chains. Trans. Am. Math. Soc. 245, 493–501 (1978)
  3. Brunnschweiler, A.: A connection between the Boltzmann equation and the Ornstein–Uhlenbeck process. Arch. Ration. Mech. Anal. 76, 247–263 (1981)
  4. Birkl, G., Gatzke, M., Deutsch, I.H., Rolston, S.L., Phillips, W.D.: Bragg scattering from atoms in optical lattices. Phys. Rev. Lett. 75, 2823–2827 (1998)
  5. Chung, K.L.: A Course in Probability Theory. Academic Press, New York (1976)
  6. Clark, J.T.: Suppressed dispersion for a randomly kicked quantum particle in a Dirac comb. J. Stat. Phys. 150, 940–1015 (2013)
  7. Clark, J.T.: A limit theorem to a time-fractional diffusion. Lat. Am. J. Probab. Math. Stat. 10(1), 117–156 (2013)
  8. Clark, J., Dubois, L.: Bounds for the state-modulated resolvent of a linear Boltzmann generator. J. Phys. A 45, 225207 (2012)
  9. Clark, J., Maes, C.: Diffusive behavior for randomly kicked Newtonian particles in a periodic medium. Commun. Math. Phys. 301, 229–283 (2011)
  10. Dürr, D., Goldstein, S., Lebowitz, J.L.: A mechanical model for a Brownian motion. Commun. Math. Phys. 78, 507–530 (1981)
  11. Dzhaparidze, K., Valkeila, E.: On the Hellinger type distances for filtered experiments. Probab. Theory Relat. Fields 85, 105–117 (1990)
  12. Freidlin, M.I., Wentzell, A.D.: Random perturbations of Hamiltonian systems. Mem. Am. Math. Soc. 109(523) (1994)
  13. Friedman, N., Ozeri, R., Davidson, N.: Quantum reflection of atoms from a periodic dipole potential. J. Opt. Soc. Am. B 15, 1749–1755 (1998)
  14. Hall, P., Heyde, C.C.: Martingale Limit Theory and its Application. Academic Press, New York (1980)
  15. Hennion, H.: Sur le mouvement d'une particule lourde soumise à des collisions dans un système infini de particules légères. Z. Wahrscheinlichkeitstheorie Verw. Geb. 25, 123–154 (1973)
  16. Holley, R.: The motion of a heavy particle in a one dimensional gas of hard spheres. Probab. Theory Relat. Fields 17, 181–219 (1971)
  17. Höpfner, R., Löcherbach, E.: Limit theorems for null recurrent Markov processes. Mem. Am. Math. Soc. 161 (2003)
  18. Jacod, J., Shiryaev, A.N.: Limit Theorems for Stochastic Processes. Springer, Berlin (1987)
  19. Kunze, S., Dürr, S., Rempe, G.: Bragg scattering of slow atoms from a standing light wave. Europhys. Lett. 34, 343–348 (1996)
  20. Komorowski, T., Landim, C., Olla, S.: Fluctuations in Markov Processes. Springer, Berlin (2012)
  21. Löcherbach, E., Loukianova, D.: On Nummelin splitting for continuous time Harris recurrent Markov processes and application to kernel estimation for multi-dimensional diffusions. Stoch. Processes Appl. 118, 1301–1321 (2008)
  22. McClelland, J.J.: Atom-optical properties of a standing-wave light field. J. Opt. Soc. Am. B 12, 1761–1768 (1995)
  23. Meyn, S.P., Tweedie, R.L.: Generalized resolvents and Harris recurrence of Markov processes. Contemp. Math. 149, 227–250 (1993)
  24. Montroll, E.W., Weiss, G.H.: Random walks on lattices, II. J. Math. Phys. 6, 167–181 (1965)
  25. Morsch, O.: Dynamics of Bose–Einstein condensates in optical lattices. Rev. Mod. Phys. 78, 179–215 (2006)
  26. Nelson, E.: Dynamical Theories of Brownian Motion. Princeton University Press, Princeton (1967)
  27. Neveu, J.: Potentiel Markovien récurrent des chaînes de Harris. Ann. Inst. Fourier 22, 7–130 (1972)
  28. Nummelin, E.: A splitting technique for Harris recurrent Markov chains. Z. Wahrscheinlichkeitstheorie Verw. Geb. 43, 309–318 (1978)
  29. Pollard, D.: Convergence of Stochastic Processes. Springer, New York (1984)
  30. Spohn, H.: Large Scale Dynamics of Interacting Particles. Springer, Berlin (1991)
  31. Szász, D., Tóth, B.: Towards a unified dynamical theory of the Brownian particle in an ideal gas. Commun. Math. Phys. 111, 41–62 (1987)
  32. Touati, A.: Théorèmes limites pour les processus de Markov récurrents. C. R. Acad. Sci. Paris Sér. I Math. 305(19), 841–844 (1987)
  33. Uhlenbeck, G.E., Ornstein, L.S.: On the theory of Brownian motion. Phys. Rev. 36, 823–841 (1930)

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Department of Mathematics, Michigan State University, East Lansing, USA
  2. Department of Mathematics, University of Helsinki, Helsinki, Finland
