1 Introduction

The Gaussian orthogonal, unitary and symplectic ensembles (GOE, GUE, GSE) are among the most studied random matrix models. These are symmetric (resp. Hermitian or symplectic) matrices with i.i.d. standard real (resp. complex or quaternion) normal entries modulo the appropriate symmetry. It has been known since the classical results of Gaudin and Mehta [20] that if we appropriately scale the spectrum in the bulk (e.g. near zero) then we obtain a limiting point process. The point process can be described via its \(n\)-point correlation functions. These are given by determinantal formulas in the GUE case and Pfaffian formulas in the GOE and GSE cases. (See [2, 13, 20] for more details and the precise description.)

The GOE, GUE, GSE models can be naturally included in a one-parameter family of distributions. The joint eigenvalue distribution for these classical models is known to be

$$\begin{aligned} p(\lambda _1,\dots ,\lambda _n)=\frac{1}{Z_{n,\beta }} \prod _{1\le i<j\le n} |\lambda _i-\lambda _j|^\beta e^{-\frac{\beta }{4} \sum \limits _{i=1}^n \lambda _i^2} \end{aligned}$$
(1)

where \(\beta \) is equal to 1, 2 and 4 in the three cases. Note that the constant \(\beta /4\) in the exponential can easily be changed via linear scaling. It is natural to consider the density (1) for general \(\beta >0\); this is the Gaussian (or Hermitian) \(\beta \)-ensemble. In [23] the authors show the existence of the bulk scaling limit for general \(\beta \). In particular, if \(\Lambda _{n,\beta }\) is distributed according to (1) then \(2 \sqrt{n} \Lambda _{n,\beta }\) converges to a random point process, denoted by \({\hbox {Sine}}_\beta \). For \(\beta =1,2, 4\) this gives the bulk limit process of the GOE, GUE, GSE ensembles.

The \({\hbox {Sine}}_\beta \) process can be described through its counting function using a system of stochastic differential equations. Consider the system

$$\begin{aligned} d \alpha _\lambda =\lambda \frac{\beta }{4}e^{-\tfrac{\beta }{4} t} dt\!+\!\hbox {Re}\left[ (e^{-i \alpha _\lambda }-1)(dB_1+i dB_2) \right] , \quad \alpha _\lambda (0)=0, \quad t\in [0,\infty )\quad \end{aligned}$$
(2)

where \(B_1, B_2\) are independent standard Brownian motions. Note that this is a one-parameter family of SDEs driven by the same complex Brownian motion. In [23] it was shown that \(N_\beta (\lambda )=\lim \nolimits _{t\rightarrow \infty } \frac{1}{2\pi } \alpha _\lambda (t)\) exists almost surely and is an integer-valued, monotone increasing function of \(\lambda \). Moreover, the function \(\lambda \rightarrow N_\beta (\lambda )\) has the same distribution as the counting function of the \({\hbox {Sine}}_\beta \) process, i.e. the distribution of the number of points in \([0,\lambda ]\) for \(\lambda >0\) is given by that of \(N_\beta (\lambda )\).

Note that for any fixed \(\lambda \) the process \(\alpha _\lambda \) satisfies the SDE

$$\begin{aligned} d \alpha _\lambda =\lambda \frac{\beta }{4}e^{-\tfrac{\beta }{4} t} dt+2 \sin (\alpha _\lambda /2) dB_t, \qquad \alpha _\lambda (0)=0, \quad t\in [0,\infty ) \end{aligned}$$
(3)

where \(B_t=B_t^{(\lambda )}=\int _0^t \hbox {Re}\big [- i e^{-\tfrac{1}{2}{i \alpha _\lambda (s)}}(dB_1+i dB_2)\big ]\) is a standard Brownian motion which depends on \(\lambda \). Thus, if we are interested in the number of points in a given interval \([0,\lambda ]\) then it is enough to study the SDE (3) instead of the system (2).

Using the SDE characterization of the \({\hbox {Sine}}_\beta \) process one can show that it is translation invariant with density \((2\pi )^{-1}\) (see [23]). In particular, in a large interval \([0,\lambda ]\) one expects roughly \((2\pi )^{-1} \lambda \) points. In [19] the authors refined this by showing that \(N_\beta (\lambda )\) satisfies a central limit theorem: it is asymptotically normal with mean \(\frac{\lambda }{2\pi }\) and variance \(\frac{2}{\beta \pi ^2} \log \lambda \).

The main goal of the current paper is to characterize the large deviation behavior of \(N_\beta (\lambda )\). We will find the asymptotic probability of seeing an average density different from \((2\pi )^{-1}\) on a large interval. Our main theorem will show that \(\lambda ^{-1} N_\beta (\lambda )\) satisfies a large deviation principle with a good rate function.

Before stating the exact form of the theorem we need to introduce some notation. We will use

$$\begin{aligned} K(a)= \int _0^{\pi /2} \frac{dx}{\sqrt{1-a \sin ^2 x}}, \quad E(a)= \int _0^{\pi /2} \sqrt{1-a\sin ^2 x}dx, \end{aligned}$$
(4)

for the complete elliptic integrals of the first and second kind, respectively. Note that there are several conventions for these functions; we use the one in [1]. We also introduce the following function for \(a<1\):

$$\begin{aligned} {\mathcal {H}}(a)&= (1-a)K(a)-E(a). \end{aligned}$$
(5)
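Numerically, these definitions are easy to sanity-check. The following stdlib-only Python sketch (ours, not part of the paper) evaluates \(K\), \(E\) and \({\mathcal {H}}\) in the parametrization (4)–(5) by midpoint quadrature; it checks e.g. that \({\mathcal {H}}(0)=K(0)-E(0)=0\), with \({\mathcal {H}}(1/2)<0<{\mathcal {H}}(-1)\).

```python
import math

def K(a, n=20000):
    # complete elliptic integral of the first kind in the convention of (4):
    # K(a) = int_0^{pi/2} dx / sqrt(1 - a sin^2 x), defined for a < 1
    h = (math.pi / 2) / n
    return h * sum(1.0 / math.sqrt(1.0 - a * math.sin((i + 0.5) * h) ** 2)
                   for i in range(n))

def E(a, n=20000):
    # complete elliptic integral of the second kind in the convention of (4)
    h = (math.pi / 2) / n
    return h * sum(math.sqrt(1.0 - a * math.sin((i + 0.5) * h) ** 2)
                   for i in range(n))

def H(a):
    # the function (5): H(a) = (1 - a) K(a) - E(a) for a < 1
    return (1.0 - a) * K(a) - E(a)
```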

Now we are ready to state our main theorem.

Theorem 1

Fix \(\beta >0\). The sequence of random variables \(\frac{1}{ \lambda }N_\beta (\lambda )\) satisfies a large deviation principle with scale \(\lambda ^2\) and good rate function \(\beta I_{\mathrm{Sine}} (\rho )\) with

$$\begin{aligned} I_{\mathrm{Sine}} (\rho )= \frac{1}{8} \left[ \frac{\nu }{8}+ \rho {\mathcal {H}}(\nu )\right] ,\qquad \nu =\gamma ^{(-1)}(\rho ), \end{aligned}$$
(6)

where \(\gamma ^{(-1)}\) denotes the inverse of the continuous, strictly decreasing function given by

$$\begin{aligned} \gamma (\nu )= \left\{ \begin{array}{l@{\quad }l} \frac{{\mathcal {H}}(\nu )}{8} \int \limits _{-\infty }^\nu {\mathcal {H}}^{-2}(x)dx, \quad &{} \mathrm{if }\,\, \nu <0,\\ \tfrac{1}{2\pi }, \quad &{} \mathrm{if }\,\, \nu =0,\\ \frac{{\mathcal {H}}(\nu )}{8} \int \limits _{1}^\nu {\mathcal {H}}^{-2}(x)dx, \quad &{} \mathrm{if } \,\,0< \nu < 1,\\ 0, \quad &{}\mathrm{if }\,\, \nu =1. \end{array}\right. \end{aligned}$$
(7)

Roughly speaking, this means that the probability of seeing close to \(\rho \lambda \) points in \([0,\lambda ]\) for a large \(\lambda \) is asymptotically \(e^{-\lambda ^2 \beta I_{\mathrm{Sine}}(\rho )}\). The precise statement is that if \(G\) is an open and \(F\) is a closed subset of \([0,\infty )\) then

$$\begin{aligned}&\liminf _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P(\tfrac{1}{\lambda }N_\beta (\lambda )\in G)\ge -\inf _{x\in G} \beta I_{\mathrm{Sine}}(x), \\&\limsup _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P(\tfrac{1}{\lambda }N_\beta (\lambda )\in F)\le -\inf _{x\in F} \beta I_{\mathrm{Sine}}(x). \end{aligned}$$

The function \(\gamma \) may also be defined as the solution to the equation \(4x(1-x)\gamma ''(x)= \gamma (x)\) on the intervals \((-\infty ,0]\) and \([0,1]\) with boundary conditions \(\lim \nolimits _{x\rightarrow 0^{\pm }} \gamma (x)=\tfrac{1}{2\pi }\), \(\gamma (1)=0\) and \(\lim \nolimits _{x\rightarrow -\infty } \frac{\gamma (x)}{\sqrt{|x|}}=\tfrac{1}{4}\). The rate function \(I_{\mathrm{Sine}}(\rho )\) is strictly convex and non-negative with \(I_{\mathrm{Sine}}(\tfrac{1}{2\pi })=0\) and \(I_{\mathrm{Sine}}(0)=\tfrac{1}{64}\). The function \(I_{\mathrm{Sine}}(\tfrac{1}{2\pi }+x)\) behaves like \(\tfrac{\pi ^2 x^2}{4 \log (1/|x|)}\) for small \(|x|\), and \(I_{\mathrm{Sine}}(\rho )\) grows like \(\frac{1}{2} \rho ^2 \log \rho \) as \(\rho \rightarrow \infty \). These statements will be proved in Proposition 20.
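For instance, the two distinguished values of the rate function can be read off directly from (6) and (7): for \(\rho =0\) we have \(\nu =\gamma ^{(-1)}(0)=1\) (and the term \(\rho {\mathcal {H}}(\nu )\) vanishes), while for \(\rho =\tfrac{1}{2\pi }\) we have \(\nu =\gamma ^{(-1)}(\tfrac{1}{2\pi })=0\) and \({\mathcal {H}}(0)=K(0)-E(0)=0\), so

$$\begin{aligned} I_{\mathrm{Sine}}(0)=\frac{1}{8}\cdot \frac{1}{8}=\frac{1}{64}, \qquad I_{\mathrm{Sine}}\left( \tfrac{1}{2\pi }\right) =\frac{1}{8}\left[ \frac{0}{8}+\frac{1}{2\pi }\, {\mathcal {H}}(0)\right] =0. \end{aligned}$$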

We note that the behavior of \(I_{\mathrm{Sine}}(\rho )\) near \(\rho =\tfrac{1}{2\pi }\) is formally consistent with the already mentioned central limit theorem for \(N_\beta (\lambda )\). For \(\rho =\tfrac{1}{2\pi }+x\) with a small but fixed \(|x|\) the probability of seeing close to \(\tfrac{1}{2\pi }\lambda +x \lambda \) points in \([0,\lambda ]\) is approximately \(\exp \big (\!\!-\tfrac{\beta \pi ^2 \lambda ^2 x^2}{4 \log (1/|x|)}\big )\). Now let us assume that this remains true even if \(x\) decays with \(\lambda \), even though this regime is not covered by our theorem. If we substitute \(\lambda x=\sqrt{\tfrac{2}{\beta \pi ^2} \log \lambda } \cdot y\) (with a fixed \(y\)), then this probability would be asymptotically equal to \(e^{-{y^2}/{2}}\). This is in agreement with the fact that \(N_\beta (\lambda )\) is asymptotically normal with mean \(\tfrac{1}{2\pi }\lambda \) and variance \(\tfrac{2}{\beta \pi ^2} \log \lambda \).

Before moving on, a couple of historical notes are in order. In [23] the authors also show another large deviation statement for the \({\hbox {Sine}}_\beta \) process regarding large intervals, namely that the asymptotic probability of not seeing any points in \([0,\lambda ]\) is approximately \(e^{-\tfrac{\beta }{64} \lambda ^2}\). In [24] this result was sharpened by providing the more precise asymptotics of

$$\begin{aligned} P(N_\beta (\lambda )=0)=(\kappa _\beta +o(1)) \lambda ^{\upsilon _\beta } \exp \left\{ -\tfrac{\beta }{64}\lambda ^2+\left( \tfrac{\beta }{8}-\tfrac{1}{4}\right) \lambda \right\} , \quad \hbox {as} \; \lambda \rightarrow \infty \end{aligned}$$
(8)

with \(\upsilon _\beta =\tfrac{1}{4}\left( \tfrac{\beta }{2}-\tfrac{2}{\beta }-3\right) \) and a positive constant \(\kappa _\beta \) whose value was not determined. Similar results have been proven before for the classical cases \(\beta =1, 2, 4\), see e.g. [3, 8, 22, 25]. Moreover, the value of \(\kappa _\beta \) and higher order asymptotics were also established for these specific cases in [7, 11, 18]. Further extensions in the classical cases include the exact asymptotics of \(P(N_\beta (\lambda )=n)\) for fixed \(n\) and also for \(n=o(\lambda )\). (See [22] and [13] for details.) In all of these results the main term of the asymptotic probability is \(e^{-\tfrac{\beta }{64} \lambda ^2}\). This is consistent with our result, as Theorem 1 and \( I_{\mathrm{Sine}}(0)=\frac{1}{64}\) imply

$$\begin{aligned} \lim \limits _{\varepsilon \rightarrow 0} \lim \limits _{\lambda \rightarrow \infty } \tfrac{1}{\lambda ^2} \log P(N_\beta (\lambda )\le \varepsilon \lambda )=-\tfrac{\beta }{64}. \end{aligned}$$

The large deviation rate function (6) has been predicted using non-rigorous scaling and log-gas arguments in [10] and [12]. (See Section 14.6 of [13] for an overview.) Using the same techniques [14] treats the corresponding problem for the soft edge and hard edge limit processes of \(\beta \)-ensembles.

One can also study the large deviation behavior of the empirical distribution of the \(\beta \)-ensembles on a macroscopic level. It is known that after scaling with \(\sqrt{n}\) the empirical measure of the distribution (1) converges to the Wigner semicircle law. In [4] the authors prove a large deviation principle for the scaled empirical measure; this describes the asymptotic probability of seeing a density profile different from the semicircle. One could consider our theorem a microscopic analogue of that result.

We will study the asymptotic behavior of the diffusions (2) and (3) by comparing them to similar diffusions with piecewise constant drifts. This connects us to another, related symmetric random matrix ensemble. Let \(H_{n, \sigma }\) be a random symmetric tridiagonal matrix with entries equal to 1 above and below the diagonal and i.i.d. normals with mean zero and variance \(\tfrac{\sigma ^2}{n}\) on the diagonal:

$$\begin{aligned} H_{n,\sigma }=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} \omega _1 &{} 1 &{} &{} &{}\\ 1 &{} \omega _2&{} 1 &{}&{} \\ &{} 1 &{}\ddots &{} &{}\\ &{} &{} &{}\ddots &{}1 \\ &{} &{} &{}1 &{} \omega _n \\ \end{array} \right) , \qquad \omega _i\sim N(0,\sigma ^2 n^{-1} ). \end{aligned}$$
(9)
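For concreteness, the matrix (9) can be generated with a few lines of standard-library Python (the function name below is ours, chosen for illustration):

```python
import random

def schrodinger_matrix(n, sigma, seed=0):
    # the random tridiagonal matrix (9): ones above and below the diagonal,
    # i.i.d. N(0, sigma^2 / n) entries omega_i on the diagonal
    rng = random.Random(seed)
    sd = sigma / n ** 0.5          # standard deviation of omega_i
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        H[i][i] = rng.gauss(0.0, sd)
        if i + 1 < n:
            H[i][i + 1] = H[i + 1][i] = 1.0
    return H
```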

The matrix \(H_{n,\sigma }\) can be viewed as a one-dimensional discrete random Schrödinger operator. In [19] it was shown that the bulk scaling limit of the spectrum of \(H_{n,\sigma }\) (along appropriate subsequences) is a point process with density \((2\pi )^{-1}\) denoted by \(\hbox {Sch}_\tau \). (The parameter \(\tau >0\) depends on \(\sigma \) and the point in the spectrum where we zoom in to take the limit.) The process \(\hbox {Sch}_\tau \) can be characterized via its counting function in a similar way to the \({\hbox {Sine}}_\beta \) process. Consider the following one-parameter family of SDEs:

$$\begin{aligned} d \phi _\lambda =\lambda dt+dB_0+\hbox {Re}\left[ e^{-i \phi _\lambda } (dB_1+i dB_2) \right] , \qquad \phi _\lambda (0)=0, \quad t\in [0,\infty )\quad \end{aligned}$$
(10)

where \(B_0, B_1, B_2\) are independent standard Brownian motions. Then the random set

$$\begin{aligned} \Lambda _\tau {:=} \, \{\lambda : \phi _{\lambda /\tau }(\tau )\in 2\pi {\mathbb Z}\} \end{aligned}$$

has the same distribution as \(\hbox {Sch}_\tau \). Denote the counting function of the process by \(\widetilde{N}_\tau \), i.e. for \(\lambda >0\) let \(\widetilde{N}_\tau (\lambda )=\#(\hbox {Sch}_\tau \cap [0,\lambda ])\). In [19] it was shown that \(\widetilde{N}_\tau (\lambda )\) is approximately normal with mean \(\tfrac{\lambda }{2\pi }\) and constant variance \(\tfrac{\tau }{4\pi ^2}\). In our next result we derive the large deviation behavior of \(\widetilde{N}_\tau (\lambda )\); this is the analogue of Theorem 1 for the \(\hbox {Sch}_\tau \) processes.

Theorem 2

Fix \(\tau >0\). The sequence of random variables \(\frac{1}{ \lambda }\widetilde{N}_\tau (\lambda )\) satisfies a large deviation principle with scale \(\lambda ^2\) and rate function \(\frac{1}{\tau } I_\mathrm{Sch}(\cdot )\) where \(I_\mathrm{Sch}(\rho )={\mathcal {I}}(2\pi \rho )\) and for \(q>0\)

$$\begin{aligned} {\mathcal {I}}(q)&=\frac{2-a}{8}-\frac{E(a)}{4 K(a)}, \quad \mathrm{with }\quad a=a(q)=K^{-1}(\pi /(2q)) \end{aligned}$$
(11)

and \({\mathcal {I}}(0)=1/8\).
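Since \(K\) is strictly increasing in \(a\), the value \(a(q)=K^{-1}(\pi /(2q))\) in (11) can be found by bisection. The following stdlib-only Python sketch (ours, not from the paper) evaluates \({\mathcal {I}}\) this way; in particular it recovers \({\mathcal {I}}(1)=0\), corresponding to the minimum of \(I_{\mathrm{Sch}}\) at \(\rho =\tfrac{1}{2\pi }\).

```python
import math

def K(a, n=4000):
    # complete elliptic integral of the first kind, convention of (4)
    h = (math.pi / 2) / n
    return h * sum(1.0 / math.sqrt(1.0 - a * math.sin((i + 0.5) * h) ** 2)
                   for i in range(n))

def E(a, n=4000):
    # complete elliptic integral of the second kind, convention of (4)
    h = (math.pi / 2) / n
    return h * sum(math.sqrt(1.0 - a * math.sin((i + 0.5) * h) ** 2)
                   for i in range(n))

def rate_I(q):
    # the function (11): a(q) solves K(a) = pi/(2q), then
    # I(q) = (2 - a)/8 - E(a)/(4 K(a)); K is increasing in a, so bisect
    target = math.pi / (2.0 * q)
    lo, hi = -1e6, 1.0 - 1e-9
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if K(mid) < target:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    return (2.0 - a) / 8.0 - E(a) / (4.0 * K(a))
```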

The rate function \( I_{\hbox {Sch}}(\rho )\) is strictly convex and locally quadratic at the absolute minimum point \(\rho =\tfrac{1}{2\pi }\). (See Proposition 16.) The local behavior of \( I_{\hbox {Sch}}(\rho )\) at \(\rho =\tfrac{1}{2\pi }\) is formally consistent with the fact that \(\widetilde{N}_\tau (\lambda )-\tfrac{\lambda }{2\pi }\) is close to a normal random variable with constant variance \(\tfrac{\tau }{4\pi ^2}\).

The proofs of Theorems 1 and 2 will rely on path level large deviation principles on the corresponding stochastic differential equations. These in turn will follow by analyzing the hitting time of \(2\pi \) for the diffusion

$$\begin{aligned} d\widetilde{\alpha }_\lambda =\lambda dt+2 \sin (\widetilde{\alpha }_\lambda /2) dB, \qquad \widetilde{\alpha }_\lambda (0)=0, \quad t\in [0,\infty ) \end{aligned}$$
(12)

for a fixed large \(\lambda \). Note that for a fixed \(\lambda \) the process \(\widetilde{\alpha }_\lambda (t)\) is equal in distribution to \(\phi _\lambda (t)-\phi _0(t)\) from (10).

In the next section we summarize some of the important properties of the SDEs we work with, and state the needed path level large deviation results. In Sect. 3 we study the diffusion \(\widetilde{\alpha }_\lambda \) from (12) using the Cameron–Martin–Girsanov change of measure technique. In Sect. 4 we derive a path level large deviation principle for this diffusion. Using this result and comparing \(\alpha _\lambda \) from (3) to a similar diffusion with piecewise constant drift allows us to derive a path level large deviation principle for \(\alpha _\lambda \); this is done in Sect. 5. In Sect. 6 we analyze the rate functions for the path level large deviations and in Sect. 7 we complete the proofs of Theorems 1 and 2. In the Appendix we discuss various properties and asymptotics of the special functions used.

2 Properties of the diffusions corresponding to \({\hbox {Sine}}_\beta \) and \(\hbox {Sch}_\tau \)

Our starting point is the observation that, for fixed \(\lambda >0\), once the diffusion \(\widetilde{\alpha }_\lambda \) (defined in (12)) hits \(2n \pi \) for some \(n\in {\mathbb Z}\), it stays above it. This can be seen from the fact that when \(\widetilde{\alpha }_\lambda \) hits \(2n \pi \) the noise term vanishes, while the drift term is always positive. Introduce the notations

$$\begin{aligned} \lfloor y \rfloor _{2\pi }= \max \{ 2\pi k: 2\pi k \le y\}, \qquad \lceil y \rceil _{2\pi }= \min \{ 2\pi k: 2\pi k \ge y\}. \end{aligned}$$
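In code, the \(2\pi \)-floor and \(2\pi \)-ceiling above are one-liners (a trivial Python sketch of the notation, ours):

```python
import math

TWO_PI = 2.0 * math.pi

def floor_2pi(y):
    # largest integer multiple of 2*pi that is <= y
    return TWO_PI * math.floor(y / TWO_PI)

def ceil_2pi(y):
    # smallest integer multiple of 2*pi that is >= y
    return TWO_PI * math.ceil(y / TWO_PI)
```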

From the strong Markov property we immediately get the following proposition.

Proposition 3

Fix \(\lambda >0\). Then the process \(\lfloor \widetilde{\alpha }_\lambda (t) \rfloor _{2\pi }\) is non-decreasing in \(t\). Moreover, the waiting times between the jump times of this process are i.i.d. with the same distribution as the hitting time

$$\begin{aligned} \tau _\lambda =\inf \{t: \widetilde{\alpha }_\lambda (t)\ge 2\pi \}. \end{aligned}$$
(13)

Consider the diffusions \(\widetilde{\alpha }^{(1)}_\lambda \) and \(\widetilde{\alpha }^{(2)}_\lambda \) which are strong solutions of the SDE (12), but with initial conditions \(\widetilde{\alpha }^{(1)}_\lambda (0)=c_1\le \widetilde{\alpha }^{(2)}_\lambda (0)=c_2\). Then a simple coupling argument shows that \(\widetilde{\alpha }^{(1)}_\lambda (t) \le \widetilde{\alpha }^{(2)}_\lambda (t)\) for all \(t\ge 0\). Our next proposition will build on this statement using the strong Markov property.

Proposition 4

Let \(0=t_0<t_1<t_2<\dots <t_n=T\) and fix a \(\lambda >0\). Consider the solution \(\widetilde{\alpha }_\lambda (t)\) of (12) on \([0,T]\). Then there exist independent random variables \(\xi _1, \xi _2, \dots , \xi _n\) so that

$$\begin{aligned} \lfloor \xi _i \rfloor _{2\pi } \le \lfloor \widetilde{\alpha }_\lambda (t_i)\rfloor _{2\pi }-\lfloor \widetilde{\alpha }_\lambda (t_{i-1})\rfloor _{2\pi }\le \lfloor \xi _i \rfloor _{2\pi }+2\pi , \quad 1\le i \le n, \end{aligned}$$
(14)

and \(\xi _i\) is distributed as \(\widetilde{\alpha }_\lambda (t_i-t_{i-1})\).

Fig. 1

The coupling of Proposition 4. The process \(\widetilde{\alpha }_\lambda \) is the diffusion in the middle; it is sandwiched between \(\widetilde{\alpha }^{(1)}_\lambda \) and \(\widetilde{\alpha }^{(2)}_\lambda =\widetilde{\alpha }^{(1)}_\lambda +2\pi \), which start at integer multiples of \(2\pi \) at the beginning of the coupling interval

Proof

Let \(\hat{\alpha }_i(s)\) be defined as the strong solution of (12) on \([t_{i-1},t_i]\) with initial condition \(\hat{\alpha }_i(t_{i-1})=0\) and let \(\xi _i= \hat{\alpha }_i(t_i)\). Clearly, \(\xi _i, 1\le i \le n\) are independent random variables and \(\xi _i\mathop {=}\limits ^{\scriptscriptstyle d}\widetilde{\alpha }_\lambda (t_i-t_{i-1})\); it remains to show that (14) holds. Fix an integer \(1\le i \le n\) and for \(s\in [t_{i-1},t_i]\) define

$$\begin{aligned}&\widetilde{\alpha }^{(1)}_\lambda (s)=\hat{\alpha }_i(s)+\lfloor \widetilde{\alpha }_\lambda (t_{i-1})\rfloor _{2\pi }, \qquad \widetilde{\alpha }^{(2)}_\lambda (s)=\hat{\alpha }_i(s)+\lfloor \widetilde{\alpha }_\lambda (t_{i-1})\rfloor _{2\pi }+2\pi . \end{aligned}$$

Then \(\widetilde{\alpha }_\lambda , \widetilde{\alpha }^{(1)}_\lambda , \widetilde{\alpha }^{(2)}_\lambda \) are all strong solutions of (12) on \( [t_{i-1},t_i]\) with initial conditions

$$\begin{aligned} \widetilde{\alpha }^{(1)}_\lambda (t_{i-1})\le \widetilde{\alpha }_\lambda (t_{i-1})\le \widetilde{\alpha }^{(2)}_\lambda (t_{i-1})= \widetilde{\alpha }^{(1)}_\lambda (t_{i-1})+2\pi . \end{aligned}$$

The ordering is preserved by the coupling, so we have

$$\begin{aligned} \widetilde{\alpha }^{(1)}_\lambda (t_{i})\le \widetilde{\alpha }_\lambda (t_{i})\le \widetilde{\alpha }^{(2)}_\lambda (t_{i})= \widetilde{\alpha }^{(1)}_\lambda (t_{i})+2\pi . \end{aligned}$$

(See Fig. 1 for an illustration.) From this we get

$$\begin{aligned}&\lfloor \widetilde{\alpha }_\lambda (t_i)\rfloor _{2\pi }-\lfloor \widetilde{\alpha }_\lambda (t_{i-1})\rfloor _{2\pi }= \lfloor \widetilde{\alpha }_\lambda (t_i)\rfloor _{2\pi }-\lfloor \widetilde{\alpha }^{(1)}_\lambda (t_{i-1})\rfloor _{2\pi }\\&\qquad \ge \lfloor \widetilde{\alpha }^{(1)}_\lambda (t_i)\rfloor _{2\pi }-\lfloor \widetilde{\alpha }^{(1)}_\lambda (t_{i-1})\rfloor _{2\pi }= \lfloor \xi _i \rfloor _{2\pi }, \end{aligned}$$

and

$$\begin{aligned}&\lfloor \widetilde{\alpha }_\lambda (t_i)\rfloor _{2\pi }-\lfloor \widetilde{\alpha }_\lambda (t_{i-1})\rfloor _{2\pi }= \lfloor \widetilde{\alpha }_\lambda (t_i)\rfloor _{2\pi }-\lfloor \widetilde{\alpha }^{(1)}_\lambda (t_{i-1})\rfloor _{2\pi }\\&\qquad \le \lfloor \widetilde{\alpha }^{(2)}_\lambda (t_i)\rfloor _{2\pi }-\lfloor \widetilde{\alpha }^{(1)}_\lambda (t_{i-1})\rfloor _{2\pi }= \lfloor \xi _i \rfloor _{2\pi }+2\pi . \end{aligned}$$

\(\square \)

We will also need another type of coupling for a slightly more general family of diffusions. Consider the SDE

$$\begin{aligned} d\xi _{f,c}=f dt+\hbox {Re}((e^{-i \xi _{f,c}}-1)(dB_1+i dB_2)), \quad \xi _{f,c}(0)=c, \quad t\in [0,\infty ) \end{aligned}$$
(15)

where \(f\) is an integrable non-negative function. Note that for fixed \(f, c\) this process has the same distribution as

$$\begin{aligned} d\widetilde{\xi }_{f,c}=f dt+2 \sin (\widetilde{\xi }_{f,c}/2) dB, \quad \widetilde{\xi }_{f,c}(0)=c, \quad t\in [0,\infty ) \end{aligned}$$
(16)

The following properties of \(\xi _{f,c}\) follow from the basic theory of diffusions and standard coupling arguments.

Proposition 5

  (i)

    Let \(\tau _{2\pi n}\) be the hitting time of \(2\pi n\), where \(2\pi n>c\) and \(n\) is an integer. Then for any \(t>\tau _{2\pi n}\) we have \(\xi _{f,c}(t)\ge 2\pi n\). In particular, if \(c\ge 0\) then \(\xi _{f,c}(t)\) stays non-negative for all \(t>0\).

  (ii)

    If \(f\ge g\) and \(\xi _{f,a}\) and \(\xi _{g,b}\) are driven by the same Brownian motions then \(\xi _{f,a}-\xi _{g,b}\) has the same distribution as \(\xi _{f-g,a-b}\). If \(a\ge b\) then \(\xi _{f,a}-\xi _{g,b}\) stays a.s. non-negative for all \(t\).

  (iii)

    For any finite \(T\) we have the following exponential tail bound

    $$\begin{aligned} P(\xi _{f,0}(T)\ge ka)\le 2 \left( \frac{\int _0^T f(t) dt}{2\pi a}\right) ^k, \qquad k\in {\mathbb N}. \end{aligned}$$
    (17)

    If \(\int _0^\infty f(t) dt<\infty \) then \(\xi _{f,c}(\infty )=\lim \nolimits _{t\rightarrow \infty } \xi _{f,c}(t)\) exists a.s. and the previous bound holds for \(T=\infty \) as well.

Sketch of the proof

The first statement follows from the strong Markov property and the fact that in (16) the noise term vanishes if \(\widetilde{\xi }_{f,c}\in 2\pi {\mathbb Z}\), but the drift is always non-negative. The first part of (ii) follows by considering the difference of the SDEs for \(\xi _{f,a}\), \(\xi _{g,b}\) and noting that \((e^{-i \xi _{f,a}}-e^{-i \xi _{g,b}})(dB_1+idB_2)\) has the same distribution as \((e^{-i (\xi _{f,a}-\xi _{g,b})}-1)(dB_1+idB_2)\). The second part of (ii) follows from the first statement.

Finally, (17) follows from the Markov inequality and the strong Markov property. The existence of the limit is proved in Proposition 9 of [23]. (See that proposition for more details on the proof.) \(\square \)

Our main theorems will follow from the following path level large deviations.

Theorem 6

Fix \(\beta >0\) and let \(\alpha _\lambda (t)\) be the process defined in (2) or (3). Then the sequence of rescaled processes \((\tfrac{\alpha _\lambda (t)}{\lambda }, t\in [0,\infty ))\) satisfies a large deviation principle on \(C[0,\infty )\) with the uniform topology with scale \(\lambda ^2\) and good rate function \({\mathcal {J}}_{\mathrm{Sine}_\beta }\). The rate function \({\mathcal {J}}_{\mathrm{Sine}_\beta }\) is defined as

$$\begin{aligned} {\mathcal {J}}_{\mathrm{Sine}_\beta }(g) = \int _0^\infty {\mathfrak {f}}^2(t) {\mathcal {I}}\left( g'(t)/{\mathfrak {f}}(t)\right) dt, \quad \mathrm{with} \quad {\mathfrak {f}}(t)={\mathfrak {f}}_\beta (t)=\tfrac{\beta }{4}e^{-\tfrac{\beta }{4}t} \end{aligned}$$

in the case where \(g(0)=0\) and \(g\) is absolutely continuous with non-negative derivative \(g'\). In all other cases \({\mathcal {J}}_{\mathrm{Sine}_\beta }(g)\) is defined as \(\infty \).

Theorem 7

Fix \(T>0\) and let \(\widetilde{\alpha }_\lambda (t)\) be the process defined in (12). Then the sequence of rescaled processes \((\tfrac{\widetilde{\alpha }_\lambda (t)}{\lambda }, t\in [0,T])\) satisfies a large deviation principle on \(C[0,T]\) with the uniform topology with scale \(\lambda ^2\) and good rate function \({\mathcal {J}}_{\mathrm{Sch},T}\). The rate function is defined as

$$\begin{aligned} {\mathcal {J}}_{\mathrm{Sch},T}(g)=\int _0^T {\mathcal {I}}\left( g'(t)\right) dt \end{aligned}$$

in the case where \(g(0)=0\) and \(g\) is absolutely continuous with non-negative derivative \(g'\), and \({\mathcal {J}}_{\mathrm{Sch},T}(g)=\infty \) in all other cases.

In order to prove Theorem 7 we observe that \(\tfrac{\widetilde{\alpha }_\lambda (t)}{\lambda }\) is close to \(\tfrac{\lfloor \widetilde{\alpha }_\lambda (t)\rfloor _{2\pi }}{\lambda }\) for large \(\lambda \), and by Proposition 3 we only need to analyze the hitting time \(\tau _\lambda \) to understand the evolution of \(\lfloor \widetilde{\alpha }_\lambda (t)\rfloor _{2\pi }\). The proof of Theorem 6 will follow along similar lines after approximating the drift in (2) with a piecewise constant function.

3 Analysis of the hitting time \(\tau _\lambda \)

The following proposition summarizes our bounds on the relevant hitting times.

Proposition 8

Let \(\tau _{\lambda }=\inf \{t: \widetilde{\alpha }_\lambda (t)\ge 2\pi \}\) where \(\widetilde{\alpha }_\lambda \) is the solution of (12) and fix \(a<1\). Then we have

$$\begin{aligned} E e^{ \frac{\lambda ^2 a}{8} \tau _{\lambda }- \frac{\lambda (|a|\wedge \sqrt{|a|})}{4} \tau _{\lambda }} \le e^{-\lambda {\mathcal {H}}(a)}. \end{aligned}$$
(18)

Let \(t_a=4 K(a)\) and fix \(0<\varepsilon <|t_a-2\pi |\). Then we have

$$\begin{aligned} P(\lambda \tau _{\lambda } \in [t_a- \varepsilon ,t_a+\varepsilon ] )&\ge A(\varepsilon ,\lambda ,a) e^{ -\lambda ({\mathcal {H}}(a)+\tfrac{a t_a}{8}) - \lambda \frac{|a|\varepsilon }{8} -\lambda \frac{|a|}{2}(t_a+\varepsilon )} \end{aligned}$$
(19)

where \(\lim \nolimits _{\lambda \rightarrow \infty } A(\varepsilon , \lambda , a)=1\) for fixed \(a, \varepsilon \).

Our first step is a change of variables in (12). We introduce \(X_\lambda (t)=\log (\tan (\widetilde{\alpha }_\lambda (t)/4))\); by Itô's formula it satisfies the SDE

$$\begin{aligned} dX_\lambda = \frac{\lambda }{2} \cosh X_\lambda \ dt+ \frac{1}{2}\tanh X_\lambda \ dt + dB_t, \qquad X_\lambda (0)=-\infty . \end{aligned}$$
(20)
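The Itô computation behind (20) rests on the identities \(\sin (\widetilde{\alpha }/2)=1/\cosh X\) and \(\cos (\widetilde{\alpha }/2)=-\tanh X\) for \(X=\log (\tan (\widetilde{\alpha }/4))\); together with \(\frac{dX}{d\widetilde{\alpha }}=\frac{1}{2\sin (\widetilde{\alpha }/2)}\) these turn the diffusion coefficient of (12) into \(1\) and produce the drift in (20). A quick numerical check of the identities (ours, stdlib Python):

```python
import math

def identity_gap(alpha):
    # X = log(tan(alpha/4)) as above; valid for alpha in (0, 2*pi)
    X = math.log(math.tan(alpha / 4.0))
    g1 = abs(math.sin(alpha / 2.0) - 1.0 / math.cosh(X))  # sin(a/2) = sech X
    g2 = abs(math.cos(alpha / 2.0) + math.tanh(X))        # cos(a/2) = -tanh X
    return max(g1, g2)
```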

The distribution of the hitting time of \(2\pi \) for \(\widetilde{\alpha }_\lambda (t)\) is the same as that of the hitting time of \(\infty \) for \(X_\lambda \). With a slight abuse of notation, from now on we use the notation \(\tau _\lambda \) for the blow-up time of \(X_\lambda (t)\), i.e. \(\tau _\lambda =\sup \{t: X_\lambda (t)<\infty \}\). In order to study \(\tau _\lambda \) we will introduce a similar diffusion with a modified drift. Let \(a<1\) and consider

$$\begin{aligned} d Y_{\lambda ,a}= \frac{\lambda }{2}\sqrt{ \cosh ^2 Y_{\lambda ,a} -a}\ dt+ \frac{1}{2} \tanh Y_{\lambda ,a} dt+dB_t, \qquad Y_{\lambda ,a}(0)=-\infty . \end{aligned}$$
(21)

To prove Proposition 8 we will choose an appropriate \(a\) and compare \(X_\lambda \) with the diffusion \(Y_{\lambda ,a}\) using the Cameron–Martin–Girsanov formula. Introduce the following notations for the drifts:

$$\begin{aligned} f_\lambda (x)=\frac{\lambda }{2} \cosh x+ \frac{1}{2}\tanh x, \qquad h_{\lambda ,a}(y)= \frac{\lambda }{2}\sqrt{ \cosh ^2 y- a}\ + \frac{1}{2} \tanh y. \end{aligned}$$

Note that we have the uniform bound

$$\begin{aligned} \left| f_\lambda (x)-h_{\lambda , a}(x) \right| \le \tfrac{1}{2} \lambda |a|. \end{aligned}$$
(22)

The following proposition will be our main tool for the estimates.

Proposition 9

Fix \(a<1\) and consider \(X=X_\lambda \) and \(Y=Y_{\lambda ,a}\). Denote by \(\tau _\lambda \) and \(\tau _{Y,\lambda }\) the blowup times of \(X\) and \(Y\) to \(\infty \). Then for any \(s>0\), we have

$$\begin{aligned} P(\lambda \tau _\lambda >s)=E\left[ \mathbf{1}(\lambda \tau _{Y,\lambda }>s) e^{-G_{s/\lambda }(Y)}\right] , \end{aligned}$$
(23)

and

$$\begin{aligned} 1=E e^{-G_{ \tau \wedge (s/\lambda ) }(Y)}=E e^{G_{ \tau \wedge (s/\lambda )}(X)}, \end{aligned}$$
(24)

where

$$\begin{aligned} G_s(X)= \int _0^s \left( h_{\lambda ,a}(X(t))-f_\lambda (X(t))\right) dX- \frac{1}{2}\int _0^s \left( h_{\lambda ,a}^2(X)-f_\lambda ^2(X)\right) dt. \end{aligned}$$

Proof

This is just the Cameron–Martin–Girsanov formula for diffusions with explosion. Note that, because of (22), the process \(e^{G_{\tau \wedge s}(X)}\) satisfies the Novikov criterion and is a positive martingale. From this the usual steps of the proof can be completed (see e.g. [16, 17]). \(\square \)

Proof of Proposition 8

We first estimate the Girsanov exponent

$$\begin{aligned} G_s(X)&= \frac{\lambda }{2} \int _0^s (\sqrt{ \cosh ^2 X-a}- \cosh X) d X\\&- \frac{1}{2}\int _0^s\left( -\frac{\lambda ^2}{4} a+ \frac{\lambda }{2}( \sqrt{ \cosh ^2 X-a}- \cosh X)\tanh X \right) \ dt. \end{aligned}$$

Applying Itô’s formula for \(\theta (X)=h_{\lambda ,a}(X)-f_\lambda (X)\) we have that \( \int _0^s \theta (X)dX= \int _{X_0}^{X_s} \theta (x)dx- \frac{1}{2} \int _0^s \theta '(X)dt. \) This gives us

$$\begin{aligned} G_s(X)&= \frac{\lambda ^2 a}{8} s+ \frac{\lambda }{2} \int _{-\infty }^{X_s} \left( \sqrt{ \cosh ^2 x-a}- \cosh x\right) dx\\&+\frac{ \lambda }{4} \int _0^s \frac{a \tanh X}{\sqrt{\cosh ^2 X-a}} \cdot \frac{\sqrt{\cosh ^2 X-a}-\cosh X}{\sqrt{\cosh ^2 X-a}+\cosh X}\,dt. \end{aligned}$$

Note that

$$\begin{aligned} \frac{1}{2} \int _\mathbb {R}\left( \sqrt{ \cosh ^2 x-a}- \cosh x\right) dx&= - \int _0^{\pi /2} \frac{a}{1+ \sqrt{1-a \sin ^2 y}}dy\\&= (1-a)K(a)-E(a)={\mathcal {H}}(a), \end{aligned}$$

where this last equality can be seen by differentiating both sides with respect to \(a\) and checking equality at \(a=0\). It is not hard to check that

$$\begin{aligned} \left| \tfrac{a \tanh x}{\sqrt{\cosh ^2 x-a}} \cdot \tfrac{\sqrt{\cosh ^2 x-a}-\cosh x}{\sqrt{\cosh ^2 x-a}+\cosh x}\right| \le \left| \tfrac{a \tanh x}{\sqrt{\cosh ^2 x-a}}\right| \le |a|\wedge \sqrt{|a|}, \quad \hbox {for} \; a<1 \end{aligned}$$

uniformly in \(x\). The upper bound \(|a|\) follows from \(\sqrt{\cosh ^2 x-a}\ge |\sinh x|\), while the bound \(\sqrt{|a|}\) requires the optimization of the function \(\frac{|a| \sqrt{y-1}}{\sqrt{y}\sqrt{y-a}}\) for \(y\ge 1\). This gives the bound

$$\begin{aligned} \left| G_{\tau _\lambda }(X)-\frac{\lambda ^2 a \tau }{8}-\lambda {\mathcal {H}}(a)\right| \le \frac{\lambda \tau (|a|\wedge \sqrt{|a|})}{4}. \end{aligned}$$
(25)
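The integral identity for \({\mathcal {H}}(a)\) used above can be sanity-checked numerically. The sketch below (ours, stdlib Python) compares \(\frac{1}{2}\int _{\mathbb {R}}(\sqrt{\cosh ^2x-a}-\cosh x)dx\), rewritten with the integrand \(-a/(\sqrt{\cosh ^2x-a}+\cosh x)\) to avoid cancellation, against \((1-a)K(a)-E(a)\).

```python
import math

def lhs(a, L=40.0, n=80000):
    # (1/2) * int_R (sqrt(cosh^2 x - a) - cosh x) dx, truncated to [-L, L];
    # the integrand equals -a / (sqrt(cosh^2 x - a) + cosh x), which decays
    # like e^{-|x|}, so the truncation error is negligible
    h = 2.0 * L / n
    s = 0.0
    for i in range(n):
        c = math.cosh(-L + (i + 0.5) * h)
        s += -a / (math.sqrt(c * c - a) + c)
    return 0.5 * s * h

def rhs(a, n=20000):
    # (1 - a) K(a) - E(a) = H(a), in the convention of (4)-(5)
    h = (math.pi / 2) / n
    K = h * sum(1.0 / math.sqrt(1.0 - a * math.sin((i + 0.5) * h) ** 2)
                for i in range(n))
    E = h * sum(math.sqrt(1.0 - a * math.sin((i + 0.5) * h) ** 2)
                for i in range(n))
    return (1.0 - a) * K - E
```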

To get the exponential moment bound (18) we use \(1=E e^{G_{ \tau \wedge s/\lambda }(X)}\) from (24). We let \(s\rightarrow \infty \), use Fatou’s lemma and (25) to get

$$\begin{aligned} 1\ge E e^{G_\tau (X)}\ge E e^{ \frac{\lambda ^2 a}{8} \tau +\lambda {\mathcal {H}}(a)-\frac{\lambda (|a|\wedge \sqrt{|a|} )\tau }{4 }}. \end{aligned}$$
(26)

Rearranging the terms we get (18).

To prove the lower bound (19) we write

$$\begin{aligned}&P(\lambda \tau _\lambda \in (t_a-\varepsilon , t_a+\varepsilon ))=P(\lambda \tau _{\lambda }>t_a-\varepsilon )- P(\lambda \tau _{\lambda }>t_a+\varepsilon )\nonumber \\&\quad =E\left[ \mathbf{1}(\lambda \tau _{Y,\lambda }>t_a-\varepsilon ) e^{-G_{\tau \wedge (t_a-\varepsilon )/\lambda }(Y)}\right] -E\left[ \mathbf{1}(\lambda \tau _{Y,\lambda }>t_a+\varepsilon ) e^{-G_{\tau \wedge (t_a+\varepsilon )/\lambda }(Y)}\right] \nonumber \\&\quad =E\left[ \mathbf{1}(\lambda \tau _{Y,\lambda }\in (t_a-\varepsilon ,t_a+\varepsilon )) e^{-G_{\tau \wedge (t_a+\varepsilon )/\lambda }(Y)}\right] \nonumber \\&\quad =E\left[ \mathbf{1}(\lambda \tau _{Y,\lambda }\in (t_a-\varepsilon ,t_a+\varepsilon )) e^{-G_{\tau }(Y)}\right] , \end{aligned}$$
(27)

where in the third line we used the fact that \(e^{-G_{\tau \wedge t }(Y)}\) is a martingale. Because of (25) we have

$$\begin{aligned} G_\tau (Y)\le \frac{\lambda ^2 a \tau }{8} + \lambda {\mathcal {H}}(a)+\frac{\lambda |a| \tau }{4 }, \end{aligned}$$
(28)

and we can bound the last expectation as

$$\begin{aligned}&E\left[ \mathbf{1}(\lambda \tau _{Y,\lambda }\in (t_a-\varepsilon ,t_a+\varepsilon )) e^{-G_{\tau }(Y)}\right] \\&\qquad \ge E\left[ \mathbf{1}(\lambda \tau _{Y,\lambda }\in (t_a-\varepsilon ,t_a+\varepsilon )) e^{- \frac{\lambda ^2 a \tau }{8} - \lambda {\mathcal {H}}(a)-\frac{\lambda {|a|} \tau }{4 }}\right] \\&\qquad \ge P(\lambda \tau _{Y,\lambda }\in (t_a-\varepsilon ,t_a+\varepsilon )) e^{ -\frac{\lambda a (t_a\pm \varepsilon )}{8} - \lambda {\mathcal {H}}(a)-\frac{\lambda {|a|} (t_a+\varepsilon ) }{4}}, \end{aligned}$$

where we choose the sign of \(\varepsilon \) in \(t_a\pm \varepsilon \) the same way as the sign of \(a\).

If we can show that \(\lim \nolimits _{\lambda \rightarrow \infty }P(\lambda \tau _{Y,\lambda }\in (t_a-\varepsilon ,t_a+\varepsilon )) =1\) for fixed \(a\) and \(\varepsilon \), then this will complete the proof of (19). Note that \(\widetilde{Y}(t):= Y_{\lambda , a}(t/\lambda )\) satisfies the SDE

$$\begin{aligned} d\widetilde{Y}=\frac{1}{2} \sqrt{\cosh ^2 \widetilde{Y}-a}\, dt+\frac{1}{2\lambda }\tanh \widetilde{Y}\, dt+\frac{1}{\sqrt{\lambda }} dB_t, \qquad \widetilde{Y}(0)=-\infty . \end{aligned}$$

As \(\lambda \rightarrow \infty \), the strong solution of this SDE converges a.s. to the solution of the ODE

$$\begin{aligned} y'=\frac{1}{2}\sqrt{\cosh ^2 y- a}, \ \ \ y(0)=-\infty . \end{aligned}$$

This ODE can be solved and the solution satisfies \(\int _{-\infty }^{y(t)} \frac{2}{\sqrt{\cosh ^2 x-a}} dx=t\). This shows that \(y\) explodes exactly at

$$\begin{aligned} \int _{-\infty }^{\infty } \frac{2}{\sqrt{\cosh ^2 x-a}} dx=4K(a)=t_a. \end{aligned}$$

Therefore \(\lim \nolimits _{\lambda \rightarrow \infty }P(\lambda \tau _{Y,\lambda }\in (t_a-\varepsilon ,t_a+\varepsilon )) = 1\) for fixed \(a\) and \(\varepsilon \), which completes the proof of the proposition. \(\square \)
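The explosion-time identity can be checked numerically, assuming the parameter convention \(K(m)=\int _0^{\pi /2}(1-m\sin ^2\theta )^{-1/2}d\theta \) (the convention of scipy.special.ellipk); the helper name is ours.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import ellipk

def t_a(a):
    # explosion time of y' = (1/2) sqrt(cosh(y)^2 - a), y(0) = -infinity:
    # separate variables and integrate over the whole real line
    val, _ = quad(lambda x: 2.0 / np.sqrt(np.cosh(x) ** 2 - a),
                  -np.inf, np.inf)
    return val

for a in (-3.0, -1.0, 0.25, 0.8):
    assert abs(t_a(a) - 4.0 * ellipk(a)) < 1e-6
# for a = 0 the integral is elementary: t_0 = 2*pi = 4*K(0)
assert abs(t_a(0.0) - 2.0 * np.pi) < 1e-6
```

The substitutions \(t=\sinh x\) and then \(t=\tan \varphi \) reduce \(\int _0^\infty \frac{dx}{\sqrt{\cosh ^2x-a}}\) to \(\int _0^{\pi /2}(1-a\sin ^2\theta )^{-1/2}d\theta \), which is exactly \(K(a)\) in this convention.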

We can use the tail estimates of \(\tau _\lambda \) to estimate the tail probabilities of \(\widetilde{\alpha }_\lambda (t)\) for a fixed \(t\). Recall the definition of \({\mathcal {I}}(\cdot )\) from (11).

Lemma 10

There exists a constant \(c\) such that for \(\lambda >2\) we have

$$\begin{aligned} e^{-\lambda ^2 t {\mathcal {I}}(q)+\lambda c(t+1)( {\mathcal {I}}(q)+1)}\ge \left\{ \begin{array}{l@{\quad }l} \, P(\lceil \widetilde{\alpha }_\lambda (t)\rceil _{2\pi } \ge q t \lambda ) &{} \mathrm{if}\; q>1,\\ \, P(\lfloor \widetilde{\alpha }_\lambda (t)\rfloor _{2\pi } \le q t \lambda )&{} \mathrm{if}\; 0<q<1. \end{array}\right. \end{aligned}$$
(29)

Moreover, there are absolute constants \(c_0,c_1\) so that if \(qt\lambda , q\) and \(\lambda q \log q\) are all bigger than \(c_0\) then

$$\begin{aligned} P(\lceil \widetilde{\alpha }_\lambda (t)\rceil _{2\pi } \ge q t \lambda )\le e^{-c_1 \lambda ^2 t \,q^2 \log q}. \end{aligned}$$
(30)

Proof

Introduce the hitting times

$$\begin{aligned} \tau _\lambda ^{(n)}=\inf \{t>0: \widetilde{\alpha }_\lambda (t)>2n\pi \}. \end{aligned}$$
(31)

Then by Proposition 3 the random variables \(\tilde{\tau }^{(n)}_\lambda =\tau ^{(n)}_\lambda -\tau ^{(n-1)}_\lambda \) are i.i.d. with the same distribution as \(\tau _\lambda \). Applying the exponential Markov inequality we get

$$\begin{aligned} P(\lfloor \widetilde{\alpha }_\lambda (t)\rfloor _{2\pi } \le q t \lambda )=P\bigg (\sum _{i=1}^{\lceil qt \lambda /(2\pi )\rceil } \tilde{\tau }^{(i)}\ge t\bigg )\le \left( E e^{A \tau _\lambda }\right) ^{\lceil qt \lambda /(2\pi )\rceil } e^{-At}\quad \end{aligned}$$
(32)

for any \(A>0\). Suppose first that \(q<1\), in which case \(a=a(q)=K^{-1}(\pi /(2q))\in (0,1)\). By choosing

$$\begin{aligned} A=\frac{\lambda ^2 a}{8}-\frac{\lambda {|a|}}{4} \end{aligned}$$
(33)

we have \(A>0\) if \(\lambda >2\) and from (18) we have \(E e^{A\tau _\lambda }\le e^{-\lambda {\mathcal {H}}(a)}\). Together with (32) this gives

$$\begin{aligned} P(\lfloor \widetilde{\alpha }_\lambda (t)\rfloor _{2\pi }\le q t \lambda )&\le e^{-\lambda {\mathcal {H}}(a) \lceil qt \lambda /(2\pi )\rceil -(\frac{\lambda ^2 a}{8}-\frac{\lambda |a|}{4 })t}\nonumber \\&\le e^{-\frac{q t \lambda ^2}{2\pi } {\mathcal {H}}(a) -(\frac{\lambda ^2 a}{8}-\frac{\lambda |a|}{4})t+\lambda \left| {\mathcal {H}}(a)\right| }=e^{-\lambda ^2 t {\mathcal {I}}(q)+\lambda \left( \tfrac{|a| t}{4}+|{\mathcal {H}}(a)|\right) }\quad \end{aligned}$$
(34)

where we used the definitions (11) and (5).
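The exponential Markov (Chernoff) step behind (32) can be illustrated on a toy example, with i.i.d. Exp(1) variables standing in for the \(\tilde{\tau }^{(i)}\) (our choice of toy distribution, not from the paper):

```python
import numpy as np
from scipy.stats import gamma

# P(sum_{i=1}^n X_i >= t) <= (E e^{A X})^n e^{-A t} for any A > 0 with
# finite mgf; here X ~ Exp(1), so E e^{A X} = 1/(1 - A) for A < 1
n, t = 10, 20.0
A = 1.0 - n / t                    # the optimal tilt for this toy case
chernoff = (1.0 / (1.0 - A)) ** n * np.exp(-A * t)
exact = gamma(n).sf(t)             # a sum of n i.i.d. Exp(1) is Gamma(n, 1)
assert exact <= chernoff
```

In the proof the tilt \(A\) is chosen as in (33) so that the exponential moment bound (18) applies.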

For the \(q>1\) case we use the same steps. Here \(a=K^{-1}(\pi /(2q))<0\) and \(A\) defined in (33) is negative, which is exactly what we need for the exponential Markov inequality. Eventually we get

$$\begin{aligned} P(\lceil \widetilde{\alpha }_\lambda (t)\rceil _{2\pi }\ge q\lambda t)&\le e^{-\lambda {\mathcal {H}}(a) \lfloor qt \lambda /(2\pi )\rfloor -(\frac{\lambda ^2 a}{8}-\frac{\lambda |a|}{4 })t}\nonumber \\&\le e^{-\frac{q t \lambda ^2}{2\pi } {\mathcal {H}}(a) -(\frac{\lambda ^2 a}{8}-\frac{\lambda |a|}{4 })t+\lambda |{\mathcal {H}}(a)|}\nonumber \\&= e^{-\lambda ^2 t {\mathcal {I}}(q)+\lambda \left( \tfrac{|a| t}{4}+|{\mathcal {H}}(a)|\right) }. \end{aligned}$$
(35)

By Lemma 18 in the “Appendix” there is a constant \(c\) so that

$$\begin{aligned} {\mathcal {H}}(a(q))+\tfrac{1}{4} |a(q)| t\le c(t+1)( \, {\mathcal {I}}(q)+1), \end{aligned}$$
(36)

for all \(t, q>0\) which means that we can replace the upper bounds in (34) and (35) with \(e^{-\lambda ^2 t {\mathcal {I}}(q)+\lambda c(t+1)( {\mathcal {I}}(q)+1)}\). This proves the first part of Lemma 10.

For the second part we repeat the same steps as in the \(q>1\) case, but now use

$$\begin{aligned} A=\frac{\lambda ^2 a}{8}-\frac{\lambda \sqrt{|a|}}{4}. \end{aligned}$$

This gives

$$\begin{aligned} P(\lceil \widetilde{\alpha }_\lambda (t)\rceil _{2\pi }\ge q\lambda t)&\le e^{-\lambda {\mathcal {H}}(a) \lfloor qt \lambda /(2\pi )\rfloor -\frac{\lambda ^2 a}{8}t+\frac{\lambda \sqrt{|a|}}{4 }t}. \end{aligned}$$

By Proposition 17 of the “Appendix” if \(q\) is large enough then \(-a=-K^{-1}(\pi /(2q))>c q^2 \log ^2 q\) with some positive constant \(c\). If \(-a \lambda \) and \(q t \lambda \) are big enough (which can be achieved by choosing \(c_0\) big enough), we will have

$$\begin{aligned} \lfloor qt \lambda /(2\pi )\rfloor >\frac{9}{10} qt \lambda /(2\pi ), \qquad -\frac{\lambda ^2 a}{8}+\frac{\lambda \sqrt{|a|}}{4 }<-\frac{11}{10} \cdot \frac{\lambda ^2 a}{8}. \end{aligned}$$

Then

$$\begin{aligned} -\lambda {\mathcal {H}}(a) \lfloor qt \lambda /(2\pi )\rfloor -\tfrac{1}{8} {\lambda ^2 a}t+\tfrac{1}{4} \lambda \sqrt{|a|} t&< -\lambda ^2 t\left( \tfrac{9}{10}{\mathcal {H}}(a)\tfrac{q}{2\pi }+\tfrac{11}{10}\tfrac{a}{8}\right) \\&= -\lambda ^2 t \left( -\frac{7 a}{80}-\frac{9 E(a)}{40 K(a)}+\frac{9}{40} \right) \\&< -c_2 \lambda ^2 t q^2 \log ^2 q, \end{aligned}$$

with a positive constant \(c_2\), where in the last step we again used the asymptotics given in Proposition 17 together with (89). This completes the proof of (30). \(\square \)

4 The path deviation for the \({\widetilde{\alpha }}_\lambda \) process

In this section we will prove Theorem 7. In order to show the large deviation principle we need to prove that

$$\begin{aligned}&\liminf _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2}\log P\left( \frac{\widetilde{\alpha }_\lambda (\cdot )}{\lambda }\in G\right) \ge -\inf _{g\in G} {\mathcal {J}}_{\mathrm{Sch},T}(g), \quad \hbox {for any open set} \; G\subset C[0,T],\\&\limsup _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2}\log P\left( \frac{\widetilde{\alpha }_\lambda (\cdot )}{\lambda }\in K\right) \le -\inf _{g\in K} {\mathcal {J}}_{\mathrm{Sch},T}(g), \quad \hbox {for any closed set} \; K\subset C[0,T]. \end{aligned}$$

The fact that \( {\mathcal {J}}_{\mathrm{Sch},T}(g)\) is a good rate function will be proved in Proposition 14 of Sect. 6.

We will use the fact that \({\mathcal {I}}(x)\) is strictly convex on \((0,\infty )\) with its global minimum at \(x=1\), where \({\mathcal {I}}(1)=0\), and also that there is a constant \(c>0\) so that

$$\begin{aligned} c^{-1} \le \frac{{\mathcal {I}}(x)}{x^2 \log ^2 x}\le c, \quad \hbox {for all} \; x>2. \end{aligned}$$
(37)

These statements will be proved in Propositions 16 and 17 of the “Appendix”.

Proof of the large deviations upper bound in Theorem 7

We will follow the standard strategy for proving path level large deviations. Consider a closed subset \(K\) of \(C[0,T]\). We need to bound \(P(\tfrac{1}{\lambda }\widetilde{\alpha }_\lambda (\cdot )\in K)\). Define the \(\delta \)-‘fattening’ of \(K\) as

$$\begin{aligned} K^\delta {:=}\,\{f\in C[0,T]: \Vert f-g\Vert \le \delta \hbox { for some }g\in K\}. \end{aligned}$$
(38)

From now on \(\Vert \cdot \Vert \) denotes the sup-norm on the appropriate interval.

Let \(\pi _N\) be the following projection of \(C[0,T]\) to piecewise linear paths:

$$\begin{aligned} (\pi _N f)(i T/N)=\lfloor f(iT /N)\rfloor _{2\pi }, \qquad 0\le i\le N \end{aligned}$$
(39)

and \(\pi _N f\) is defined linearly between these points. Then

$$\begin{aligned} P( \tilde{\alpha }_\lambda / \lambda \in K)&\le P( \Vert \tilde{\alpha }_\lambda - \pi _N \tilde{\alpha }_\lambda \Vert \ge \delta \lambda )+ P\big ( \tfrac{1}{\lambda } \pi _N ( \tilde{\alpha }_\lambda ) \in K^\delta \big ). \end{aligned}$$
(40)

We will bound the two probabilities in (40) separately.
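Before doing so, we note that the projection \(\pi _N\) from (39) can be sketched in a few lines (the function names are ours):

```python
import numpy as np

def floor_2pi(x):
    # largest integer multiple of 2*pi that is <= x
    return 2.0 * np.pi * np.floor(x / (2.0 * np.pi))

def pi_N(f, T, N):
    # match floor_2pi(f) at the grid points iT/N, 0 <= i <= N,
    # and interpolate linearly in between, as in (39)
    grid = np.linspace(0.0, T, N + 1)
    vals = floor_2pi(f(grid))
    return lambda t: np.interp(t, grid, vals)

g = pi_N(lambda t: 10.0 * t, T=1.0, N=4)
assert abs(g(0.5)) < 1e-12                  # 5.0 is below 2*pi, floors to 0
assert abs(g(1.0) - 2.0 * np.pi) < 1e-12    # 10.0 floors to 2*pi
```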

The first term can be rewritten as

$$\begin{aligned} P\left[ \left\| {\tilde{\alpha }_\lambda }-{ \pi _N \tilde{\alpha }_\lambda }\right\| \ge \delta \lambda \right]&= P\left( \max _k \sup \limits _{t\in \left[ \frac{(k-1)T}{N},\frac{kT}{N}\right] }\left| {\pi _N \tilde{\alpha }_\lambda (t)}\!-\!{ \tilde{\alpha }_\lambda (t)}\right| \ge \delta \lambda \right) \!.\quad \quad \end{aligned}$$
(41)

By Proposition 3 the process \(\lfloor \widetilde{\alpha }_\lambda (t)\rfloor _{2\pi }\) is non-decreasing. Thus for any fixed \(k\) we have

$$\begin{aligned} \sup \limits _{t\in [\frac{(k-1)T}{N},\frac{kT}{N}]}\left| {\pi _N \tilde{\alpha }_\lambda (t)}-{ \tilde{\alpha }_\lambda (t)}\right| \le {\lceil \widetilde{\alpha }_\lambda ({(k+1)T}/{N})\rceil _{2\pi }}- {\lfloor \widetilde{\alpha }_\lambda ({kT}/{N})\rfloor _{2\pi }}. \end{aligned}$$

By Proposition 4 the term on the right is stochastically dominated by \( \widetilde{\alpha }_\lambda (T/N)+4\pi \), therefore

$$\begin{aligned} P( \Vert \tilde{\alpha }_\lambda - \pi _N \tilde{\alpha }_\lambda \Vert \ge \delta \lambda )&\le N P(\widetilde{\alpha }_\lambda (T/N)+4\pi \ge \delta \lambda )\le N P\left( \frac{1}{\lambda }\widetilde{\alpha }_\lambda (T/N)\ge \frac{\delta }{2} \right) \nonumber \\ \end{aligned}$$
(42)

where the last bound holds if \(\lambda >8\pi /\delta \). Using Lemma 10 we get

$$\begin{aligned} N P\left( \frac{1}{\lambda }\widetilde{\alpha }_\lambda (T/N)\ge \frac{\delta }{2} \right)&\le N e^{-(\lambda ^2 \frac{T}{N}- \lambda c_1(T/N+1)) {\mathcal {I}}\left( \frac{\delta N}{2T}\right) +\lambda c_1(T/N+1) } \end{aligned}$$

and this leads to

$$\begin{aligned} \limsup _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P( \Vert \tilde{\alpha }_\lambda - \pi _N \tilde{\alpha }_\lambda \Vert \ge \delta \lambda )\le -\frac{T}{N}{\mathcal {I}}\left( \frac{\delta N}{2T}\right) . \end{aligned}$$
(43)

Note that for fixed \(\delta \) and \(T\) the right hand side converges to \(-\infty \) as \(N\rightarrow \infty \), by (37).

The second term on the right side of (40) can be bounded as

$$\begin{aligned} P\left( \pi _N (\widetilde{\alpha }_\lambda /\lambda )\in K^\delta \right) \le P\left( {\mathcal {J}}_{\mathrm{Sch},T}(\pi _N(\widetilde{\alpha }_\lambda /\lambda ))\ge \inf _{g\in K^\delta } {\mathcal {J}}_{\mathrm{Sch},T}(g) \right) . \end{aligned}$$

We introduce

$$\begin{aligned} \Delta \widetilde{\alpha }_i=\frac{N}{\lambda T}\left( \lfloor \widetilde{\alpha }_\lambda (i T/N)\rfloor _{2\pi }-\lfloor \widetilde{\alpha }_\lambda ((i-1) T/N)\rfloor _{2\pi }\right) , \qquad \hbox {for }\; 1\le i \le N \end{aligned}$$

and \(C_\delta =\inf _{g\in K^\delta } {\mathcal {J}}_{\mathrm{Sch},T}(g) \). Then we have to bound

$$\begin{aligned} P({\mathcal {J}}_{\mathrm{Sch},T}(\pi _N(\widetilde{\alpha }_\lambda /\lambda ))\ge C_\delta )=P\left( \sum _{i=1}^N \frac{T}{N}{\mathcal {I}}\left( {\Delta \widetilde{\alpha }_i}\right) \ge C_\delta \right) . \end{aligned}$$
(44)

We can apply Proposition 4 with \(t_i=\frac{iT}{N}, 1\le i\le N\) to get independent random variables \(\xi _i\) with \(\xi _i\mathop {=}\limits ^{\scriptscriptstyle d}\widetilde{\alpha }_\lambda (T/N)\) and

$$\begin{aligned} \frac{N}{\lambda T}\lfloor \xi _i \rfloor _{2\pi } \le \Delta \widetilde{\alpha }_i\, \le \,\frac{N}{\lambda T}\left( \lfloor \xi _i \rfloor _{2\pi }+2\pi \right) . \end{aligned}$$

Because of the convexity of \({\mathcal {I}}(\cdot )\) we then have

$$\begin{aligned} {\mathcal {I}}\left( {\Delta \widetilde{\alpha }_i}\right)&\le \max \left( {\mathcal {I}}\left( \frac{N\lfloor \xi _i\rfloor _{2\pi }}{ \lambda T}\right) , {\mathcal {I}}\left( \frac{N \lfloor \xi _i \rfloor _{2\pi }}{\lambda T}+\frac{2\pi N}{\lambda T}\right) \right) \\&\le \left( 1+\frac{2\pi N}{\lambda T}\right) {\mathcal {I}}\left( \frac{N \lfloor \xi _i \rfloor _{2\pi }}{ \lambda T}\right) +c \frac{2\pi N}{\lambda T} \end{aligned}$$

where we used Lemma 19 of the “Appendix” for the last bound. Fix \(1/2>\varepsilon >0\). Using the exponential Markov inequality, the independence of \(\xi _i\) and \(\xi _i\mathop {=}\limits ^{\scriptscriptstyle d}\widetilde{\alpha }_\lambda (T/N)\) we get the bound

$$\begin{aligned}&P\left( \sum _{i=1}^N \frac{T}{N}{\mathcal {I}}\left( {\Delta \widetilde{\alpha }_i}\right) \ge C_\delta \right) \nonumber \\&\qquad \le \left( E e^{(1-2\varepsilon ) \lambda ^2 \frac{T}{N}\left( \left( 1+ \frac{2\pi N}{\lambda T}\right) {\mathcal {I}}\left( \frac{N \lfloor \widetilde{\alpha }_\lambda (T/N)\rfloor _{2\pi }}{ \lambda T}\right) +c \frac{2\pi N}{\lambda T}\right) }\right) ^N e^{-(1-2\varepsilon ) \lambda ^2 C_\delta }\nonumber \\&\qquad \le \left( E e^{(1-\varepsilon ) \lambda ^2 \frac{T}{N} {\mathcal {I}}\left( \frac{N \lfloor \widetilde{\alpha }_\lambda (T/N)\rfloor _{2\pi }}{ \lambda T}\right) }\right) ^N e^{(1-2\varepsilon )c 2\pi \lambda N-(1-2\varepsilon )\lambda ^2C_\delta }, \end{aligned}$$
(45)

where the second inequality holds for fixed \(\varepsilon , N, T\) if \(\lambda \) is big enough. Our next step is to estimate the exponential moment \(E e^{(1-\varepsilon ) \lambda ^2 \frac{T}{N}{\mathcal {I}}\left( \frac{N\lfloor \widetilde{\alpha }_\lambda (T/N)\rfloor _{2\pi }}{ \lambda T}\right) }\) for a fixed \(\varepsilon >0\). By Lemma 11 below if \(N,T, \varepsilon \) are fixed then

$$\begin{aligned} \limsup _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log E e^{(1-\varepsilon ) \lambda ^2 \frac{T}{N}{\mathcal {I}}\left( \frac{N\lfloor \widetilde{\alpha }_\lambda (T/N)\rfloor _{2\pi }}{ \lambda T}\right) }\le 0. \end{aligned}$$

Using this with (45) we get

$$\begin{aligned} \limsup _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P\left( \sum _{i=1}^N \frac{T}{N}{\mathcal {I}}\left( {\Delta \widetilde{\alpha }_i}\right) \ge C_\delta \right) \le -(1-2\varepsilon )C_\delta . \end{aligned}$$
(46)

Now we let \(\varepsilon \rightarrow 0\) and then \(N\rightarrow \infty \). The bounds (43) and (46), together with (40), give

$$\begin{aligned} \limsup _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P(\lambda ^{-1}\widetilde{\alpha }_\lambda (\cdot ) \in K)\le -\inf _{g\in K^\delta } {\mathcal {J}}_{\mathrm{Sch},T}(g). \end{aligned}$$
(47)

Using the fact that \({\mathcal {J}}_{\mathrm{Sch},T}\) is a good rate function (which is proved in Proposition 14 of Sect. 6) we get that the right hand side converges to \(-\inf _{g\in K} {\mathcal {J}}_{\mathrm{Sch},T}(g)\) as \(\delta \rightarrow 0\). (See e.g. Lemma 4.1.6 from [9].) This finishes the proof of the upper bound.

Now we will prove the estimate that was used in the proof of the upper bound.

Lemma 11

Fix \(t>0\) and \(1>\varepsilon >0\). Then

$$\begin{aligned} \limsup _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log E e^{(1-\varepsilon ) \lambda ^2 t {\mathcal {I}}\left( \frac{\lfloor \widetilde{\alpha }_\lambda (t)\rfloor _{2\pi }}{ \lambda t}\right) }\le 0. \end{aligned}$$

Proof

Introduce the temporary notation \(G(x)= \lambda ^2 t {\mathcal {I}}\left( x\right) \). This is a convex function with \(G(1)=0\) as its minimum. Then we have

$$\begin{aligned} E e^{(1-\varepsilon ) \lambda ^2 t {\mathcal {I}}\left( \frac{ \lfloor \widetilde{\alpha }_\lambda (t)\rfloor _{2\pi }}{ \lambda t}\right) }&\le \ 2-\int _0^{1} (1-\varepsilon )G'(x) e^{(1-\varepsilon )G(x)} P({ \lfloor \widetilde{\alpha }_\lambda (t)\rfloor _{2\pi }}<\lambda t x)dx\\&+\int _{1}^\infty (1-\varepsilon )G'(x) e^{(1-\varepsilon )G(x)} P({ \lfloor \widetilde{\alpha }_\lambda (t)\rfloor _{2\pi }}>{ \lambda t}x)dx. \end{aligned}$$

Using Lemma 10 we get

$$\begin{aligned} P({ \lfloor \widetilde{\alpha }_\lambda (t)\rfloor _{2\pi }}<{ \lambda t}x) \le \exp \big \{ - (1-c_1\lambda ^{-1}(1+ t^{-1}))G(x)+ \lambda c_1 (t+1) \big \} \end{aligned}$$

for \(x<1\) and a similar bound for \( P({ \lfloor \widetilde{\alpha }_\lambda (t)\rfloor _{2\pi }}>{ \lambda t} x) \le P({ \lceil \widetilde{\alpha }_\lambda (t)\rceil _{2\pi }}>{ \lambda t}x) \) for \(x>1\). This gives us

$$\begin{aligned} E e^{(1-\varepsilon ) \lambda ^2t {\mathcal {I}}\left( \frac{\lfloor \widetilde{\alpha }_\lambda (t)\rfloor _{2\pi }}{ \lambda t}\right) }&\le \ 2-\int _0^{1} (1-\varepsilon )G'(x) e^{((1+t^{-1})(c_1/\lambda )-\varepsilon )G(x)+\lambda c_1(t+1)} dx\\&+\int _{1}^\infty (1-\varepsilon )G'(x) e^{((1+t^{-1})(c_1/\lambda )-\varepsilon )G(x)+\lambda c_1(t+1)} dx\\&\le 2+ 4\varepsilon ^{-1} e^{\lambda \, c_1(t+1)} \end{aligned}$$

where the last inequality holds if \((1+t^{-1})c_1/\lambda <\varepsilon /2\), i.e. for large enough \(\lambda \). From this the lemma follows. \(\square \)
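The integral representation at the start of the proof is a two-sided variant, split at the minimum \(x=1\), of the standard tail-integration identity \(E\, g(Z)=g(0)+\int _0^\infty g'(x) P(Z>x)dx\) for differentiable \(g\) and \(Z\ge 0\). A toy numerical check of that identity (our example, not from the paper):

```python
import numpy as np
from scipy.integrate import quad

# E g(Z) = g(0) + int_0^infty g'(x) P(Z > x) dx; take Z ~ Exp(1)
# and g(x) = x^2, so that E g(Z) = E Z^2 = 2
tail = lambda x: np.exp(-x)                       # P(Z > x) for Exp(1)
val, _ = quad(lambda x: 2.0 * x * tail(x), 0, np.inf)
assert abs(val - 2.0) < 1e-8
```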

Now we turn to the proof of the lower bound in the large deviation result of Theorem 7. As we will see, the problem can be reduced to studying the probability that \(\frac{1}{\lambda } \widetilde{\alpha }_\lambda (t)\) stays close to a straight line.

Proposition 12

Fix \(q>0\) and \(T>0\). Then

$$\begin{aligned} \lim \limits _{\varepsilon \rightarrow 0}\liminf _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P(\widetilde{\alpha }(t) \in [\lambda (q t-\varepsilon ), \lambda (q t+\varepsilon )], t\in [0,T])&\ge -T {\mathcal {I}}(q). \end{aligned}$$

Proof

As \(q>0\) we may assume \(\varepsilon \le q T/2\) by choosing \(\varepsilon \) small enough. Let \(N=\frac{\lceil (qT+\varepsilon )\lambda \rceil _{2\pi }}{2\pi }\), and choose \(\varepsilon _1= \frac{\pi \varepsilon }{2q(qT+\varepsilon )}\), which satisfies \(\varepsilon _1< \frac{\varepsilon \lambda }{2qN}\) for \(\lambda >2\). Recall the definition of \(\tau ^{(n)}_\lambda \) and \(\widetilde{\tau }^{(n)}_\lambda \) from (31). We will prove that

$$\begin{aligned}&P(\widetilde{\alpha }(t) \in [\lambda (q t-\varepsilon ), \lambda (q t+\varepsilon )], t\in [0,T])\\&\qquad \ge P\left( \lambda \widetilde{\tau }^{(k)}_\lambda \in \left( \tfrac{2\pi }{q}-\varepsilon _1, \tfrac{2\pi }{q}+\varepsilon _1\right) , 1\le k \le N\right) . \end{aligned}$$

Roughly speaking, this will follow from the simple fact that if we are within \(\varepsilon /q\) of the line \(y=qt\) in the horizontal direction, then we are within \(\varepsilon \) in the vertical direction. If \( \lambda \widetilde{\tau }^{(k)}_\lambda \ge \frac{2\pi }{q}-\varepsilon _1\) for \(1\le k \le N\) then \(\lambda \tau ^{(k)}_\lambda \ge k( \frac{2\pi }{q}-\varepsilon _1)\) and

$$\begin{aligned} \widetilde{\alpha }_\lambda \left( \tfrac{k}{\lambda }\left( \tfrac{2\pi }{q}-\varepsilon _1\right) \right) \le 2 k \pi = \lambda \frac{2\pi }{2\pi /q-\varepsilon _1}\cdot \frac{k}{\lambda }\left( \tfrac{2\pi }{q}-\varepsilon _1\right) \end{aligned}$$

for \(1\le k \le N\). Together with the fact that \(\lfloor \widetilde{\alpha }_\lambda \rfloor _{2\pi }\) is non-decreasing we get that

$$\begin{aligned} \widetilde{\alpha }_\lambda (t)\le \lambda \frac{2\pi }{2\pi /q-\varepsilon _1}\cdot \left( t+\tfrac{1}{\lambda }(2\pi /q-\varepsilon _1) \right) {,} \quad \hbox { for } t\le \tfrac{N}{\lambda }\left( \tfrac{2\pi }{q}-\varepsilon _1\right) . \end{aligned}$$

This inequality implies \(\widetilde{\alpha }_\lambda (t)\le \lambda (qt+\varepsilon )\) for \( t\le T\), provided \(\lambda \varepsilon >4\pi \).

The other direction is similar: if we have \( \lambda \widetilde{\tau }^{(k)}_\lambda \le \frac{2\pi }{q}+\varepsilon _1\) for \(1\le k \le N\) then

$$\begin{aligned} \widetilde{\alpha }_\lambda \left( \tfrac{k}{\lambda }\left( \tfrac{2\pi }{q}+\varepsilon _1\right) \right) \ge 2 k \pi = \lambda \frac{2\pi }{2\pi /q+\varepsilon _1}\cdot \frac{k}{\lambda }\left( \tfrac{2\pi }{q}+\varepsilon _1\right) \end{aligned}$$

which implies

$$\begin{aligned} \widetilde{\alpha }_\lambda (t)\ge \lambda \frac{2\pi }{2\pi /q+\varepsilon _1}\cdot \left( t-\tfrac{1}{\lambda }(2\pi /q+\varepsilon _1) \right) {,} \quad \hbox { for } t\le \tfrac{N}{\lambda }\left( \tfrac{2\pi }{q}+\varepsilon _1\right) , \end{aligned}$$

and \(\widetilde{\alpha }_\lambda (t)\ge \lambda (qt-\varepsilon )\) for \(t\le T\). Using the independence of \(\widetilde{\tau }^{(k)}_\lambda \) we get the bound

$$\begin{aligned} P(\widetilde{\alpha }(t) \in [\lambda (q t-\varepsilon ), \lambda (q t+\varepsilon )], t\in [0,T])\ge P\left( \lambda \widetilde{\tau }^{(k)}_\lambda \in ( \tfrac{2\pi }{q}-\varepsilon _1, \tfrac{2\pi }{q}+\varepsilon _1)\right) ^{N}.\nonumber \\ \end{aligned}$$
(48)

By the lower bound (19) we have

$$\begin{aligned}&\log P(\widetilde{\alpha }(t) \in [\lambda (q t-\varepsilon ), \lambda (q t+\varepsilon )], t\in [0,T])\\&\qquad \ge \frac{\lambda (qT+2\varepsilon )}{2\pi }\left( - \frac{2\pi \lambda }{q} {\mathcal {I}}(q)- \frac{\lambda |a|}{8}(\varepsilon _1+4(2\pi /q+\varepsilon _1))+ \log A(\varepsilon _1, \lambda , a)\right) . \end{aligned}$$

Recalling \(\varepsilon _1= \frac{\pi \varepsilon }{2q(qT+\varepsilon )}\) we get

$$\begin{aligned} \lim \limits _{\varepsilon \rightarrow 0}\liminf _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P(\widetilde{\alpha }(t) \in [\lambda (q t-\varepsilon ), \lambda (q t+\varepsilon )], t\in [0,T])&\ge -T {\mathcal {I}}(q). \end{aligned}$$

\(\square \)

Proof of the lower bound in Theorem 7

Let \(G\) be an open subset of \(C[0,T]\). We would like to show that

$$\begin{aligned} \liminf _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P\left( \frac{1}{\lambda } \widetilde{\alpha }_\lambda (\cdot ) \in G\right) \ge - \inf _{g\in G} \int _0^T {\mathcal {I}}\left( g'(t)\right) dt. \end{aligned}$$
(49)

For this it is enough to prove that for any \(g\in G\) with \(\int _0^T {\mathcal {I}}\left( g'(t)\right) dt<\infty \) and \(\delta >0\) we have

$$\begin{aligned} \liminf _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P \left( \frac{1}{\lambda } \widetilde{\alpha }_\lambda (\cdot ) \in G \right) \ge -\int _0^T {\mathcal {I}}\left( g'(t)\right) dt-\delta . \end{aligned}$$
(50)

We can approximate \(g\) in the sup-norm with a piecewise linear function \(\tilde{g}\) so that \(|\int _0^T {\mathcal {I}}\left( \tilde{g}'(t)\right) dt- \int _0^T {\mathcal {I}}\left( g'(t)\right) dt|<\delta \). Because of this we may assume that \(g\) is piecewise linear; moreover, we may assume that \(g\) has no horizontal segments. Suppose that \(g\) is linear with slope \(q_i\) on the interval \([T_i,T_{i+1}]\) for \(0\le i \le k-1\), where \(0=T_0<T_1<\dots <T_k=T\). We claim that if \(\lambda >\lambda _0(\varepsilon , k)\) then

$$\begin{aligned}&P \left( \Vert \frac{1}{\lambda } \widetilde{\alpha }_\lambda (\cdot ) -g(\cdot )\Vert \le \varepsilon \right) \nonumber \\&\quad \ge P\left( |\frac{1}{\lambda }( \widetilde{\alpha }_\lambda (t)- \widetilde{\alpha }_\lambda (T_i)) -q_i (t-T_i)|\le \varepsilon /k,\hbox { if } t\in [T_i,T_{i+1}]\right) \nonumber \\&\quad \ge \prod _{i=0}^{k-1} P\left( |\frac{1}{\lambda } \widetilde{\alpha }_\lambda (t) -q_i t|\le \varepsilon /(2k),\hbox { for } t\in [0,T_{i+1}-T_i]\right) \end{aligned}$$
(51)

The first inequality is straightforward; to prove the second we use the coupling from the proof of Proposition 4. Recall the processes \(\hat{\alpha }_i(s)\) defined on \([t_{i-1},t_i]\). These were independent for different values of \(i\), and the process \(\hat{\alpha }_i(s+t_{i-1}), s\in [0,t_i-t_{i-1}]\) had the same distribution as \(\widetilde{\alpha }_\lambda (s), s\in [0,t_i-t_{i-1}]\). We also had

$$\begin{aligned} \hat{\alpha }_i(s)+\lfloor \widetilde{\alpha }_\lambda (t_{i-1})\rfloor _{2\pi } \le \widetilde{\alpha }_\lambda (s)\le \hat{\alpha }_i(s)+\lfloor \widetilde{\alpha }_\lambda (t_{i-1})\rfloor _{2\pi }+2\pi . \end{aligned}$$

for \(s\in [t_{i-1},t_i]\). By choosing \(\lambda >\lambda _0=4\pi k/\varepsilon \), the inequality (51) follows from the independent increment property of the Brownian motion.

By Proposition 12 we have the bound

$$\begin{aligned}&\lim \limits _{\varepsilon \rightarrow 0} \liminf _{\lambda \rightarrow \infty } \tfrac{1}{\lambda ^2} \log P\left( \Vert \frac{1}{\lambda } \widetilde{\alpha }_\lambda (\cdot ) -g(\cdot )\Vert \le \varepsilon \right) \\&\quad \ge -\sum _{i=0}^{k-1} (T_{i+1}-T_i) {\mathcal {I}}(q_i)= -\int _0^T {\mathcal {I}}\left( g'(t)\right) dt \end{aligned}$$

from which (50), and thus the lower bound, follows.

5 The path deviation for the \({\hbox {Sine}}_\beta \) process

This section contains the proof of Theorem 6. The strategy for the proof is to approximate the SDE (3) with a version where the drift is piecewise constant and then use elements of the proof of Theorem 7. Just as in the proof of Theorem 7, we need to show an upper and a lower bound to prove the large deviation principle. The fact that \({\mathcal {J}}_{\mathrm{Sine}_\beta }\) is a good rate function will be proved in Proposition 14 of Sect. 6.

Proof of the upper bound in Theorem 6

For the proof of the upper bound we go through a series of approximations: we essentially cut off the tail of the process, then replace the drift in the SDE with a piecewise constant version, and finally approximate the process with a piecewise linear version. Recall that \(\alpha _\lambda (t)\) solves the SDE (2) and that we introduced the notation \({\mathfrak {f}}(t)=\tfrac{\beta }{4}e^{-\tfrac{\beta }{4}t}\). Fix \(T>0\); its value will be sent to infinity later. The first approximating process is defined as

$$\begin{aligned} \alpha _\lambda ^{(1)} (t)&= \alpha _\lambda (t) \mathbf{1}(t\le T)+ (\alpha _\lambda (T)+ \lambda (e^{-\frac{\beta }{4}T} - e^{-\frac{\beta }{4}t})) \mathbf{1}(t>T), \end{aligned}$$

this solves the SDE (2) with the noise ‘turned off’ at \(t=T\). For the second process we define

$$\begin{aligned} {\mathfrak {f}}_N(t)= {\mathfrak {f}}(Ti/N), \qquad t\in [Ti/N,T(i+1)/N) \end{aligned}$$
(52)

and consider the solution \(\xi _{\lambda {\mathfrak {f}}_N}\) of (15) with drift \(\lambda {\mathfrak {f}}_N\) and initial condition 0. Let

$$\begin{aligned} \alpha _\lambda ^{(2)}(t) = \xi _{\lambda {\mathfrak {f}}_N} (t) \mathbf{1}(t\le T)+ ( \xi _{\lambda {\mathfrak {f}}_N} (T)+ \lambda (e^{-\frac{\beta }{4}T} - e^{-\frac{\beta }{4}t})) \mathbf{1}(t>T). \end{aligned}$$

Finally, let \(\pi _{MN}\) be the projection defined in (39) with intervals of size \(T/(MN)\), that is, \(\pi _{MN} f\) is the piecewise linear path that satisfies

$$\begin{aligned} (\pi _{MN} f)(Ti/(MN)) = \lfloor f(Ti/(MN))\rfloor _{2\pi }, \end{aligned}$$

and is linear between these values. Define

$$\begin{aligned} \alpha _\lambda ^{(3)}(t) = \pi _{MN} \xi _{\lambda {\mathfrak {f}}_N} (t) \mathbf{1}(t\le T)+ (\pi _{MN} \xi _{\lambda {\mathfrak {f}}_N} (T)+ \lambda (e^{-\frac{\beta }{4}T} - e^{-\frac{\beta }{4}t})) \mathbf{1}(t>T). \end{aligned}$$
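A minimal sketch of the step-function drift (52) and the error bound used for the second error term below, with \({\mathfrak {f}}(t)=\tfrac{\beta }{4}e^{-\beta t/4}\) as in the SDE (2) (the concrete values of \(\beta , T, N\) are our choices for illustration):

```python
import numpy as np

beta, T, N = 2.0, 5.0, 50
f = lambda t: (beta / 4.0) * np.exp(-beta * t / 4.0)

def f_N(t):
    # freeze f at the left endpoint of each interval [Ti/N, T(i+1)/N), as in (52)
    i = np.minimum(np.floor(t * N / T), N - 1)
    return f(i * T / N)

t = np.linspace(0.0, T, 100001)
err = np.max(f_N(t) - f(t))
assert err >= 0.0                      # f decreases, so f_N >= f pointwise
assert err <= beta * T / (4.0 * N)     # the error bound used in the text
```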

Then for any closed set \(K\subset C[0,\infty )\) we have that

$$\begin{aligned}&P\left( \frac{\alpha _\lambda }{\lambda } \in K\right) \le P\Big ( \frac{\alpha _\lambda ^{(3)}}{\lambda } \in K^{3\delta } \Big )+ P(\Vert \alpha _\lambda ^{(1)}- \alpha _\lambda \Vert _\infty \ge \delta \lambda )\nonumber \\&\qquad + P(\Vert \alpha _\lambda ^{(2)}- \alpha _\lambda ^{(1)} \Vert _\infty \ge \delta \lambda )+P(\Vert \alpha _\lambda ^{(3)}- \alpha _\lambda ^{(2)} \Vert _\infty \ge \delta \lambda ), \end{aligned}$$
(53)

where \(K^{3\delta }\) is defined similarly to (38), as the \(3\delta \)-fattening of \(K\). We will begin with the main term. Let

$$\begin{aligned} \mathcal {J}_N (g)= \int _0^\infty {\mathfrak {f}}_N^2(t) {\mathcal {I}}\left( \frac{g'(t)}{{\mathfrak {f}}_N(t)}\right) dt, \end{aligned}$$

and define (similarly to the \(\tilde{\alpha }_\lambda \) case in the proof of Theorem 7)

$$\begin{aligned} \Delta \alpha _i&= \frac{MN}{\lambda {\mathfrak {f}}_N(\tfrac{T i}{MN})T} \left( \lfloor \alpha ^{(3)}(Ti/(MN))\rfloor _{2\pi }- \lfloor \alpha ^{(3)}(T(i-1)/(MN))\rfloor _{2\pi } \right) , \end{aligned}$$

for \(1\le i \le MN\). Then,

$$\begin{aligned} P\Big ( \frac{\alpha _\lambda ^{(3)}}{\lambda } \in K^{3\delta } \Big )&\le P\Bigg ( \mathcal {J}_N\bigg ( \frac{\alpha _\lambda ^{(3)}}{\lambda } \bigg ) \ge \inf _{g\in K^{3\delta }} \mathcal {J}_N(g) \Bigg )\\&= P \Bigg ( \sum _{i=1}^{ MN} \frac{T({\mathfrak {f}}_N(Ti/(MN)))^2}{MN} {\mathcal {I}}(\Delta \alpha _i) \ge \inf _{g\in K^{3\delta }} \mathcal {J}_N(g) \Bigg ). \end{aligned}$$

Take \(\widehat{\alpha }_i\) to solve (12) but with the Brownian motion \(B(t+ Ti/(MN))-B(T i/(MN))\) and \(\lambda _i= \lambda {\mathfrak {f}}_N(Ti/(MN))\). Then using the same arguments as in the bound (45) we get

$$\begin{aligned}&P\Big ( \frac{\alpha _\lambda ^{(3)}}{\lambda } \!\in \! K^{3\delta } \Big )\\&\quad \le e^{-(1-\varepsilon ) \lambda ^2 C_{\delta ,N}}\prod _{i=1}^{ MN} \left( E e^{(1-\varepsilon ) \lambda _i^2 \frac{T}{MN}\left( (1\!+ \frac{2\pi MN}{\lambda T}){\mathcal {I}}\left( \frac{ \lfloor \widehat{\alpha }_i(T/(MN))\rfloor _{2\pi }}{ \lambda _i (T/MN) }\right) +c_2 \frac{2\pi M N}{\lambda _i T}\right) }\right) \end{aligned}$$

where \(C_{\delta ,N}=\inf _{g\in K^{3\delta }} \mathcal {J}_N(g)\). Using the bound proved in Lemma 11 we get that

$$\begin{aligned} \limsup _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2}\log P\Big ( \frac{\alpha _\lambda ^{(3)}}{\lambda } \in K^{3\delta } \Big )&\le -(1-\varepsilon )C_{\delta ,N}. \end{aligned}$$
(54)

We now turn to the first error term. Using the fact that \(\lfloor \alpha _\lambda \rfloor _{2\pi }\) is non-decreasing (which follows from (i) of Proposition 5) we get that

$$\begin{aligned} \Vert \alpha _\lambda ^{(1)}-\alpha _\lambda \Vert \le \alpha _\lambda (\infty )-\alpha _\lambda (T)+\lambda e^{-\tfrac{\beta }{4}T}, \end{aligned}$$

where \(\alpha _\lambda (\infty )\) is the limit of \(\alpha _\lambda (t)\) as \(t\rightarrow \infty \). Choose \(T\) large enough so that \(e^{-\frac{\beta }{4}T} \le \delta /2\). Then

$$\begin{aligned} P(\Vert \alpha _\lambda ^{(1)}- \alpha _\lambda \Vert \ge \delta \lambda ) \le P(\alpha _\lambda (\infty )- \alpha _\lambda (T) \ge \delta \lambda /2). \end{aligned}$$

We will deal with this tail probability in Proposition 13 below. In particular, we will show that there is a constant \(c_1>0\) so that

$$\begin{aligned} \limsup _{\lambda \rightarrow \infty }\frac{1}{\lambda ^2} \log P(\Vert \alpha _\lambda ^{(1)}- \alpha _\lambda \Vert \ge \delta \lambda )\le -c_1 T \delta ^2. \end{aligned}$$
(55)

For the second error term we first note that \( \Vert \alpha _\lambda ^{(2)}- \alpha _\lambda ^{(1)}\Vert =\sup _{t\in [0,T]} | \alpha _\lambda ^{(2)}(t)- \alpha _\lambda ^{(1)}(t)|\). Using the coupling of Proposition 5 we can show that on \([0,T]\) the process \(\alpha _\lambda ^{(2)}- \alpha _\lambda ^{(1)}\) will have the same distribution as the solution of the SDE (15) with initial condition 0 and drift \(\lambda ({\mathfrak {f}}_N-{\mathfrak {f}})\ge 0\). Moreover, this process will be non-negative (because the drift is non-negative), and since \(\lambda ({\mathfrak {f}}_N(t)-{\mathfrak {f}}(t))\le \lambda \tfrac{\beta T}{4N}\) for \(t\in [0,T]\), it will be bounded by the solution of the SDE (15) with a constant drift \( \lambda \tfrac{\beta T}{4N}\). Because of this, \( \Vert \alpha _\lambda ^{(2)}- \alpha _\lambda ^{(1)}\Vert \) is stochastically bounded by \(\sup _{t\in [0,T]} \widetilde{\alpha }_{\lambda \tfrac{\beta T}{4N}}(t)\le \widetilde{\alpha }_{\lambda \tfrac{\beta T}{4N}}(T)+2\pi \) with \(\widetilde{\alpha }_\lambda \) from (12), using the fact that \(\lfloor \widetilde{\alpha }_\lambda (t)\rfloor _{2\pi }\) is non-decreasing. Thus for \(\delta \lambda >4\pi \) we have

$$\begin{aligned} P( \Vert \alpha _\lambda ^{(2)}- \alpha _\lambda ^{(1)}\Vert \ge \delta \lambda )&\le P\left( \tilde{\alpha }_{\lambda \tfrac{\beta T}{4N}}(T) \ge \tfrac{1}{2} \delta \lambda \right) . \end{aligned}$$

If \(N\) and \(T\) are fixed and \(\lambda \) is big enough, then we can apply Lemma 10 for the right hand side with \(\tilde{\lambda }=\lambda \tfrac{\beta T}{4N}\), \(t=T\) and \(q=\frac{\tfrac{1}{2} \delta \lambda }{T \lambda \tfrac{\beta T}{4N}}=\frac{2\delta N}{\beta T^2}\). This leads to

$$\begin{aligned} \limsup _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P( \Vert \alpha _\lambda ^{(2)}- \alpha _\lambda ^{(1)}\Vert \ge \delta \lambda ) \le -\frac{\beta ^2 T^3}{4^2N^2}{\mathcal {I}}\left( \frac{2\delta N}{\beta T^2} \right) . \end{aligned}$$
(56)

For the third error term we first note that

$$\begin{aligned} \Vert \alpha _\lambda ^{(3)}-\alpha _\lambda ^{(2)}\Vert \le \sup _{t\in [0,T]} |\alpha _\lambda ^{(3)}(t)-\alpha _\lambda ^{(2)}(t)|\le \max _{i} \sup _{t\in [Ti/N,T(i+1)/N]} |\alpha _\lambda ^{(3)}(t)\!-\!\alpha _\lambda ^{(2)}(t)|, \end{aligned}$$

and thus

$$\begin{aligned} P(\Vert \alpha _\lambda ^{(3)}-\alpha _\lambda ^{(2)}\Vert \ge \delta \lambda ) \le \sum _{i=0}^{N-1} P\left( \sup _{t\in [Ti/N,T(i+1)/N]} |\alpha _\lambda ^{(3)}(t)-\alpha _\lambda ^{(2)}(t)|\ge \delta \lambda \right) . \end{aligned}$$

In the interval \([Ti/N,T(i+1)/N]\) the process \(\alpha _\lambda ^{(2)}\) solves the SDE (12) with constant drift \(\lambda {\mathfrak {f}}_N(Ti/N)\). Here we can use the same steps that we used in the proof of Theorem 7 between (41) and (42) to get

$$\begin{aligned} P(\Vert \alpha _\lambda ^{(3)}-\alpha _\lambda ^{(2)}\Vert \ge \delta \lambda )&\le \sum _{i=0}^{ N-1} M P\left( \tilde{\alpha }_{\lambda {\mathfrak {f}}_N(T i/N)}(T/(MN)) \ge \delta \lambda /2\right) \\&\le MN P\left( \tilde{\alpha }_{\tfrac{\beta }{4}\lambda }(T/(MN)) \ge \delta \lambda /2\right) \end{aligned}$$

for \(\lambda \) big enough compared to \(\delta ^{-1}\). For large enough \(\lambda \) we can apply Lemma 10 for the right hand side with \(\tilde{\lambda }=\tfrac{\beta }{4}\lambda \), \(t=T/(MN)\) and \(q=\tfrac{2\delta M N}{\beta T}\) to get

$$\begin{aligned} \limsup _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2}\log P(\Vert \alpha _\lambda ^{(3)}-\alpha _\lambda ^{(2)}\Vert \ge \delta \lambda ) \le -\frac{\beta ^2}{4^2}\frac{T}{MN} {\mathcal {I}}\left( \frac{2\delta MN}{\beta T}\right) . \end{aligned}$$
(57)

Combining (53) with the bounds (54), (55), (56) and (57) we get

$$\begin{aligned}&\limsup _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P \Big ( \frac{\alpha _\lambda }{\lambda } \in K \Big ) \nonumber \\&\qquad \le \max \left\{ -(1\!-\!\varepsilon )C_{\delta ,N}, -c_1 T \delta ^2 ,-\frac{\beta ^2 T^3}{4^2N^2}{\mathcal {I}}\left( \frac{2\delta N}{\beta T^2} \right) ,-\frac{\beta ^2}{4^2}\frac{T}{MN} {\mathcal {I}}\left( \frac{2\delta MN}{\beta T}\right) \right\} .\nonumber \\ \end{aligned}$$
(58)

Taking \(N\) to \(\infty \) the last two terms go to \(-\infty \) (using the bounds (37)) while the first term converges to \(-(1-\varepsilon )C_\delta ^T\) with

$$\begin{aligned} C_{\delta }^T= \inf _{g\in K^{3\delta }}\int _0^T {\mathfrak {f}}^2(t) {\mathcal {I}}\left( g'(t)/{\mathfrak {f}}(t)\right) dt \end{aligned}$$

Letting now \(T\rightarrow \infty \) and then \(\varepsilon \rightarrow 0\) we get

$$\begin{aligned} \limsup _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P \Big ( \frac{\alpha _\lambda }{\lambda } \in K \Big ) \le -\inf _{g\in K^{3\delta }} {\mathcal {J}}_{\mathrm{Sine}_\beta }(g). \end{aligned}$$

Finally taking \(\delta \rightarrow 0\) and using the fact that \({\mathcal {J}}_{\mathrm{Sine}_\beta }\) is a good rate function gives the result

$$\begin{aligned} \limsup _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P \Big ( \frac{\alpha _\lambda }{\lambda } \in K \Big ) \le -\inf _{g\in K}{\mathcal {J}}_{\mathrm{Sine}_\beta }(g). \end{aligned}$$

This completes the proof of the upper bound. \(\square \)

We now prove the tail bound that was used in the proof of the upper bound.

Proposition 13

Fix \(T, \delta >0\). Then there is a constant \(c>0\) so that

$$\begin{aligned} \limsup _{\lambda \rightarrow \infty }\frac{1}{\lambda ^2} \log P( \alpha _\lambda (\infty )- \alpha _\lambda (T) \ge \delta \lambda ) \le -c T \delta ^2. \end{aligned}$$
(59)

Proof of Proposition 13

Take \(\nu =1/8\), and set \(T_k=\tfrac{k(k+1)}{2}\theta T\) where the value of \(\theta >0\) will be specified later. Then we can break up the probability in question as

$$\begin{aligned} P(\alpha _{\lambda }(\infty )-\alpha _\lambda (T) \ge \delta \lambda )&\le \sum _{k=1}^{\lfloor 2 \sqrt{\lambda } \rfloor } P(\alpha _{\lambda }(T_{k+1})-\alpha _{\lambda }(T_k)\ge \tfrac{\delta }{4} \lambda \nu k^{-(1+\nu )})\nonumber \\&+P(\alpha _{\lambda }(\infty )-\alpha _{\lambda }(T_{ \lfloor 2 \sqrt{\lambda }\rfloor +1})\ge \delta \lambda /2). \end{aligned}$$
(60)

Note, that for any fixed \(s>0\) the process \(\widehat{\alpha }_{s,\lambda }(t)=\alpha _\lambda (s+t)\) satisfies the SDE (3) with \(\hat{\lambda }=\lambda e^{-\tfrac{\beta }{4}s}\) with initial condition \(\alpha _\lambda (s)\). Using the coupling techniques of Propositions 4 and 5 one can show that \(\widehat{\alpha }_{s,\lambda }(t)-\widehat{\alpha }_{s,\lambda }(0)=\alpha _\lambda (s+t)-\alpha _\lambda (s)\) is stochastically dominated by \(\widetilde{\alpha }_{\lambda {\mathfrak {f}}(s)}+2\pi \). This (together with \(T_{k+1}-T_k=\theta (k+1)T\)) gives

$$\begin{aligned} P(\alpha _{\lambda }(T_{k+1})\!-\!\alpha _{\lambda }(T_k)\ge \tfrac{\delta }{4} \lambda \nu k^{-(1+\nu )})&\le P(\widetilde{\alpha }_{\lambda {\mathfrak {f}}(T_k)}(\theta (k+1)T)\ge \tfrac{\delta }{4} \lambda \nu k^{-(1+\nu )}-2\pi )\\&\le P(\widetilde{\alpha }_{\lambda {\mathfrak {f}}(T_k)}(\theta (k+1)T)\ge \tfrac{\delta }{8} \lambda \nu k^{-(1+\nu )}) \end{aligned}$$

where the last bound follows for big enough \(\lambda \) from \(k\le 2\sqrt{\lambda }\). We can use bound (30) of Lemma 10 for the probability on the right with \(\tilde{\lambda }=\lambda {\mathfrak {f}}(T_k)\), \(t=\theta (k+1)T\) and \(q=\frac{\delta \nu k^{-(1+\nu )}}{8\theta T (k+1){\mathfrak {f}}(T_k)}\), since with these choices \(q t \tilde{\lambda }\), \(q\) and \(\tilde{\lambda }q \log q\) are all big, if we choose \(\theta >0\) small enough and then \(\lambda \) big enough. This leads to

$$\begin{aligned}&P(\widetilde{\alpha }_{\lambda {\mathfrak {f}}(T_k)}(\theta (k+1)T)\ge \tfrac{\delta }{8} \lambda \nu k^{-(1+\nu )})\\&\quad \le \exp \left( -c_1 \tfrac{\delta ^2}{8^2} \lambda ^2 \nu ^2 k^{-2(1+\nu )}\theta ^{-1} (k+1)^{-1}T^{-1} \log ^2\left( \frac{\delta \nu k^{-(1+\nu )}}{8\theta (k+1)T {\mathfrak {f}}(T_k)} \right) \right) \\&\quad \le \exp \left( -c_2 \delta ^2 \lambda ^2 k^{-3-2\nu } T^{-1} \left( c_3+\tfrac{\beta }{4} T \tfrac{k(k+1)}{2}\right) ^2\right) \le \exp \left( -c_4 \lambda ^2 \delta ^2 T k^{1-2\nu } \right) , \end{aligned}$$

with a positive constant \(c_4\), which in turn implies (for large enough \(\lambda \))

$$\begin{aligned} \sum _{k=1}^{\lfloor 2 \sqrt{\lambda } \rfloor } P\left( \alpha _{\lambda }(T_{k+1})-\alpha _{\lambda }(T_k)\ge \tfrac{\delta }{4} \lambda \nu k^{-(1+\nu )}\right) \le 2 \exp \left( -c_4 \lambda ^2 \delta ^2 T \right) . \end{aligned}$$
(61)
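As a purely illustrative numerical check of this summation step (outside the proof): writing \(A\) for the common factor \(c_4 \lambda ^2 \delta ^2 T\) and using \(\nu =1/8\), so that \(k^{1-2\nu }=k^{3/4}\), the \(k=1\) term dominates the series and the factor 2 already suffices for moderate \(A\). A minimal sketch, with the cutoff `kmax` chosen arbitrarily:

```python
import math

# Illustration of the summation step behind (61): with nu = 1/8 the
# exponents are A * k^(1 - 2*nu) = A * k^(3/4), where A is a placeholder
# for the common factor c_4 * lambda^2 * delta^2 * T.
def tail_sum(A, kmax=10_000):
    """Partial sum of sum_{k >= 1} exp(-A * k^(3/4))."""
    return sum(math.exp(-A * k ** 0.75) for k in range(1, kmax + 1))

# The k = 1 term dominates, so the whole series stays below 2 * exp(-A).
for A in (2.0, 5.0, 10.0):
    assert tail_sum(A) <= 2.0 * math.exp(-A)
```

This only probes a few values of \(A\); in the proof the comparison holds for all large enough \(\lambda \) because \(A\rightarrow \infty \).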

Lastly we bound the remaining term using Proposition 5:

$$\begin{aligned} P(\alpha _\lambda (\infty )- \alpha _\lambda (T\lambda ) \ge \delta \lambda /2)&= P( \xi _{\lambda {\mathfrak {f}}(\lambda T)}(\infty ) \ge \lfloor \delta \lambda / 2\rfloor ) \le 2 \left( e^{-\frac{\beta }{4} \lambda T}\right) ^{\lfloor \delta \lambda / 2\rfloor }, \end{aligned}$$

which together with (60) and (61) gives us the necessary upper bound for (59). \(\square \)

Proof of the lower bound in Theorem 6

We will show that if \(g\in C[0,\infty )\) with \({\mathcal {J}}_{\mathrm{Sine}_\beta }(g)<\infty \) then

$$\begin{aligned} \lim \limits _{\varepsilon \rightarrow 0} \liminf _{\lambda \rightarrow \infty } \tfrac{1}{\lambda ^2} \log P(\Vert \lambda ^{-1} \alpha _\lambda (\cdot )-g(\cdot ) \Vert \le \varepsilon )\ge -{\mathcal {J}}_{\mathrm{Sine}_\beta }(g). \end{aligned}$$
(62)

From this the lower bound will follow.

In Proposition 14 we will prove that if \({\mathcal {J}}_{\mathrm{Sine}_\beta }(g)<\infty \) then \(g(\infty )=\lim \nolimits _{t\rightarrow \infty } g(t)<\infty \) exists. Let \(\varepsilon >0\) and choose \(T>0\) so that

$$\begin{aligned} g(\infty )-g(T)\le \varepsilon /2, \qquad \hbox {and} \qquad e^{-\tfrac{\beta }{4}T}\le \varepsilon /4. \end{aligned}$$
(63)

From the first assumption in (63) and the Markov property we have

$$\begin{aligned}&P(|\lambda ^{-1} \alpha _\lambda (t)-g(t)|\le \varepsilon , t\ge 0)\nonumber \\&\quad \ge P(|\lambda ^{-1} \alpha _\lambda (t)-g(t)|\le \varepsilon /2, t\in [0,T], |\alpha _\lambda (\infty )-\alpha _\lambda (T)|\le \lambda \varepsilon /4)\nonumber \\&\quad \ge P(|\lambda ^{-1} \alpha _\lambda (t)-g(t)|\le \varepsilon /2, t\in [0,T]) \nonumber \\&\qquad \,\,\times \, \sup _x P(\alpha _\lambda (\infty )- \alpha _\lambda (T)\le \lambda \varepsilon /4 \big \vert \alpha _\lambda (T)=x). \end{aligned}$$
(64)

Using the same line of reasoning as in the proof of Proposition 13 (see after (60)) we get that with \(\lambda _T=\lambda e^{-\tfrac{\beta }{4}T}\) we have

$$\begin{aligned} P(\alpha _\lambda (\infty )- \alpha _\lambda (T)&\le \lambda \varepsilon /4 \vert \alpha _\lambda (T)=x)\\&\ge P(\alpha _{ \lambda _T}(\infty )\le \lambda \varepsilon /4-2\pi )\ge P(\alpha _{ \lambda _T}(\infty )\le \lambda \varepsilon /8), \end{aligned}$$

where the second inequality follows if \(\lambda \) is big enough compared to \(\varepsilon \). Now we can use part (iii) of Proposition 5 with \(f(t)=\lambda _T {\mathfrak {f}}(t)\), \(k=1\) and \(a=\lambda \varepsilon /8\) to get

$$\begin{aligned} P(\alpha _{ \lambda _T}(\infty )\le \lambda \varepsilon /8)=1-P(\alpha _{ \lambda _T}(\infty )> \lambda \varepsilon /8)\ge 1-2\frac{8 \lambda _T}{2\pi \lambda \varepsilon }\ge 1-\frac{2}{\pi }, \end{aligned}$$

where the last step follows from the second assumption of (63).
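In detail, the constants in the last step work out as follows: since \(\lambda _T=\lambda e^{-\tfrac{\beta }{4}T}\) and \(e^{-\tfrac{\beta }{4}T}\le \varepsilon /4\) by (63),

$$\begin{aligned} 2\cdot \frac{8 \lambda _T}{2\pi \lambda \varepsilon }=\frac{8 e^{-\tfrac{\beta }{4}T}}{\pi \varepsilon }\le \frac{8 (\varepsilon /4)}{\pi \varepsilon }=\frac{2}{\pi }. \end{aligned}$$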

Using this with (64) we get that

$$\begin{aligned}&\liminf _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P(|\lambda ^{-1} \alpha _\lambda (t)-g(t)|\le \varepsilon , t\ge 0) \\&\quad \ge \liminf _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P(|\lambda ^{-1} \alpha _\lambda (t)-g(t)|\le \varepsilon /2, t\in [0,T]), \end{aligned}$$

and it is enough to estimate the right hand side. We do this by introducing the process \(\xi _N(t)\) on \([0,T]\) which is a solution of the SDE (15) with initial condition 0 and the piecewise constant drift function \(\lambda {\mathfrak {f}}_N\) where \({\mathfrak {f}}_N\) is defined as in (52). From Proposition 5 we have that \(\alpha _\lambda (t)\le \xi _N(t)\) and \(\widehat{\xi }_N(t)=\xi _N(t)-\alpha _\lambda (t)\) satisfies SDE (16) with initial condition 0 and drift \(\lambda ({\mathfrak {f}}_N(t)-{\mathfrak {f}}(t))\). We have

$$\begin{aligned} P(|\lambda ^{-1} \alpha _\lambda (t)-g(t)|\le \varepsilon /2, t\in [0,T])&\ge P(|\lambda ^{-1} \xi _N(t)-g(t)|\le \varepsilon /4, t\in [0,T])\nonumber \\&-\,P\bigg (\sup _{t\in [0,T]}|\xi _N(t)- \alpha _\lambda (t)|\ge \lambda \varepsilon /4\bigg ).\nonumber \\ \end{aligned}$$
(65)

The second term on the right may be bounded in the same manner as (56). This gives us

$$\begin{aligned} \limsup _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P\bigg (\sup _{t\in [0,T]}|\xi _N(t)- \alpha _\lambda (t)|\ge \lambda \varepsilon /4\bigg ) \le -\frac{\beta ^2 T^3}{4^2N^2}{\mathcal {I}}\left( \frac{\varepsilon N}{\beta T^2} \right) . \end{aligned}$$

Note, that as \(N\rightarrow \infty \) the right hand side converges to \(-\infty \).

The only thing left is to estimate the first term on the right of (65). Introduce the notation \(t_k=\tfrac{Tk}{N}\). We start with the bound

$$\begin{aligned}&P(|\lambda ^{-1} \xi _N(t)-g(t)|\le \varepsilon /4, t\in [0,T])\\&\quad \ge P(|\lambda ^{-1} (\xi _N(s+t_k)-\xi _N(t_k))-(g(s+t_k)-g(t_k))|\le \varepsilon /(4N), s\in [0,T/N],\, 0\le k\le N-1). \end{aligned}$$
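This bound follows by telescoping the increments over the subintervals (using \(\xi _N(0)=g(0)=0\)): for \(t\in [t_j,t_{j+1}]\),

$$\begin{aligned} \lambda ^{-1} \xi _N(t)-g(t)=\sum _{k=0}^{j-1}\left[ \lambda ^{-1}(\xi _N(t_{k+1})-\xi _N(t_k))-(g(t_{k+1})-g(t_k))\right] +\lambda ^{-1}(\xi _N(t)-\xi _N(t_j))-(g(t)-g(t_j)), \end{aligned}$$

a sum of at most \(N\) terms, each of absolute value at most \(\varepsilon /(4N)\) on the event in question.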

For any fixed \(k\) the process \(\xi _N(s+t_k), s\in [0,T/N]\) satisfies the SDE (16) with initial condition \(\xi _N(t_k)\) and a constant drift \(\lambda {\mathfrak {f}}_N(t_k)\). Using the coupling in the proof of Proposition 4 we can construct independent processes \(\widehat{\alpha }_k(t), t\in [0,T/N]\) so that

$$\begin{aligned} \widehat{\alpha }_k(s)-2\pi \le \xi _N(s+t_k)-\xi _N(t_k)\le \widehat{\alpha }_k(s)+2\pi , \quad s\in [0,T/N] \end{aligned}$$

and \(\widehat{\alpha }_k(t), t\in [0,T/N]\) has the same distribution as \(\widetilde{\alpha }_{\lambda {\mathfrak {f}}_N(t_k)}(t), t\in [0,T/N]\). From this it immediately follows that

$$\begin{aligned}&\liminf _{\lambda \rightarrow \infty } \tfrac{1}{\lambda ^2} \log P(|\lambda ^{-1} \xi _N(t)-g(t)|\le \varepsilon /4, t\in [0,T])\\&\ge \sum _{k=0}^{N-1} \liminf _{\lambda \rightarrow \infty } \tfrac{1}{\lambda ^2} \log P\left( \left| \frac{\tilde{\alpha }_{\lambda \mathfrak {f}_{N}(t_k)}(s)}{\lambda }-(g(s+t_k)-g(t_k))\right| <\frac{\varepsilon }{8N}, s\in \left[ 0,\frac{T}{N}\right] \right) . \end{aligned}$$

From our path level large deviation lower bound on \(\widetilde{\alpha }\) we get

$$\begin{aligned}&\liminf _{\lambda \rightarrow \infty } \tfrac{1}{\lambda ^2} \log P(|\lambda ^{-1} \widetilde{\alpha }_{\lambda {\mathfrak {f}}_N(t_k)}(s)-(g(s+t_k)-g(t_k))|< \varepsilon /(8N), s\in [0,T/N])\\&\qquad \ge -\inf _{\begin{array}{c} |\tilde{g}(s)-g(s)|<\varepsilon /(8N)\\ s\in [t_k,t_{k+1}] \end{array}} {\mathfrak {f}}_N(t_k)^2 \int _0^{T/N} {\mathcal {I}}({\mathfrak {f}}_N(t_k)^{-1} \tilde{g}'(t_k+s)) ds. \end{aligned}$$

This yields the estimate

$$\begin{aligned}&\liminf _{\lambda \rightarrow \infty } \tfrac{1}{\lambda ^2} \log P(|\lambda ^{-1} \xi _N(t)-g(t)|\le \varepsilon /4, t\in [0,T]) \\&\ge -\inf _{\begin{array}{c} |\tilde{g}(s)-g(s)|< \varepsilon /(8N),\\ s\in [0,T] \end{array}} \int _0^T {\mathfrak {f}}_N(s)^2 {\mathcal {I}}({\mathfrak {f}}_N(s)^{-1} \tilde{g}'(s)) ds. \end{aligned}$$

Letting \(N\rightarrow \infty \) the lower bound converges to \(-\int _0^T {\mathfrak {f}}(s)^2{\mathcal {I}}({\mathfrak {f}}^{-1}(s) g'(s))ds\) which (together with our previous estimates) shows that

$$\begin{aligned} \liminf _{\lambda \rightarrow \infty } \frac{1}{\lambda ^2} \log P(|\lambda ^{-1} \alpha _\lambda (t)-g(t)|\le \varepsilon , t\ge 0)\ge -\int _0^T {\mathfrak {f}}(s)^2{\mathcal {I}}({\mathfrak {f}}^{-1}(s) g'(s))ds. \end{aligned}$$

Letting \(\varepsilon \rightarrow 0\) we also have \(T=T_\varepsilon \rightarrow \infty \) which yields the bound (62) and concludes the proof of the lower bound in the large deviation principle. \(\square \)

6 \(\mathcal {J}_{\hbox {Sch},T}\) and \({\mathcal {J}_{\mathrm{Sine}_\beta }}\) are good rate functions

In this section we will show that \({\mathcal {J}}_{\mathrm{Sch},T}\) and \({\mathcal {J}}_{\mathrm{Sine}_\beta }\) are good rate functions. Our main tools are the bound (37) and the estimate

$$\begin{aligned} {\mathcal {I}}(x)\ge c_1 (x-1)^2, \quad \hbox {if } x>0 \end{aligned}$$
(66)

both of which will be proved in Proposition 17 of the “Appendix”.

Proposition 14

The functions \({\mathcal {J}}_{\mathrm{Sine}_\beta }(\cdot )\) and \({\mathcal {J}}_{\mathrm{Sch},T}(\cdot )\) are both good rate functions on the spaces \(C[0,\infty )\) and \(C[0,T]\) respectively. Moreover, if \(g\in C[0,\infty )\) and \({\mathcal {J}}_{\mathrm{Sine}_\beta }(g)<\infty \) then \(\lim \nolimits _{t\rightarrow \infty } g(t)\) is finite.

Proof

Fix \(T>0\) and \(r\ge 0\). In order to prove that \(K_r=\{g:{\mathcal {J}}_{\mathrm{Sch},T}(g)\le r\}\) is compact we first show the equicontinuity of this set. Suppose that \(g\in K_r\). Then \(g(0)=0\) and \(g'(x)\ge 0\) exists a.e. in \([0,T]\). We have for \(0\le x \le y\le T\)

$$\begin{aligned}&|g(x)-g(y)-(x-y) |=\left| \int _x^y (g'(s)-1) ds \right| \le (y-x)^{1/2} \sqrt{\int _x^y (g'(s)-1)^2 ds}\\&\qquad \le c (y-x)^{1/2} \sqrt{\int _x^y {\mathcal {I}}(g'(s)) ds}\le c (y-x)^{1/2} r^{1/2} \end{aligned}$$

where we used (66) in the second step. This shows that \(K_r\) is equicontinuous. Using Tonelli’s semicontinuity theorem (e.g. Theorem 3.5, [5]) the compactness of \(K_r\) now follows.
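Besides equicontinuity, the compactness argument also uses the uniform boundedness of \(K_r\); this follows from the displayed estimate with \(x=0\) (recall \(g(0)=0\)):

$$\begin{aligned} |g(y)-y|\le c\, y^{1/2} r^{1/2}\le c\,(Tr)^{1/2}, \qquad y\in [0,T], \end{aligned}$$

so that \(\sup _{g\in K_r} \Vert g\Vert _\infty \le T+c(Tr)^{1/2}\).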

The proof for \({\mathcal {J}}_{\mathrm{Sine}_\beta }(\cdot )\) is a bit more involved. Fix \(\beta >0\). It is convenient to transform the interval \([0,\infty )\) into \([0,1)\) using the function \(y=1-e^{-\beta t/4}\). Then for a \(g\in C[0,\infty )\) with \({\mathcal {J}}_{\mathrm{Sine}_\beta }(g)<\infty \) we have

$$\begin{aligned} {\mathcal {J}}_{\mathrm{Sine}_\beta }(g)=\int _0^\infty {\mathfrak {f}}^2(t) {\mathcal {I}}(g'(t) {\mathfrak {f}}^{-1}(t)) dt=\frac{\beta }{4} \int _0^1(1-y){\mathcal {I}}(\tilde{g}'(y)) dy \end{aligned}$$

where \(\tilde{g}(y)=g(-\tfrac{4}{\beta } \log (1-y))\), \(\tilde{g}\in C[0,1)\). Consider the functional \(\tilde{{\mathcal {J}}}_{\mathrm{Sine}}(\cdot )\) on \(C[0,1)\) defined as

$$\begin{aligned} \tilde{{\mathcal {J}}}_{\mathrm{Sine}}(g)= \tfrac{\beta }{4}\int _0^1 (1-t) {\mathcal {I}}(g'(t))dt \end{aligned}$$
(67)

if \(g'(t)\) exists and is non-negative for a.e. \(0\le t<1\), and as \(\infty \) otherwise. Clearly, if we show that \(\tilde{{\mathcal {J}}}_{\mathrm{Sine}}(\cdot )\) is a good rate function on \(C[0,1)\) then the same will hold for \({\mathcal {J}}_{\mathrm{Sine}_\beta }\). We first show that if \(g\in C[0,1)\) and \(\tilde{{\mathcal {J}}}_{\mathrm{Sine}}(g)<\infty \) then \(\lim \nolimits _{y\rightarrow 1^-}g(y)\) is finite, i.e. we can consider \(\tilde{{\mathcal {J}}}_{\mathrm{Sine}}(\cdot )\) on \(C[0,1]\). We have

$$\begin{aligned} \lim \limits _{y\rightarrow 1^{-}} g(y)=\int _0^1 g'(y) dy\le 2+ \int _0^1 g'(y)\mathbf{1}(g'(y)\ge 2) dy. \end{aligned}$$

We will prove that

$$\begin{aligned}&\hbox {if }h(y)\ge 0,\,\,\hbox { and } \,\,\int _0^1 (1-y) h(y)^2 \log ^2(h(y)+e)dy<\infty ,\,\,\nonumber \\&\quad \hbox { then } \,\,\int _0^1 h(y) dy<\infty . \end{aligned}$$
(68)

Using this with \(h(y)=g'(y)\mathbf{1}(g'(y)\ge 2)\) together with the bound in (37) we get the boundedness of \(\int _0^1 g'(y) dy\) and the existence of \(\lim \nolimits _{y\rightarrow 1^{-}} g(y)\).

Let \(\Phi (x)=x^2 \log ^2(|x|+e)\), this is a strictly convex, even function with \(\lim \nolimits _{x\rightarrow 0} \frac{\Phi (x)}{x}=0\), \(\lim \nolimits _{x\rightarrow \infty } \frac{\Phi (x)}{x}=\infty \) (i.e. \(\Phi \) is a ‘nice Young function’). Introduce the complementary function

$$\begin{aligned} \Psi (x)=\Phi ^*(x)=\sup _{y\ge 0} \{ y |x|-\Phi (y)\}=\int _0^{|x|} (\Phi ')^{(-1)}(y) dy \end{aligned}$$

where \((\Phi ')^{(-1)}\) is the inverse of the strictly increasing function \(\Phi '\) on \([0,\infty )\). Assume that

$$\begin{aligned} A=\int _0^1 (1-y) \Phi (h(y)) dy<\infty \end{aligned}$$
(69)

and let \(\mu \) be the measure on \([0,1]\) with \(d\mu =\frac{1}{A}(1-x) dx\). Consider the Orlicz spaces

$$\begin{aligned} L^\Phi _\mu&= \left\{ f: \hbox {there is an } a>0\hbox { with } \int _{[0,1]} \Phi (a f) d\mu <\infty \right\} ,\\ L^\Psi _\mu&= \left\{ f: \hbox {there is an } a>0\hbox { with } \int _{[0,1]} \Psi (a f) d\mu <\infty \right\} \end{aligned}$$

with the Luxemburg-norms defined as

$$\begin{aligned} \Vert f\Vert _{\Phi }&= \inf \left\{ b>0: \int _{[0,1]} \Phi (b^{-1} f) d\mu \le 1\right\} ,\nonumber \\ \Vert f\Vert _{\Psi }&= \inf \left\{ b>0: \int _{[0,1]} \Psi (b^{-1} f) d\mu \le 1\right\} . \end{aligned}$$
(70)
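For orientation, the classical example of a complementary Young pair (not the pair used below) is the power pair

$$\begin{aligned} \Phi _0(x)=\frac{|x|^p}{p}, \qquad \Psi _0(x)=\Phi _0^*(x)=\frac{|x|^q}{q}, \qquad \frac{1}{p}+\frac{1}{q}=1, \quad p>1, \end{aligned}$$

for which Young's inequality \(|xy|\le \Phi _0(x)+\Psi _0(y)\) and the generalized Hölder inequality below reduce to the usual \(L^p\)–\(L^q\) Hölder inequality.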

(See e.g. [21] for more on Orlicz spaces.) Note, that by our assumption (69) we have \(\Vert h\Vert _\Phi \le 1\). By the generalized Hölder inequality for Orlicz spaces (c.f. Theorem 3 in Chapter III of [21]), for any \( f\in L^\Psi _\mu \) one has

$$\begin{aligned} \Vert f h\Vert _1 \le 2 \Vert f\Vert _\Psi \Vert h\Vert _\Phi \le 2 \Vert f\Vert _\Psi \end{aligned}$$
(71)

where \(\Vert \cdot \Vert _1\) is the \(L^1\) norm on \([0,1]\) with reference measure \(\mu \). Choose \(f(x)=\frac{1}{1-x}\). If we show that \(\Vert f\Vert _\Psi <\infty \) then this would imply

$$\begin{aligned} \infty >2\Vert f\Vert _\Psi \ge \Vert f h\Vert _1= \tfrac{1}{A} \int _0^1 \frac{1}{1-x} h(x) (1-x)dx=\tfrac{1}{A} \int _0^1 h(x) dx, \end{aligned}$$

and the statement (68) would follow. It is not hard to check that there is a \(c>0\) so that

$$\begin{aligned} \Psi (x)\le c \frac{x^2}{\log ^2(x+e)}, \quad \hbox {for} \; x\ge 0. \end{aligned}$$
(72)

Since the integral \(\int _0^1 (1-x) \frac{(1-x)^{-2}}{\log ^2((1-x)^{-1}+e)}dx\) is finite, this implies that \(\Vert \tfrac{1}{1-x}\Vert _{\Psi }\) is finite and thus \(\int _0^1 h(y) dy<\infty \). This completes the proof that if \(\tilde{{\mathcal {J}}}_{\mathrm{Sine}}(g)<\infty \) then \(\lim \nolimits _{y\rightarrow 1^-}g(y)\) is finite, and also shows the last statement of the proposition.
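For completeness, the finiteness of this integral can be seen by substituting \(u=1-x\) and then \(s=\log (1/u)\):

$$\begin{aligned} \int _0^1 (1-x) \frac{(1-x)^{-2}}{\log ^2((1-x)^{-1}+e)}\,dx=\int _0^1 \frac{du}{u \log ^2(u^{-1}+e)}=\int _0^\infty \frac{ds}{\log ^2(e^{s}+e)}<\infty , \end{aligned}$$

since the last integrand is bounded near \(s=0\) and decays like \(s^{-2}\) as \(s\rightarrow \infty \).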

Next we will prove the equicontinuity of the set \(K_r=\{f: \tilde{{\mathcal {J}}}_{\mathrm{Sine}}(f)\le r\}\): we will show that if \(g\in K_r\) then for \(\varepsilon <\varepsilon _0\) we have

$$\begin{aligned} |g(a+\varepsilon )-g(a)|\le C (\log \varepsilon ^{-1})^{-1/3} \quad \hbox {for any} \; a\in [0,1-\varepsilon ]. \end{aligned}$$
(73)

Here \(\varepsilon _0, C\) only depend on \(r\).

We first assume \(a\le 1-\sqrt{\varepsilon }\). Then

$$\begin{aligned}&|g(a+\varepsilon )-g(a)-\varepsilon |=\left| \int _a^{a+\varepsilon } ( g'(y)-1) dy\right| \\&\quad \le \left( \int _a^{a+\varepsilon } \frac{1}{1-y}dy\right) ^{1/2} \left( \int _a^{a+\varepsilon } (1-y) (g'(y)-1)^2 dy\right) ^{1/2} \\&\quad \le C r^{1/2} \left( \log \left( 1+\frac{\varepsilon }{1-a-\varepsilon }\right) \right) ^{1/2}\le C r^{1/2} \varepsilon ^{1/4} \end{aligned}$$

where we used \(1-a\ge \sqrt{\varepsilon }\), the bound (66) and the fact that \(\varepsilon \) can be chosen to be small enough.

Next we assume that \(a>1-\sqrt{\varepsilon }\). Because of the monotonicity of \(g\) it is enough to bound \(|g(1)-g(1-\sqrt{\varepsilon })|\). Setting \(f(x)=\tfrac{1}{1-x}\) and \(h(x)=g'(x)\mathbf{1}(g'(x)\ge 2)\) we have

$$\begin{aligned} g(1)-g(1-\sqrt{\varepsilon })\le 2\sqrt{\varepsilon }+\int _{1-\sqrt{\varepsilon }}^1 h(x) dx. \end{aligned}$$
(74)

Since \(\tilde{{\mathcal {J}}}_{\mathrm{Sine}}(g)\le r\), we can assume that (69) holds with some finite \(A>0\). We will now follow the previous argument using Orlicz spaces. We use the same definitions for \(\Psi , \Phi \), \(\mu \) but for the norms \(\Vert \cdot \Vert _\Psi , \Vert \cdot \Vert _\Phi \) defined in (70) we use the interval \([1-\sqrt{\varepsilon }, 1]\) instead of \([0,1]\).

Using inequality (74) and (71) we get the bound

$$\begin{aligned} g(1)-g(1-\sqrt{\varepsilon })\le 2\sqrt{\varepsilon }+A\int _{1-\sqrt{\varepsilon }}^1 f(x) h(x) d\mu (x)\le 2\sqrt{\varepsilon }+ 2A\Vert f\Vert _\Psi . \end{aligned}$$

To estimate \( \Vert f\Vert _\Psi \) we will prove that with \(b=(\log \varepsilon ^{-1})^{-1/3}\) there is a constant \(\varepsilon _0\) depending on \(A\) so that

$$\begin{aligned} \int _{1-\sqrt{\varepsilon }}^1 \Psi (b^{-1} f(x)) d\mu (x)=A^{-1}\int _{1-\sqrt{\varepsilon }}^1(1\!-\!x) \Psi (b^{-1} (1\!-\!x)^{-1}) dx<1, \quad \hbox {for} \; \varepsilon \!<\!\varepsilon _0. \end{aligned}$$

This will imply that for such \(\varepsilon \) we have \(\Vert f\Vert _\Psi \le b\). Using (72) we get

$$\begin{aligned} A^{-1} \int _{1-\sqrt{\varepsilon }}^1(1-x) \Psi (b^{-1} (1-x)^{-1}) dx&\le c b^{-2}A^{-1} \int _0^{\sqrt{\varepsilon }} x^{-1} \frac{1}{\log ^2(2x^{-1} b^{-1})}dx\\&\le \frac{c (\log \varepsilon ^{-1})^{2/3}}{A \log \left( 2 (\log \varepsilon ^{-1})^{1/3} \varepsilon ^{-1/2}\right) }. \end{aligned}$$

Since the right hand side converges to 0 as \(\varepsilon \rightarrow 0\) we get that \(\Vert f\Vert _\Psi \le b\) for small enough \(\varepsilon \) which in turn leads to the upper bound (73). This completes the proof of the equicontinuity of the set \(K_r\) and the compactness follows again by Tonelli’s theorem. \(\square \)
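For the reader's convenience, the elementary integral used in the last estimate is explicit: since \(\frac{d}{dx}\, \log ^{-1}(2x^{-1}b^{-1})=x^{-1}\log ^{-2}(2x^{-1}b^{-1})\),

$$\begin{aligned} \int _0^{\sqrt{\varepsilon }} x^{-1} \frac{1}{\log ^2(2x^{-1} b^{-1})}\,dx=\frac{1}{\log (2 b^{-1}\varepsilon ^{-1/2})}, \end{aligned}$$

which after substituting \(b=(\log \varepsilon ^{-1})^{-1/3}\) gives the denominator in the final bound.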

7 From the path to the endpoint

In this section we will complete the proofs of Theorems 1 and 2.

Proof of Theorem 2

Consider the continuous map \(F:C[0,T]\rightarrow {\mathbb R}\) given by \(F(g)=g(T)/(2\pi )\). By the contraction principle (see e.g. [9]) the random variables \(\tfrac{1}{\lambda }\tfrac{\alpha _\lambda (T)}{2\pi }\) satisfy a large deviation principle with scale function \(\lambda ^2\) and good rate function \(J\) defined as

$$\begin{aligned} J(\rho ) = \min \left\{ \int _0^T {\mathcal {I}}(g'(t))dt: \, g'(t)\ge 0, \,g(T)=2\pi \rho \right\} . \end{aligned}$$
(75)

We will now solve this variational problem. If \(g\) provides the minimum then we can assume that \(g'\) is monotone decreasing. To see this define \(\tilde{g}\) with \(\tilde{g}(0)=0\) and \(\tilde{g}'(t) = \sup \{x : m(g'(s)\ge x)\ge t\}\) where \(m\) denotes Lebesgue measure. Then \(\tilde{g}(T)=g(T)\), \({\mathcal {J}}_{\mathrm{Sch},T}(\tilde{g})= {\mathcal {J}}_{\mathrm{Sch},T}(g)\), and \(\tilde{g}'(t)\) is decreasing. If \(g'>0\) on \([0,a]\) and \(g'=0\) on \((a,T]\) then by the classical variational method we get that \({\mathcal {I}}'(g'(x))\) is constant on \([0,a]\). This means that \(g'(x)=\frac{2\pi \rho }{a}\) on \([0,a]\) and \(g'(x)=0\) on \((a,T]\) and our variational problem is reduced to finding the minimum of

$$\begin{aligned} f(a)=a {\mathcal {I}}\left( \tfrac{2\pi \rho }{a}\right) +(T-a) {\mathcal {I}}(0), \quad \hbox {on } \quad 0\le a\le T. \end{aligned}$$

But we have

$$\begin{aligned} f'(a)={\mathcal {I}}\left( \tfrac{2\pi \rho }{a}\right) -{\mathcal {I}}(0)-{\mathcal {I}}'\left( \tfrac{2\pi \rho }{a}\right) \tfrac{2\pi \rho }{a}<0, \end{aligned}$$

since \({\mathcal {I}}\) is strictly convex, which means that the minimum is at \(a=T\). Thus

$$\begin{aligned} J(\rho )=\min \left\{ \int _0^T {\mathcal {I}}(g'(t))dt, \, g'(t)\ge 0, \,g(T)=2\pi \rho \right\} =T {\mathcal {I}}(2\pi \rho /T) \end{aligned}$$
(76)

is the large deviation rate function for \(\frac{1}{\lambda }\frac{\alpha _\lambda (T)}{2\pi }\).
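The sign of \(f'(a)\) in the argument above is a consequence of strict convexity alone: the tangent line to \({\mathcal {I}}\) at \(u=\tfrac{2\pi \rho }{a}>0\) lies strictly below the graph at \(0\), i.e.

$$\begin{aligned} {\mathcal {I}}(0)> {\mathcal {I}}(u)+{\mathcal {I}}'(u)(0-u), \qquad \hbox {equivalently} \qquad {\mathcal {I}}(u)-{\mathcal {I}}(0)-u\, {\mathcal {I}}'(u)<0. \end{aligned}$$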

Now recall that the counting function of \(\hbox {Sch}_\tau \) is given by

$$\begin{aligned} \widetilde{N}_\tau (\lambda )=\# \{ \nu : 0\le \nu \le \lambda , \phi _{\nu /\tau }(\tau )\in 2\pi {\mathbb Z}\} \end{aligned}$$

where \(\phi _\lambda \) is the solution of (10). Note, that \(\phi _\lambda (t)-\phi _0(t)\) has the same distribution as \(\xi _{f,0}(t)\) with constant \(f=\lambda \), which in turn has the same distribution as \(\widetilde{\alpha }_\lambda (t)\). Using the coupling methods of Proposition 5 we can show that \(\phi _\lambda (t)\) is increasing in \(\lambda \) for any fixed \(t\) (see [19] for a detailed proof of this fact). From this it follows that

$$\begin{aligned} \left| \tilde{N}_\tau (\lambda )-\tfrac{1}{2\pi }\left( \phi _{\lambda /\tau }(\tau )-\phi _0(\tau )\right) \right| \le 1. \end{aligned}$$

This means that in order to get a large deviation principle for \(\tfrac{1}{\lambda }\tilde{N}_\tau (\lambda )\) it is enough to prove one for \(\tfrac{1}{\lambda } \tfrac{\phi _{\lambda /\tau }(\tau )-\phi _0(\tau )}{2\pi }\). But this has the same distribution as \(\frac{1}{\lambda }\frac{\widetilde{\alpha }_{\lambda /\tau }(\tau )}{2\pi }\), and a simple rescaling of (76) completes the proof of the theorem.

Proof of Theorem 1

Theorem 6 shows that \(\tfrac{1}{\lambda } \alpha _{\lambda }(\cdot )\) satisfies a path level large deviation principle. By applying the time change \(y=1- e^{-\tfrac{\beta }{4}t}\), we get that \(y\rightarrow \tfrac{1}{\lambda } \alpha _{\lambda }(-\tfrac{4}{\beta }\log (1-y))\) satisfies a path level LDP on \(C[0,1)\) with the modified rate function \(\tilde{{\mathcal {J}}}_{\mathrm{Sine}}\) given in (67). In Proposition 14 we showed that if \(\tilde{{\mathcal {J}}}_{\mathrm{Sine}}(g)<\infty \) then the limit as \(y\rightarrow 1^{-}\) exists and so the LDP actually holds on \(C[0,1]\). Using the contraction principle with the functional \(F(g)= \tfrac{1}{2\pi } g(1)\), we get that \(\tfrac{1}{\lambda }\tfrac{\alpha _\lambda (\infty )}{2\pi }\) satisfies a large deviation principle with speed function \(\lambda ^2\) and a good rate function

$$\begin{aligned} J^\beta (\rho )&= \min \left\{ \tilde{{\mathcal {J}}}_{\mathrm{Sine}}(g): \, g(1)=2\pi \rho \right\} \\&= \min \left\{ \tfrac{\beta }{4}\int _0^1 (1-t) {\mathcal {I}}(g'(t))dt: \, g(0)=0, g'(t)\ge 0, g(1)=2\pi \rho \right\} . \end{aligned}$$

The counting function \(N_\beta (\lambda )\) of \({\hbox {Sine}}_\beta \) is given by \(\tfrac{\alpha _\lambda (\infty )}{2\pi }\), so Theorem 1 will follow if we can show that the solution of this variational problem is given by \(\beta I_{\mathrm{Sine}}(\rho )\) as defined in the theorem.

The function \(\tilde{{\mathcal {J}}}_{\mathrm{Sine}}\) is a good rate function, so for any \(\rho \ge 0\) the minimum is achieved at some \(g_\rho \in C[0,1]\). Clearly, when \(\rho =\tfrac{1}{2\pi }\) then the minimum is zero, as the function \(g(t)=t\) shows. (From this point on we suppress the dependence on \(\rho \) and write \(g=g_\rho \).)

We may assume that for the minimizer the derivative \(g'\) does not take values in both \((1,\infty )\) and \([0,1)\), because otherwise we could construct a function \(\hat{g}\) with the same boundary condition \(\hat{g}(1)=2\pi \rho \), but with \(\tilde{{\mathcal {J}}}_{\mathrm{Sine}}(g) >\tilde{{\mathcal {J}}}_{\mathrm{Sine}}(\hat{g})\). The construction is as follows. Assume \(\rho <1/(2\pi )\) and that \(A= \{t: g'(t) >1\}\) has positive measure. Since \(\int _0^1 (g'(t)-1) dt=2\pi \rho -1<0\) and \(\int _A (g'(t)-1) dt>0\), by the intermediate value theorem we can find \(B\subset [0,1]\setminus A\) so that \(\int _{A\cup B} (g'(t)-1)dt=0\). Define \(\hat{g}\) with \(\hat{g}(0)=0\), \(\hat{g}'(t)= g'(t)\) if \(t\notin A\cup B\) and \(\hat{g}'(t) =1\) otherwise. Then \(g(1)=\int _0^1 g'(t) dt=\int _0^1 \hat{g}'(t) dt=\hat{g}(1)\), but clearly \(\tilde{{\mathcal {J}}}_{\mathrm{Sine}}(g) > \tilde{{\mathcal {J}}}_{\mathrm{Sine}}(\hat{g})\). A similar construction works for \(\rho > 1/(2\pi )\). Thus we may assume that \(g'(t)\le 1\) for all \(t\) if \(\rho < 1/(2\pi )\), and \(g'(t)\ge 1\) for all \(t\) if \(\rho >1/(2\pi )\).
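Quantitatively, since \({\mathcal {I}}\ge 0\) with \({\mathcal {I}}(1)=0\) (the function \(g(t)=t\) has zero rate) and \(\hat{g}'=1\) on \(A\cup B\),

$$\begin{aligned} \tilde{{\mathcal {J}}}_{\mathrm{Sine}}(g)-\tilde{{\mathcal {J}}}_{\mathrm{Sine}}(\hat{g})=\tfrac{\beta }{4}\int _{A\cup B} (1-t)\, {\mathcal {I}}(g'(t))\,dt>0, \end{aligned}$$

the inequality being strict because \({\mathcal {I}}(g'(t))>0\) on the positive-measure set \(A\).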

First assume that \(\rho >\tfrac{1}{2\pi }\). Then \(g'(t)\ge 1\) for all \(t\) and we can use the classical variational method (see e.g. [5]) to conclude that \((1-t){\mathcal {I}}'(g'(t))\) is constant in \(t\). Thus the optimizer is given by a function \(g_\rho \) which satisfies

$$\begin{aligned} g_\rho (0)=0, \qquad {\mathcal {I}}'(g'(t))=\tfrac{c_\rho }{1-t}, \qquad \int _0^1 ({\mathcal {I}}')^{(-1)}\left( \tfrac{c_\rho }{1-t}\right) dt=2\pi \rho , \end{aligned}$$
(77)

for some constant \(c_\rho \) and the solution of the variational problem is

$$\begin{aligned} J^\beta (\rho )=\tfrac{\beta }{4}\int _0^1 (1-t) {\mathcal {I}}\left( ({\mathcal {I}}')^{(-1)}\left( \tfrac{c_\rho }{1-t}\right) \right) dt. \end{aligned}$$
(78)

In Proposition 15 below we will show that this is equal to \(\beta I_{\mathrm{Sine}}(\rho )\) as defined in Theorem 1.

Now assume that \(\rho <\tfrac{1}{2\pi }\); here we can assume that the minimizer satisfies \(g'(t)\le 1\). As in the case of \(\hbox {Sch}_\tau \) we may assume \(g'\) is decreasing; this can be shown using the same construction as the one in the paragraph directly following equation (75). Suppose that \(g'\) is zero for \(t\in [a,1]\) and \(g'(t)>0\) in \([0,a]\). Then on \([0,a]\) the classical variational method shows that \((1-t){\mathcal {I}}'(g'(t))\) must be constant. Thus the optimizer must be of the following form:

$$\begin{aligned} g'(t)=\left\{ \begin{array}{l@{\quad }l} ({\mathcal {I}}')^{(-1)}\left( \tfrac{c_{\rho ,a}}{1-t}\right) , &{} 0\le t \le a\\ 0, &{} a<t\le 1, \end{array}\right. \end{aligned}$$
(79)

for some constant \(c_{\rho ,a}\) which satisfies

$$\begin{aligned} 2\pi \rho =\int _0^a ({\mathcal {I}}')^{(-1)}\left( \tfrac{c_{\rho ,a}}{1-t}\right) dt. \end{aligned}$$
(80)

By Propositions 16 and 17 of the “Appendix” the function \({\mathcal {I}}'(x)\) is strictly increasing on \((0,\infty )\) with a limit of \(-\tfrac{1}{2\pi }\) at \(x=0\). Thus \(c_{\rho ,a}\) in (79) cannot be smaller than \(-\frac{1-a}{2\pi }\). Our next claim is that the optimizer has a continuous derivative at \(t=a\), which will identify \(c_{\rho ,a}\) as \(-\tfrac{1-a}{2\pi }\). Assume the opposite, i.e. that \(c_{\rho ,a}>-\tfrac{1-a}{2\pi }\) and hence \(g'(a-)>0\). Let \(\eta _{\delta }=\mathbf{1}_{(a,a+\delta )}-\mathbf{1}_{(a-\delta ,a)}\). If \(\delta , \varepsilon \) are small enough then \(g'+\varepsilon \eta _{\delta }\ge 0\) in \([0,1]\) and \(g_\varepsilon (t)=\int _0^t (g'(s)+\varepsilon \eta _{\delta }(s))ds\) satisfies the same boundary conditions as \(g\). Since \(g\) is a minimizer, the derivative of \(h(\varepsilon )= \tilde{{\mathcal {J}}}_{\mathrm{Sine}}(g_\varepsilon )\) at \(\varepsilon =0\) cannot be negative. We can compute the derivative as

$$\begin{aligned} h'(0)&= \frac{\beta }{4} \int _0^1 (1-t){\mathcal {I}}'(g'(t)) \eta _\delta (t)dt\\&= \frac{\beta }{4}\left( -\int _{a-\delta }^a (1-t)\tfrac{c_{\rho ,a}}{1-t}dt+\int _a^{a+\delta } (1-t) \left( -\tfrac{1}{2\pi }\right) dt\right) . \end{aligned}$$

This is equal to \(\tfrac{\beta }{4}\left( \delta (-c_{\rho ,a}-\tfrac{1-a}{2\pi })+\tfrac{\delta ^2}{4\pi }\right) \), which is negative if \(\delta \) is small enough (by our assumption that \(c_{\rho ,a}>-\tfrac{1-a}{2\pi }\)). The contradiction shows that we must have \(c_{\rho ,a}=-\tfrac{1-a}{2\pi }\).
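The two integrals above evaluate explicitly:

$$\begin{aligned} \int _{a-\delta }^a (1-t)\tfrac{c_{\rho ,a}}{1-t}\,dt=\delta c_{\rho ,a}, \qquad \int _a^{a+\delta }(1-t)\,dt=(1-a)\delta -\tfrac{\delta ^2}{2}, \end{aligned}$$

so their combination gives \(\delta \left( -c_{\rho ,a}-\tfrac{1-a}{2\pi }\right) +\tfrac{\delta ^2}{4\pi }\), with the first-order term in \(\delta \) dominating.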

Thus the optimizer is given by

$$\begin{aligned} g'(t)=\left\{ \begin{array}{l@{\quad }l} ({\mathcal {I}}')^{(-1)}\left( \tfrac{a-1}{2\pi (1-t)}\right) , &{} 0\le t \le a\\ 0, &{} a<t\le 1, \end{array}\right. \end{aligned}$$
(81)

for some \(0\le a\le 1\) with

$$\begin{aligned} 2\pi \rho =\int _0^a ({\mathcal {I}}')^{(-1)}\left( \tfrac{a-1}{2\pi (1-t)}\right) dt, \end{aligned}$$
(82)

and the solution of the variational problem in the \(2\pi \rho <1\) case is given by

$$\begin{aligned} J^\beta (\rho )=\frac{\beta }{4} \int _0^a (1-t) {\mathcal {I}}\left( ({\mathcal {I}}')^{(-1)}\left( \tfrac{a-1}{2\pi (1-t)}\right) \right) dt+\tfrac{\beta }{64}(1-a)^2. \end{aligned}$$
(83)

In Proposition 15 below we will show that this is equal to \(\beta I_{\mathrm{Sine}}(\rho )\).
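The constants in (77) and (80) are determined only implicitly, by a constraint integral that is monotone in the constant, so numerically they can be found by bisection. The sketch below is not from the paper: it uses a hypothetical toy weight \({\mathcal {I}}(x)=x^3/3\), so that \(({\mathcal {I}}')^{(-1)}(y)=\sqrt{y}\) and the constraint integral equals \(2\sqrt{c}\), which makes the answer \(c=(\pi \rho )^2\) checkable in closed form. The actual \({\mathcal {I}}\) of the paper would replace the toy inverse.

```python
import math

def inv_I_prime(y):
    # Hypothetical toy stand-in for (I')^{-1}; here I(x) = x^3/3,
    # so I'(x) = x^2 and (I')^{-1}(y) = sqrt(y).  Not the paper's I.
    return math.sqrt(y)

def constraint(c, n=2000):
    # integral_0^1 (I')^{-1}(c/(1-t)) dt.  Substituting s^2 = 1-t
    # (dt = 2s ds) tames the endpoint at t = 1; midpoint rule in s.
    h = 1.0 / n
    return sum(2.0 * s * inv_I_prime(c / s**2) * h
               for s in (h * (k + 0.5) for k in range(n)))

def solve_c(rho, lo=0.0, hi=1e6):
    # Bisection for the constant c in (77): constraint(c) = 2*pi*rho.
    # constraint is increasing in c because (I')^{-1} is increasing.
    target = 2.0 * math.pi * rho
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if constraint(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# For the toy weight constraint(c) = 2*sqrt(c), so c_rho = (pi*rho)^2.
print(solve_c(1.0), math.pi**2)
```

With the true \({\mathcal {I}}'\) the same bisection applies once \(({\mathcal {I}}')^{(-1)}\) is available, e.g. through the elliptic-integral parametrization \(\pi /(2K(x))\) used below.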

Proposition 15

The rate function for the \(\mathrm{Sine}_\beta \) process is given by

$$\begin{aligned} \beta I_{\mathrm{Sine}}(\rho )=\frac{\beta }{8} \left[ \frac{\nu }{8}+ \rho {\mathcal {H}}(\nu )\right] \end{aligned}$$

where \(\nu =\gamma ^{-1}(\rho )\), and \(\gamma \) is the strictly decreasing function given in (7).

Proof

We have to show that \(J^\beta (\rho )\), defined by (77) and (78) for \(\rho >1/(2\pi )\) and by (82) and (83) for \(\rho <1/(2\pi )\), is equal to \(\beta I_{\mathrm{Sine}}(\rho )\) given above.

We begin with the case where \(\rho >\tfrac{1}{2\pi }\). In this case the minimizer \(g=g_\rho \) is given by (77). One easily checks that

$$\begin{aligned} \frac{d}{dt}\left( \tfrac{\beta }{8}\left( -(1-t)^2 {\mathcal {I}}(g'(t))+c_\rho (1-t)g'(t)+c_\rho g(t)\right) \right) =\tfrac{\beta }{4}(1-t){\mathcal {I}}(g'(t)).\quad \end{aligned}$$
(84)
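Indeed, expanding the derivative on the left of (84) by the product rule gives

$$\begin{aligned} \tfrac{\beta }{8}\Big ( 2(1-t){\mathcal {I}}(g'(t))-(1-t)^2{\mathcal {I}}'(g'(t))g''(t) -c_\rho g'(t)+c_\rho (1-t)g''(t)+c_\rho g'(t)\Big ); \end{aligned}$$

the \(g'\) terms cancel, and the \(g''\) terms cancel as well because \((1-t){\mathcal {I}}'(g'(t))=c_\rho \) by (77), leaving exactly \(\tfrac{\beta }{4}(1-t){\mathcal {I}}(g'(t))\).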

From this we get

$$\begin{aligned} J^\beta (\rho )=\frac{\beta }{4} \int _0^1 (1-t) {\mathcal {I}}\left( g'(t)\right) dt= \tfrac{\beta }{8} \left[ {\mathcal {I}}(g'(0))- c_\rho g'(0)+2 \pi \rho c_\rho \right] \end{aligned}$$

where we used \(g(0)=0\), \(g(1)=2\pi \rho \), and the limits

$$\begin{aligned} \lim \limits _{t\rightarrow 1^-} (1-t)^2 {\mathcal {I}}(g'(t))=\lim \limits _{x\rightarrow \infty } \frac{c_\rho ^2\, {\mathcal {I}}(x)}{{\mathcal {I}}'(x)^2}=0, \quad \lim \limits _{t\rightarrow 1^{-}} (1-t) g'(t)=\lim \limits _{x\rightarrow \infty } \frac{c_\rho \, x}{{\mathcal {I}}'(x)}=0 \end{aligned}$$

which follow from the asymptotics (90) and (91) to be proven in Proposition 17.

Now for the case where \(\rho <\tfrac{1}{2\pi }\) we have that \(g_\rho \) is given by (81). Using the notation \(c=c_\rho =\frac{a-1}{2\pi }\), the identity (84) gives

$$\begin{aligned} J^\beta (\rho )&= \frac{\beta }{4} \int _0^{a} (1-t) {\mathcal {I}}\left( g'(t)\right) dt+ \tfrac{\beta }{8}(1-a)^2{\mathcal {I}}(0)\\&= \tfrac{\beta }{8} \left[ {\mathcal {I}}(g'(0))- c_\rho g'(0)+2 \pi \rho c_\rho \right] , \end{aligned}$$

where we used \(g(0)=0\), \(g(a)=2\pi \rho \), and \(g'(a)=0\). Note that \(c_\rho >0\) if \(\rho >1/(2\pi )\) and \(-\tfrac{1}{2\pi }\le c_\rho <0\) if \(\rho <1/(2\pi )\). Introducing \(\nu =K^{(-1)}\left( \frac{\pi }{2 ({\mathcal {I}}')^{(-1)}(c_\rho )}\right) \), we get for both \(\rho <1/(2\pi )\) and \(\rho >1/(2\pi )\) that

$$\begin{aligned} J^\beta (\rho )=\frac{\beta }{8}\left( \frac{\nu }{8}+\rho {\mathcal {H}}(\nu )\right) \end{aligned}$$

which agrees with (6); it remains to show that \(\nu =\gamma ^{(-1)}(\rho )\). Note that \(\nu =\nu (\rho )<0\) if \(2\pi \rho >1\) and \(0<\nu <1\) if \(2\pi \rho <1\).

Recall from (77) and (82) that

$$\begin{aligned} \rho&= \tfrac{1}{2\pi }\int _{0}^1 ({\mathcal {I}}')^{(-1)}\left( \tfrac{c_\rho }{1-t}\right) dt, \quad \text { if } \rho >\frac{1}{2\pi }, \qquad \text { and }\\ \rho&= \tfrac{1}{2\pi } \int _{0}^{2\pi c+1} ({\mathcal {I}}')^{(-1)}\left( \tfrac{c_\rho }{1-t}\right) dt, \quad \text { if } \rho < \frac{1}{2\pi }. \end{aligned}$$

Applying the change of variables with the new variable \(x\) satisfying \(\frac{\pi }{2 K(x)}=({\mathcal {I}}')^{(-1)}(\tfrac{c_\rho }{1-t})\) to both integrals, we get that \(\rho \) depends on \(\nu =K^{(-1)}\left( \frac{\pi }{2 ({\mathcal {I}}')^{(-1)}(c_\rho )}\right) \) exactly via (7), which finishes the proof. Note that the finiteness of the integrals in (7) follows from the asymptotics of \(K(x)\) and \(E(x)\) near 1 and \(-\infty \) (see the proof of Proposition 17). \(\square \)