1 Introduction

Let \(\mathbf{T}\) be an infinite tree with vertex set \(\mathbf{V}\) in which every vertex has \(b+1\) neighbors, except for one distinguished vertex, called the root, which has \(b\) neighbors, where \(b\ge 2\). We denote the root by \(\mathbf{0}\). For any two adjacent vertices \(u,v\in \mathbf{V}\), let \(e=[u,v]\) be the edge with endpoints \(u\) and \(v\), and denote by \(\mathbf{E}\) the edge set. Consider a Markov chain \(\mathbf{X}=\{X_i, w (e, i)\}\), which starts at \(X_0=\mathbf{0}\) with \(w (e, 0)=1\) for all \(e\in \mathbf{E}\), where \(w (e,0)\) is called the initial weight. For \(i \ge 1\) and \(e\in \mathbf{E}\), \(X_i\in \mathbf{V}\) is the position of the walk and \(w (e,i)\ge 1\) is the \(i\)-th weight. The walk moves from \(X_i\) to a nearest neighbor \(X_{i+1}\), selected at random with probabilities proportional to the weights \(w (e,i)\) of the edges \(e\) incident to \(X_i\).

After the walk moves from \(X_i\) to \(X_{i+1}\), the weights are updated by the following rule:

$$\begin{aligned} w(e, i+1)= \left\{ \begin{array}{l@{\quad }l} 1+ k(c-1) &\hbox {if}\ [X_i, X_{i+1}]=e\ \hbox {and}\ e\ \hbox {has been traversed}\ k\ \hbox {times},\\ w(e, i) & \hbox {otherwise} \end{array} \right. \end{aligned}$$

for a fixed \(c >1\). With this weight change, the model is called a linearly reinforced random walk. Note that if \(c=1\), then the weights never change and the walk is a simple random walk.
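To make the dynamics concrete, here is a minimal simulation sketch in Python (ours, not from the paper; the function name, default parameters, and the encoding of vertices as tuples of child indices are our own choices):

```python
import random

def reinforced_walk_heights(b=10, c=2.0, n_steps=10_000, seed=0):
    """Simulate the linearly reinforced random walk on the rooted tree T.

    Vertices are tuples of child indices; the root is the empty tuple.
    An edge is identified with its endpoint farther from the root, and its
    weight is 1 + k*(c - 1) after k traversals.  Returns the height
    sequence h(X_0), h(X_1), ..., h(X_{n_steps}).
    """
    rng = random.Random(seed)
    visits = {}                 # edge -> number of traversals k
    x = ()                      # X_0 = root
    heights = [0]
    for _ in range(n_steps):
        nbrs = [x + (j,) for j in range(b)]   # the b children of x
        if x:                                 # non-root: add the parent
            nbrs.append(x[:-1])
        # identify the edge to each neighbor by its deeper endpoint
        edges = nbrs[:b] + ([x] if x else [])
        wts = [1 + visits.get(e, 0) * (c - 1) for e in edges]
        k = rng.choices(range(len(nbrs)), weights=wts)[0]
        visits[edges[k]] = visits.get(edges[k], 0) + 1
        x = nbrs[k]
        heights.append(len(x))
    return heights
```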

The linearly reinforced random walk model was first studied by Coppersmith and Diaconis in 1986 (see [4]) for finite graphs and for the \(\mathbf{Z}^d\) lattice. They asked whether the walks are recurrent or transient. For \(d=1\), the walks are recurrent for all \(c\ge 1\) (see [3] and [10]). For \(d\ge 1\), Sabot and Tarres [9] showed that the walks are also recurrent for all sufficiently large \(c\). The other cases on the \(\mathbf{Z}^d\) lattice remain open. Pemantle [8] studied this model on trees and showed that there exists \(c_0=c_0(b)\ge 4.29\) such that the walks are transient when \(1< c < c_0\) and recurrent when \(c >c_0\). Furthermore, Collevecchio [2] and Aidekon [1] investigated the behavior of \(h(X_n)\) in the transient phase, where \(h(x)\) denotes the number of edges on the path from the root to \(x\) for \(x\in \mathbf{T}\). They focused on \(c=2\) and showed that the law of large numbers holds for \(h(X_n)\) with a positive speed for any \(b\ge 2\). More precisely, if \(c=2\), then there exists \(0< T=T(b) < b/(b+2)\) such that

$$\begin{aligned} \lim _{n\rightarrow \infty } {h(X_n)\over n} = T \quad \hbox {a.s.} \end{aligned}$$
(1.1)

By the dominated convergence theorem,

$$\begin{aligned} \lim _{n\rightarrow \infty } \mathbf{E}{h(X_n)\over n} = T. \end{aligned}$$
(1.2)

By a simple computation, the probability that the walk spends its first \(n\) steps moving back and forth along an edge incident to the root is larger than \(n^{-C}\) for some \(C=C(b)>0\). Therefore,

$$\begin{aligned} n^{-C}\le \mathbf{P}(h(X_n)\le 1), \end{aligned}$$
(1.3)

so the lower tail of \(h(X_n)\) has the following behavior:

$$\begin{aligned} n^{-C}\le \mathbf{P}(h(X_n)\le n(T-\epsilon )) \end{aligned}$$
(1.4)

for all \(\epsilon < T\) and for all large \(n\). In this paper, \(C\) and \(C_i\) denote positive constants that depend on \(c,\,b,\,\epsilon ,\,N,\,M\), and \(\delta \), but not on \(n\), \(m\), or \(k\); their values may change from appearance to appearance. From (1.4), unlike for a simple random walk on a tree, we have

$$\begin{aligned} \lim _{n\rightarrow \infty } {-1\over n^{\eta }} \log \mathbf{P}(h(X_n)\le n(T-\epsilon ))=0 \end{aligned}$$
(1.5)

for all \(\epsilon < T\) and for all \(\eta >0\).

We may ask how the upper tail behaves. Unlike the lower tail, we show that the upper tail has a standard large deviation behavior for large \(b\).

Theorem 1

For the linearly reinforced random walk model with \(c=2\) and \(b\ge 70\), and for \(\epsilon >0\), there exists a positive number \( \alpha =\alpha (b, \epsilon )\) such that

$$\begin{aligned} \lim _{n\rightarrow \infty } {-1\over n} \log \mathbf{P}(h(X_n)\ge (T+\epsilon )n) =\alpha . \end{aligned}$$

Remark 1

The proof of Theorem 1 depends on a few of Collevecchio's estimates (see Lemma 2.1 below). Since his estimates require \(b \ge 70\), Theorem 1 also needs this restriction. We conjecture that Theorem 1 holds for all \(b \ge 2\). Durrett et al. [5] also investigated a similar reinforced random walk \(\{{ Y}_k, w(e, i)\}\), except that the weight changes by

$$\begin{aligned} w(e, i+1)= \left\{ \begin{array}{l@{\quad }l} c &\hbox {if}\ [Y_i, Y_{i+1}]=e,\\ w(e, i) & \hbox {otherwise} \end{array} \right. \end{aligned}$$
(1.6)

for fixed \(c>1\). This random walk model is called a once-reinforced random walk. For the once-reinforced random walk model, Durrett et al. [5] showed that for any \(c>1\), the walks are always transient. In addition, they also proved the law of large numbers for \(h(Y_n)\). More precisely, they showed that there exists \(0< S=S(c)< b/(b+c)\) such that

$$\begin{aligned} \lim _{n\rightarrow \infty } {h(Y_n)\over n} = S \quad \hbox {a.s.} \end{aligned}$$
(1.7)
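In the sketch above, the once-reinforced rule (1.6) amounts to replacing the linear weight by one that jumps to \(c\) after the first traversal and then never changes; a hypothetical drop-in replacement for the line computing `wts` in the earlier sketch:

```python
def once_reinforced_weights(edges, visits, c):
    """Edge weights under the once-reinforced rule (1.6): a fresh edge has
    weight 1, and weight c forever after its first traversal."""
    return [c if visits.get(e, 0) > 0 else 1 for e in edges]
```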

We also investigate the large deviations for \(h(Y_n)\) and obtain the following theorem, analogous to Theorem 1 for the linearly reinforced random walk model.

Theorem 2

For the once-reinforced random walk model with \(c>1\) and for \(\epsilon >0 \), there exists a finite positive number \(\beta =\beta (c, b, \epsilon )\) such that

$$\begin{aligned} \lim _{n\rightarrow \infty } {-1\over n} \log \mathbf{P}(h(Y_n)\ge (S+\epsilon )n)=\beta . \end{aligned}$$

Remark 2

It seems difficult to compute the precise rate functions \(\alpha \) and \(\beta \), but we may obtain some of their properties, such as continuity in \(\epsilon \).

We may also ask about the lower tail deviations of \(h(Y_n)\). Unlike in the linearly reinforced random walk model, the lower tail decays exponentially.

Theorem 3

For the once-reinforced random walk model with \(c>1\) and \(0< \epsilon < S \),

$$\begin{aligned} 0< \liminf _{n\rightarrow \infty } {-1\over n} \log \mathbf{P}(h(Y_n)\le (S-\epsilon )n)\le \limsup _{n\rightarrow \infty } {-1\over n} \log \mathbf{P}(h(Y_n)\le (S-\epsilon )n)< \infty . \end{aligned}$$

Remark 3

Durrett et al. [5] also showed that (1.7) holds for a finitely many times reinforced random walk. We can adopt the same proofs of Theorems 2 and 3 to show that the same results hold for a finitely many times reinforced random walk. In fact, our proofs of Theorems 2 and 3 depend on Lemmas 7 and 8 of Durrett, Kesten, and Limic [5], and the proofs of their lemmas can be extended to the finitely many times reinforced random walk model.

Remark 4

We believe that the limit exists in Theorem 3, but we are unable to show it.

2 Preliminaries

In this section, we focus on the linearly reinforced random walk model with \(c=2\). We define a hitting time sequence \(\{t_k\}\) as follows:

$$\begin{aligned} {t}_k=\min \{j\ge 0: h(X_j)=k\}. \end{aligned}$$

Note that the walks are transient, so \(h(X_j)\rightarrow \infty \) as \(j\rightarrow \infty \). Thus, each \(t_k\) is finite and

$$\begin{aligned} 0=t_0< t_1 < t_2<\cdots < t_k<\cdots <\infty . \end{aligned}$$
(2.1)

With this definition, for each \(k\ge 1\),

$$\begin{aligned} h(X_{{t}_k})-h(X_{{t}_{k-1}})=1. \end{aligned}$$
(2.2)

We also define a leaving time sequence \(\{\rho _i\}\) as follows:

$$\begin{aligned} \rho _i=\max \{ j\ge 0: h(X_j)= i\}. \end{aligned}$$

Since the walk \(\mathbf{X}\) is transient,

$$\begin{aligned} \rho _0 < \rho _1< \cdots < \rho _k <\cdots < \infty . \end{aligned}$$
(2.3)

However, unlike in the simple random walk model, the increments \(\{{t}_j-{t}_{j-1}\}\) are not independent. So we need to extract some independence from these times. To this end, we call \(t_i\) a cut time if

$$\begin{aligned} \rho _i-t_i=0. \end{aligned}$$
(2.4)

Since the walk \(\mathbf{X}\) is transient, we may select these cut times and list all of them in increasing order as

$$\begin{aligned} \tau _1< \cdots < \tau _k < \cdots < \infty . \end{aligned}$$
(2.5)

With these cut times, we consider the differences

$$\begin{aligned} H_k=h(X_{\tau _{k+1}})- h(X_{\tau _{k}}) \quad \text{ for } k=1,2,\ldots . \end{aligned}$$
(2.6)

By this definition, it can be shown that for \(k=1,2,\ldots , \)

$$\begin{aligned} \left( \tau _{k+1}-\tau _{k}, H_k\right) \hbox { is an i.i.d. sequence.} \end{aligned}$$
(2.7)

In fact (see page 97 in [2]), to verify (2.7), it is enough to observe that \(X_{\tau _k}\), \(k\ge 1\), are regeneration points for the process \(\mathbf{X}\). These points split the process \(\mathbf{X}\) into i.i.d. pieces, namely \(\{ X_m, \tau _{k} \le m< \tau _{k+1}\},\,k\ge 1\).
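On a finite sample path, the cut times can be read off from the height sequence alone: \(t_i\) is a cut time exactly when level \(i\) is visited only once, i.e., when the first and last visit times of level \(i\) coincide. A small sketch (ours, continuing the simulation above; levels near the running maximum of a finite trajectory might still be revisited later, so we conservatively discard them):

```python
def cut_times(heights):
    """Return the cut times tau_1 < tau_2 < ... (see (2.4)-(2.5)) that can
    be certified from a finite height sequence h(X_0), ..., h(X_n)."""
    first, last = {}, {}
    for t, h in enumerate(heights):
        first.setdefault(h, t)   # t_h: first visit time of level h
        last[h] = t              # rho_h: last visit time of level h
    top = max(heights)
    # a level is a cut level iff it is visited exactly once; skip the top
    # levels, which a longer trajectory might still revisit
    return [first[h] for h in sorted(first)
            if h < top and first[h] == last[h]]
```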

Level \(k\ge 1\) is the set of vertices \(v\) such that \(h(v)=k\). Level \(k\) is a cut level if the walk visits it only once, and we call the unique visited vertex of a cut level a cut vertex. It follows from the cut time definition that \(X_{\tau _k}\) is a cut vertex for \(k\ge 1\). We remark that \(\tau _1\) may or may not be equal to zero. If \(\tau _1=0\), the root is a cut vertex. For convenience, we set \(\tau _0=0\) whether or not the root is a cut vertex. In addition, let

$$\begin{aligned} H_0=h(X_{\tau _1})-h(X_{\tau _0})=h(X_{\tau _1}). \end{aligned}$$
(2.8)

With these definitions, Collevecchio [2] proved the following lemma.

Lemma 2.1

For \(c=2\) and \(b\ge 70\),

$$\begin{aligned} \mathbf{P} ( H_k \ge m)\le 0.115^m \quad \text{ for } k\ge 0 \text{ and } m\ge 1. \end{aligned}$$
(2.9)

Furthermore, for \(p_0=1002/1001\),

$$\begin{aligned} \mathbf{E}\tau _1^{p_0} <\infty . \end{aligned}$$
(2.10)

With Lemma 2.1, we see that \(h(X_{\tau _{k+1}})-h(X_{\tau _{k}})\) is large only with a small probability, and likewise for \(\tau _{k+1}-\tau _{k}\). However, to show a large deviation result, we need a much shorter tail. Therefore, we truncate both \(H_k=h(X_{\tau _{k+1}})-h(X_{\tau _{k}})\) and \(\tau _{k+1}-\tau _{k}\). We call \(\tau _k\) \(N\)-short for \(k\ge 1\) if

$$\begin{aligned} H_k=h(X_{\tau _{k+1}}) -h(X_{\tau _{k}}) \le N; \end{aligned}$$
(2.11)

otherwise, we call it \(N\)-long. We list all \(N\)-short cut times in increasing order as

$$\begin{aligned} \tau _1(N)< \tau _2(N)< \cdots <\infty . \end{aligned}$$
(2.12)

Since we only focus on the transient phase, \(\tau _k(N) < \infty \) for each \(k\).

For convenience, we also set \(\tau _0(N)=0\) whether or not the root is a cut vertex. We know that \(\tau _k(N)=\tau _i\) for some \(i\); we then write \(\tau _k'(N)=\tau _{i+1}\). In particular, let \(\tau '_0(N)=0\). For \(N>0\), let

$$\begin{aligned} I_n=\max \{i: \tau _i(N)\le n\} \end{aligned}$$

and

$$\begin{aligned} h_n(N)=\sum _{i=0}^{I_n} \left( h\big (X_{\tau _{i}'(N)}\big )-h\big (X_{\tau _i(N)}\big )\right) . \end{aligned}$$

If \(I_n=0\),

$$\begin{aligned} h_n(N)=0. \end{aligned}$$
(2.13)
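In the notation of the sketches above, \(h_n(N)\) simply discards the height gained during \(N\)-long blocks; a hypothetical transcription (ours, ignoring the incomplete final block of a finite sample):

```python
def h_n_of_N(heights, n, N):
    """h_n(N) of (2.11)-(2.13): total height gained over N-short cut-time
    blocks [tau_i, tau_{i+1}] whose starting cut time is at most n."""
    taus = cut_times(heights)    # from the sketch above
    total = 0
    for t1, t2 in zip(taus, taus[1:]):
        gain = heights[t2] - heights[t1]
        if gain <= N and t1 <= n:   # the block is N-short and starts by n
            total += gain
    return total
```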

Now we state standard tail estimates for an i.i.d. sequence. The proof follows directly from Markov's inequality.

Lemma 2.2

Let \(Z_1, \ldots, Z_k,\ldots \) be an i.i.d. sequence with \(\mathbf{E}Z_1=0\) and \(\mathbf{E}\exp (\theta Z_1) < \infty \) for some \(\theta >0\), and let

$$\begin{aligned} S_m=Z_1+Z_2+\cdots + Z_m. \end{aligned}$$

For any \(\epsilon >0\), \(i\le n\), and \(j\ge n\), there exist constants \(C_1=C_1(\epsilon )\) and \(C_2=C_2(\epsilon )\) such that

$$\begin{aligned} \mathbf{P}( S_i\ge n\epsilon )\le C_1 \exp (-C_2 n), \end{aligned}$$

and

$$\begin{aligned} \mathbf{P}( S_j \le -\epsilon n)\le C_1 \exp (-C_2 n). \end{aligned}$$
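For completeness, the first bound is the usual Chernoff computation: write \(\psi (\theta )=\log \mathbf{E}\exp (\theta Z_1)\); since \(\psi (0)=0\) and \(\psi '(0^+)=\mathbf{E}Z_1=0\), we may fix \(\theta >0\) small enough that \(\psi (\theta )\le \theta \epsilon /2\), and then Markov's inequality applied to \(\exp (\theta S_i)\) gives, for \(i\le n\),

$$\begin{aligned} \mathbf{P}( S_i\ge n\epsilon )\le e^{-\theta n\epsilon }\left( \mathbf{E}e^{\theta Z_1}\right) ^i =\exp (-\theta n\epsilon + i\psi (\theta ))\le \exp (-\theta n\epsilon /2). \end{aligned}$$

The second bound follows from the same computation applied to \(-Z_1\) (for the variables to which the lemma is applied below, \(\mathbf{E}\exp (-\theta Z_1)\) is also finite).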

Now we show that \(h_n(N)/n\) and \(h(X_n)/n\) are not very different if \(N\) is large.

Lemma 2.3

For \(\epsilon >0,\,c=2\), and \(b\ge 70\), there exist \(N=N(\epsilon )\) and \(C_i=C_i(\epsilon ,N)\) for \(i=1,2\) such that

$$\begin{aligned} \mathbf{P}( h(X_n)\ge h_n(N) +n\epsilon )\le C_1\exp (-C_2 n). \end{aligned}$$

Proof

If

$$\begin{aligned} h(X_n)-h_n(N)\ge \epsilon n, \end{aligned}$$
(2.14)

then either there are \(k\ge 1\) many \(N\)-long cut time pairs \(\{\tau _{i_j}, \tau _{i_j+1}\}\), \(j=1,\ldots , k\), such that

$$\begin{aligned} \tau _{i_1}< \tau _{i_1+1}< \tau _{i_2}<\tau _{i_2+1}<\cdots< \tau _{{i_j}}<\tau _{i_j+1}<\cdots< \tau _{i_{k-1}}<\tau _{i_{k-1}+1}<\tau _{i_{k}} \le n\le \tau _{{i_k}+1} \end{aligned}$$

with \(i_1\ge 1\) and with

$$\begin{aligned} \sum _{j=1}^{k} H_{i_j}=\sum _{j=1}^{k} h(X_{\tau _{i_{j}+1}})-h(X_{\tau _{i_j}}) \ge \epsilon n/2, \end{aligned}$$
(2.15)

where

$$\begin{aligned} H_{i_j}=h(X_{\tau _{i_{j}+1}})-h(X_{\tau _{i_j}}) > N \quad \text{ for } j=1,2,\ldots , k\le n/N, \end{aligned}$$
(2.16)

or

$$\begin{aligned} h(X_{\tau _1})\ge \epsilon n/2. \end{aligned}$$
(2.17)

For the second case in (2.17), by Lemma 2.1, there exist \(C_i=C_i(\epsilon )\) for \(i=1,2\) such that

$$\begin{aligned} \mathbf{P} (h(X_{\tau _1})\ge \epsilon n/2)=\mathbf{P}(H_0\ge \epsilon n/2)\le C_1\exp (-C_2 n). \end{aligned}$$
(2.18)

We focus on the first case in (2.15). By (2.7) and Lemma 2.1, \(\{H_1, H_2,\ldots \}\) is an i.i.d. sequence with

$$\begin{aligned} \mathbf{P}(H_i\ge m )\le 0.115^m \quad \text{ for } i\ge 1. \end{aligned}$$
(2.19)

Thus, if (2.14) holds, then by (2.15) and (2.16) there exist \(k\) many \(H_i\)s in \(\{H_1,\ldots , H_{n}\}\), for some \(1\le k\le \lceil n/N\rceil \), such that \(H_i > N\) and their sum is larger than \(\epsilon n/2\).

For a fixed \(k\), there are at most \(\binom{n}{k}\) ways to choose the subsequence of these \(H_i\)s from \(\{ H_1,\ldots , H_{n}\}\). We denote these chosen random variables by \(H_{i_1}, \ldots , H_{i_k}\). Since \(\{H_i\}\) is an i.i.d. sequence, the joint distribution of \(H_{i_1}, \ldots , H_{i_k}\) is the same for any choice of the \(i_j\)s. With these observations,

$$\begin{aligned} \mathbf{P}\left( h(X_n)\ge h_n(N) +n\epsilon , (2.15) \hbox { holds}\right) \le \sum _{k=1}^{ \lceil n/N\rceil } \binom{n}{k} \mathbf{P} (H_{i_1}+\cdots +H_{i_k} \ge n\epsilon /2). \end{aligned}$$
(2.20)

By (2.19), we know that

$$\begin{aligned} EH_i= EH_1< \infty \quad \text{ for } \text{ each } i\ge 1. \end{aligned}$$

Since \(k \le n/N+1\), we may take \(N=N(\epsilon )\) so large that \(k\mathbf{E}H_1\le n\epsilon /4\) for all such \(k\); then for each \(k\le \lceil n/N\rceil \) and fixed \(i_1,\ldots , i_k\),

$$\begin{aligned} \mathbf{P} (H_{i_1}+\cdots +H_{i_k} \ge n\epsilon /2)\le \mathbf{P} ([H_{i_1}-EH_{i_1}]+\cdots +[H_{i_k}-EH_{i_k}] \ge n\epsilon /4). \end{aligned}$$
(2.21)

Note that \(\{H_{i_j}-EH_{i_j}\}\), \(j=1,\ldots , k\), is an i.i.d. sequence with zero mean and an exponential tail, so by Lemma 2.2,

$$\begin{aligned} \mathbf{P} ([H_{i_1}-EH_{i_1}]+\cdots +[H_{i_k}-EH_{i_k}] \ge n\epsilon /4)\le C_3\exp (-C_4 n). \end{aligned}$$
(2.22)

By a standard entropy bound, as given in Corollary 2.6.2 of Engel [6], for \(k\le n/N\),

$$\begin{aligned} \binom{n}{k}\le \exp (n\log N/N). \end{aligned}$$
(2.23)
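Concretely, this follows from the elementary bound \(\binom{n}{k}\le (en/k)^k\): since \(k\mapsto k\log (en/k)\) is increasing for \(k\le n\), for \(k\le n/N\) we get

$$\begin{aligned} \binom{n}{k}\le \exp \left( k\log {en\over k}\right) \le \exp \left( {n(1+\log N)\over N}\right) , \end{aligned}$$

which matches (2.23) up to the constant in the exponent; all that matters below is that the exponent is \(n\) times a quantity tending to \(0\) as \(N\rightarrow \infty \).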

By (2.20)–(2.23), if we take \(N\) large, then there exist \(C_i=C_i(\epsilon , N)\) for \(i=5,6\) such that

$$\begin{aligned} \mathbf{P}\left( h(X_n)\ge h_n(N) +n\epsilon , (2.15)\, \hbox {holds}\right) \le C_5 n\exp (-C_6 n). \end{aligned}$$
(2.24)

So Lemma 2.3 holds by (2.18) and (2.24). \(\square \)

We also need to control the time differences: \(\tau _k'(N)-\tau _k(N)\) should not be large. We call \(\tau _k(N)\) \(M\)-tight for \(k\ge 1\) if

$$\begin{aligned} \tau _{k}'(N) -\tau _{k}(N) \le M. \end{aligned}$$

We list all \(M\)-tight \(N\)-short cut times as

$$\begin{aligned} \tau _1(N,M), \tau _2(N,M),\ldots , \tau _k(N,M),\ldots . \end{aligned}$$

Suppose that \(\tau _k(N,M) < \infty \). Then \(\tau _k(N,M)=\tau _i\) for some \(i\), and we write \(\tau _k'(N,M)=\tau _{i+1}\). For convenience, we also set \(\tau _0(N,M)=0\) and \( \tau _0'(N,M)=0\) whether or not the root is a cut vertex. Let

$$\begin{aligned} J_n=\max \{i:\tau _i(N,M)\le n\} \end{aligned}$$

and

$$\begin{aligned} h_n(N,M)=\sum _{i=0}^{J_n} \left( h\big (X_{\tau _{i}'(N,M)}\big )-h\big (X_{\tau _i(N,M)} \big )\right) . \end{aligned}$$
(2.25)

If \(J_n=0\), then

$$\begin{aligned} h_n(N,M)=0. \end{aligned}$$
(2.26)
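Continuing the hypothetical transcription above, \(h_n(N,M)\) additionally discards blocks that are not \(M\)-tight:

```python
def h_n_of_NM(heights, n, N, M):
    """h_n(N,M) of (2.25)-(2.26): keep only cut-time blocks that are both
    N-short (height gain <= N) and M-tight (duration <= M), starting by n."""
    taus = cut_times(heights)
    total = 0
    for t1, t2 in zip(taus, taus[1:]):
        gain = heights[t2] - heights[t1]
        if gain <= N and t2 - t1 <= M and t1 <= n:
            total += gain
    return total
```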

The following lemma shows that \(h_n(N,M)/n\) and \(h_n(N)/n\) are not far apart.

Lemma 2.4

For \(\epsilon >0\), for \(N>0\), and for each \(n\), there exists \(M=M(\epsilon , N)\) such that

$$\begin{aligned} h_n(N)\le h_n(N,M) +n\epsilon . \end{aligned}$$

Proof

If \(h_n(N)>h_n(N,M) +n\epsilon \), we know that there are at least \(\epsilon n/2N\) many \(\{\tau _i(N)\}\) such that

$$\begin{aligned} \tau _i'(N)-\tau _{i}(N) > M. \end{aligned}$$
(2.27)

If we take \(M \ge 3 N\epsilon ^{-1} \), then

$$\begin{aligned} n\ge \sum _{i=1}^{I_n} \left( \tau _i'(N)-\tau _{i}(N)\right) > M \epsilon n/2N > n. \end{aligned}$$
(2.28)

The contradiction shows that

$$\begin{aligned} h_n(N)\le h_n(N,M) +n\epsilon . \end{aligned}$$

So Lemma 2.4 follows. \(\square \)

Let \(\mathcal{E}_n(\epsilon )\) be the event that \(h(X_n)\ge n(T-\epsilon )\). By Lemmas 2.3 and 2.4,

$$\begin{aligned} \lim _{n\rightarrow \infty }\mathbf{P}(h_n(N,M) \le Tn/2, \mathcal{E}_n(\epsilon ))=0 . \end{aligned}$$

Note that \(\mathbf{P}(\mathcal{E}_n(\epsilon ))\) is near one for large \(n\), so with probability near one there are at least \(Tn/2M\) many \(\tau _i(N,M)\)s with \(\tau _i(N,M) \le n\) for all large \(n\). Hence, \(\tau _k(N,M)=\infty \) cannot have a positive probability for any \(k\). Therefore,

$$\begin{aligned} \tau _1(N,M)< \tau _2(N,M)< \cdots \tau _k(N,M)< \cdots < \infty . \end{aligned}$$
(2.29)

By (2.29), we know that \(\tau _{k}(N,M)=\tau _i\) for some \(i\) and

$$\begin{aligned} \tau _k'(N,M)-\tau _k(N,M)=\tau _{i+1}-\tau _{i}. \end{aligned}$$

Therefore, by the same proof as (2.7), for \(k\ge 1\),

$$\begin{aligned} \left\{ \left( \tau _{k}'(N,M)-\tau _{k}(N,M), h\big (X_{\tau _{k}'(N,M)}\big )-h\big (X_{\tau _k(N,M)}\big )\right) \right\} \hbox { is an i.i.d. sequence.} \end{aligned}$$
(2.30)

3 Large deviations for \(h_n(N,M)\)

By Lemma 2.1, the following expectations are finite, and we set

$$\begin{aligned} \mathbf{E} (\tau _2-\tau _{1})=A\ge 1 \quad \text{ and } \quad \mathbf{E} \left( \tau _{1}'(N,M)-\tau _{1}(N,M)\right) =A(N,M) \end{aligned}$$

and

$$\begin{aligned} \mathbf{E}(h(X_{\tau _{2}})-h(X_{\tau _{1}}))=B\ge 1 \quad \text{ and } \quad \mathbf{E}\left( h\big (X_{\tau _{1}'(N,M)}\big )-h \big (X_{\tau _{1}(N,M)}\big )\right) =B(N,M). \end{aligned}$$

We set

$$\begin{aligned} T_n=\sum _{k=1}^n (\tau _{k+1}-\tau _{k}) \quad \text{ and } \quad T_n(N,M)= \sum _{k=1}^n \left( \tau _k'(N,M)-\tau _k(N,M)\right) \end{aligned}$$

and

$$\begin{aligned} H_n= \sum _{k=1}^n \left( h\left( X_{\tau _{k+1}}\right) -h\left( X_{\tau _{k}}\right) \right) \quad \text{ and } \quad H_n(N,M)= \sum _{k=1}^n \left( h\big (X_{\tau _{k}'(N,M)}\big )-h\big (X_{\tau _k(N,M)}\big )\right) . \end{aligned}$$

By the law of large numbers,

$$\begin{aligned} \lim _{n\rightarrow \infty }{T_n \over n}=A \quad \text{ and } \quad \lim _{n\rightarrow \infty }{T_n(N,M) \over n}=A(N,M) \end{aligned}$$
(3.1)

and

$$\begin{aligned} \lim _{n\rightarrow \infty }{H_n \over n}=B \quad \text{ and } \quad \lim _{n\rightarrow \infty }{H_n(N,M) \over n}=B(N,M) . \end{aligned}$$
(3.2)

If \(\tau _i\le n\le \tau _{i+1}\) for \(i\ge 1\), then

$$\begin{aligned} h(X_{\tau _i})\le h(X_n) \le h(X_{\tau _{i+1}}). \end{aligned}$$
(3.3)

Thus,

$$\begin{aligned} {h(X_{\tau _i})\over \tau _{i+1}}\le {h(X_n)\over n}\le {h(X_{\tau _{i+1}})\over \tau _i}. \end{aligned}$$
(3.4)

By (3.1) and (3.2),

$$\begin{aligned} \lim _{i\rightarrow \infty } {h(X_{\tau _i})\over \tau _{i+1}}=\lim _{i\rightarrow \infty } {h(X_{\tau _{i+1}})\over \tau _{i}} ={B\over A}. \end{aligned}$$
(3.5)

So by (1.1), (3.4), and (3.5),

$$\begin{aligned} {B\over A}=T. \end{aligned}$$
(3.6)
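The ratio \(T=B/A\) is also easy to estimate numerically: by (2.7), the blocks between consecutive cut times are i.i.d., so empirical block averages converge to \(A\) and \(B\). A hypothetical helper (ours, reusing `reinforced_walk_heights` and `cut_times` from the sketches above):

```python
def estimate_speed(heights):
    """Estimate A = E(tau_{k+1} - tau_k), B = E(h(X_{tau_{k+1}}) - h(X_{tau_k})),
    and hence the speed T = B/A of (3.6), from one long trajectory."""
    taus = cut_times(heights)
    if len(taus) < 2:
        raise ValueError("trajectory too short: fewer than two cut times")
    durations = [t2 - t1 for t1, t2 in zip(taus, taus[1:])]
    gains = [heights[t2] - heights[t1] for t1, t2 in zip(taus, taus[1:])]
    return (sum(gains) / len(gains)) / (sum(durations) / len(durations))

heights = reinforced_walk_heights(b=10, c=2.0, n_steps=200_000)
print("estimated speed T =", estimate_speed(heights))
```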

Regarding \(B(N,M)\) and \(A(N,M)\), we have the following lemma.

Lemma 3.1

For \(c=2\) and \(b \ge 70\),

$$\begin{aligned} \lim _{N,M\rightarrow \infty }A(N,M)=A \quad \text{ and } \quad \lim _{N,M\rightarrow \infty }B(N,M)=B \quad \text{ and } \quad \lim _{N,M\rightarrow \infty }{B(N,M) \over A(N,M)}=T. \end{aligned}$$

Proof

By (2.5) and the definitions of \(\tau _1(N)\) and \(\tau _1(N,M)\), for each sample point \(\omega \), there exist large \(N\) and \(M\) such that

$$\begin{aligned} \tau _1(N,M)(\omega )=\tau _1(\omega ), \end{aligned}$$

where \(\tau _1(N,M)(\omega )\) and \(\tau _1(\omega )\) denote the values of \(\tau _1(N,M)\) and \(\tau _1\) at the sample point \(\omega \). It also follows from the definition of \(\tau '_1(N,M)\) that for the above \(N\) and \(M\),

$$\begin{aligned} \tau _1'(N,M)(\omega )=\tau _2(\omega ). \end{aligned}$$

Thus, for each \(\omega \)

$$\begin{aligned} \lim _{N,M \rightarrow \infty } \tau _1'(N,M)(\omega )-\tau _1(N,M)(\omega )=\tau _2(\omega )-\tau _1(\omega ). \end{aligned}$$
(3.7)

By the dominated convergence theorem,

$$\begin{aligned} \lim _{N,M \rightarrow \infty }A(N,M)=\lim _{N,M \rightarrow \infty } \mathbf{E} (\tau _1'(N,M)-\tau _1(N,M))=\mathbf{E}(\tau _2-\tau _1)=A.\qquad \quad \end{aligned}$$
(3.8)

Similarly,

$$\begin{aligned} \lim _{N,M \rightarrow \infty }B(N,M)=\lim _{N,M \rightarrow \infty } \mathbf{E} \left( h\big (X_{\tau _1'(N,M)}\big )-h\big (X_{\tau _1(N,M)} \big )\right) =B. \end{aligned}$$
(3.9)

Therefore, Lemma 3.1 follows from (3.8), (3.9), and (3.6). \(\square \)

Now we show that \(h_n(N,M)\) has an exponential upper tail.

Lemma 3.2

If \(c=2\) and \(b\ge 70\), then for \(\epsilon >0\), there exist \(N_0=N_0(\epsilon )\) and \(M_0 =M_0(\epsilon )\) such that for all \(N \ge N_0\) and \(M\ge M_0\)

$$\begin{aligned} \mathbf{P} ( h_n(N,M)\ge n (T+\epsilon ))\le C_1 \exp (-C_2 n), \end{aligned}$$
(3.10)

where \(C_i=C_i(\epsilon , N,M)\) for \(i=1,2\) are constants.

Proof

Recall that

$$\begin{aligned} J_n=\max \{i:\tau _i(N,M)\le n\}. \end{aligned}$$

So

$$\begin{aligned}&\mathbf{P} ( h_n(N,M)\ge n (T+B\epsilon ))\nonumber \\&\quad = \mathbf{P} \left( \sum _{i=1}^{J_n} \left( h\big (X_{\tau _{i}'(N,M)}\big )-h\big (X_{\tau _{i}(N,M)}\big )\right) \ge n(T+B\epsilon )\right) \nonumber \\&\quad \le \mathbf{P} \left( \sum _{i=1}^{J_n} \left( h\big (X_{\tau _{i}'(N,M)}\big )-h\big (X_{\tau _{i}(N,M)}\big )\right) \ge n(T+B\epsilon ), J_n\le n\left( {T \over B(N,M)}+\epsilon /2\right) \right) \nonumber \\&\qquad + \mathbf{P} \left( J_n> n\left( {T \over B(N,M)}+\epsilon /2\right) \right) \nonumber \\&\quad \le \mathbf{P} \left( \sum _{i=1}^{n({T /B(N,M)}+\epsilon /2)} \left( h\big (X_{\tau _{i}'(N,M)}\big )-h\big (X_{\tau _{i}(N,M)}\big )\right) \ge n(T+B\epsilon )\right) \nonumber \\&\qquad + \mathbf{P} \left( J_n> n\left( {T \over B(N,M)}+\epsilon /2\right) \right) \nonumber \\&\quad = I+II. \end{aligned}$$
(3.11)

Here, without loss of generality, we assume that \(n({T /B(N,M)}+\epsilon /2)\) is an integer; otherwise we replace it by \(\lceil n({T /B(N,M)}+\epsilon /2)\rceil \). We estimate \(I\) and \(II\) separately. For \(I\), note that by Lemma 3.1, there exist \(N_0=N_0(\epsilon )\) and \(M_0=M_0(\epsilon )\) such that for all \(N\ge N_0\) and \(M\ge M_0\),

$$\begin{aligned}&\mathbf{E} \left( \sum _{i=1}^{n({T /B(N,M)}+\epsilon /2)} \left( h\big (X_{\tau _{i}'(N,M)}\big )-h\big (X_{\tau _{i} (N,M)}\big )\right) \right) \nonumber \\&\quad = n(T+B(N,M)\epsilon /2)\le n(T+2B\epsilon /3). \end{aligned}$$

Note also that by (2.30),

$$\begin{aligned} \left\{ h\left( X_{\tau _{i}'(N,M)}\right) -h\left( X_{\tau _{i}(N,M)} \right) \right\} \hbox { is a uniformly bounded i.i.d. sequence}, \end{aligned}$$

so by Lemma 2.2, there exist \(C_i=C_i(\epsilon , N,M)\) for \(i=3,4\) such that

$$\begin{aligned} \mathbf{P} \left( \sum _{i=1}^{n({T /B(N,M)}+\epsilon /2)} \left( h\big (X_{\tau _{i}'(N,M)}\big )-h\big (X_{\tau _{i}(N,M)}\big )\right) \ge n(T+B\epsilon )\right) \le C_3 \exp (-C_4 n). \end{aligned}$$
(3.12)

Now we estimate \(II\). By Lemma 3.1, there exist \(N_0=N_0(\epsilon , b)\) and \(M_0=M_0(\epsilon , b)\) such that for all \(N\ge N_0\) and \(M \ge M_0\)

$$\begin{aligned} \mathbf{P} \left( J_n> n\left( {T \over B(N,M)}+\epsilon /2\right) \right) \le \mathbf{P} \left( J_n> n\left( A^{-1}(N,M)+\epsilon /3\right) \right) . \end{aligned}$$
(3.13)

Here, without loss of generality, we also assume that \(n(A^{-1}(N,M)+\epsilon /3)\) is an integer; otherwise we replace it by \(\lceil n(A^{-1}(N,M)+\epsilon /3)\rceil \). Note that

$$\begin{aligned} \left\{ J_n \ge n(A^{-1}(N,M)+\epsilon /3)\right\} \subset \left\{ \sum _{i=1}^{n(A^{-1}(N,M)+\epsilon /3)} (\tau _i'(N,M)- \tau _i(N,M)) \le n\right\} .\nonumber \\ \end{aligned}$$
(3.14)

Note also that

$$\begin{aligned} \mathbf{E} \sum _{i=1}^{n(A^{-1}(N,M)+\epsilon /3)} \left( \tau _i'(N,M)- \tau _i(N,M)\right) = n(1+ \epsilon A(N,M)/3), \end{aligned}$$

and, by (2.30), \(\{\tau _i'(N,M)- \tau _i(N,M)\}\) is a uniformly bounded i.i.d. sequence, so by (3.13), (3.14), and Lemma 2.2, there exist \(C_i=C_i(\epsilon , b,N,M)\) for \(i=5,6\) such that

$$\begin{aligned}&\mathbf{P} \left( J_n> n\left( {T \over B(N,M)}+\epsilon /2\right) \right) \nonumber \\&\quad \le \mathbf{P} \left( J_n> n(A^{-1}(N,M)+\epsilon /3)\right) \nonumber \\&\quad \le \mathbf{P}\left( \sum _{i=1}^{n(A^{-1}(N,M)+\epsilon /3)} (\tau _i'(N,M)- \tau _i(N,M)) \le n \right) \nonumber \\&\quad \le C_5\exp (-C_6 n). \end{aligned}$$
(3.15)

For all large \(N\) and \(M\), we substitute (3.12) and (3.15) into (3.11) to obtain

$$\begin{aligned} \mathbf{P} ( h_n(N,M)\ge n (T+B\epsilon ))\le I + II \le C_7 \exp (-C_8 n) \end{aligned}$$
(3.16)

for \(C_i=C_i(\epsilon , N,M)\) for \(i=7,8\). Since \(\epsilon >0\) is arbitrary and \(B\ge 1\), replacing \(\epsilon \) by \(\epsilon /B\) throughout yields (3.10). So Lemma 3.2 follows. \(\square \)

Let

$$\begin{aligned} L_n=\max \{i: \tau _i\le n\} \end{aligned}$$

and

$$\begin{aligned} h_n=\sum _{i=1}^{L_n}\left( h(X_{\tau _{i}})-h(X_{\tau _{i-1}})\right) \text{ if } L_n\ge 1 \quad \text{ and } \quad h_n=0 \text{ if } L_n=0. \end{aligned}$$
(3.17)

Recall that \(\rho _i\) is the leaving time defined above (2.3). We prove the following subadditivity property for \(h_n\).

Lemma 3.3

For \(c=2,\,b \ge 2,\,N >0\), and for each pair of positive integers \(n\) and \(m\),

$$\begin{aligned} \mathbf{P} ( h_n\ge n C, \rho _0 \le N)\,\mathbf{P} ( h_m\ge m C, \rho _0\le N) \le 2^N(b+1)\,n\,\mathbf{P} ( h_{n+m+1}\ge (n+m) C+1,\rho _0\le N), \end{aligned}$$

for any \(C>0\).

Proof

By the definition in (3.17), there exists \(0\le k\le n\) such that

$$\begin{aligned} \tau _k \le n\le \tau _{k+1}. \end{aligned}$$

So

$$\begin{aligned} h_n= h(X_{\tau _k})\le h(X_n)\le h(X_{\tau _{k+1}}). \end{aligned}$$
(3.18)

For \( i\ge nC\), we denote by \(\mathcal{F}(x, i, N, nC)\) the event that the walk \(\{X_1, X_2, \ldots , X_{i}\}\) satisfies

$$\begin{aligned} h(X_j)< nC \quad \text{ for } j< i \quad \text{ and } \quad X_i=x \text{ with } h(x) \ge nC, \end{aligned}$$
(3.19)

and that, in addition, \(\{X_1, X_2, \ldots , X_{i}\}\) visits the root no more than \(N\) times.

Note that on \(\{h_n\ge n C, \rho _0\le N\}\), the walk eventually moves to some vertex \(x\) with \(h(x) \ge nC\) at some time \(i\), and \(\{X_1, X_2,\ldots , X_i\}\) visits the root no more than \(N\) times. So we may control \(\{h_n\ge n C, \rho _0\le N\}\) by the finitely many steps \(\{X_1, X_2, \ldots , X_{i}\}\), in order to set up a coupling below. More precisely,

$$\begin{aligned} \mathbf{P} ( h_n\ge n C, \rho _0\le N)\le \sum _{i\le n}\sum _{x} \mathbf{P} \left( \mathcal{F}(x, i, N, nC) \right) \!. \end{aligned}$$
(3.20)

There are \(b+1\) vertices adjacent to \(x\). We select one of the \(b\) children of \(x\) and denote it by \(z\), so that \(h(z)=h(x)+1\). Let \(e_z\) be the edge with the vertices \(x\) and \(z\). On \(\mathcal{F}(x,i,N, nC)\), we require that the next move \(X_{i+1}\) be from \(x\) to \(z\); thus \(X_{i+1}=z\). We denote this subevent by \(\mathcal{G}(x,z, i, N, nC)\subset \mathcal{F}(x,i,N, nC)\). We have

$$\begin{aligned} \sum _{i\le n}\sum _{x} \mathbf{P} \left( \mathcal{F}(x, i, N, nC) \right) \le (b+1)\sum _{i\le n}\sum _{x} \mathbf{P} \left( \mathcal{G}(x,z, i, N, nC )\right) \!. \end{aligned}$$
(3.21)

Now we focus on \(\{h_m\ge Cm, \rho _0\le N\}\). Let \(\mathbf{T}_z\) be the subtree consisting of \(z\) and its descendants, rooted at \(z\). We define \(\{X_n^i(z)\}\) to be the linearly reinforced random walk starting from \(z\) in the subtree \(\mathbf{T}_z\), defined for \(n\ge i+1\), with

$$\begin{aligned} X_{i+1}^{i}(z)=z \quad \text{ and } \quad w(e_z, i+1)=2. \end{aligned}$$

Note that the walk \(\{X_n^i(z)\}\) stays inside \(\mathbf{T}_z\), so

$$\begin{aligned} w(e_z, n) =2 \quad \text{ for } n\ge i+1. \end{aligned}$$
(3.22)

We define \(\tau _k^i\), \(\rho ^i_0\), and \(h_m^i(z)\) for \(\{X_n^i(z)\}\) in the same way as \(\tau _k,\,\rho _0\), and \(h_m\) were defined for \(\{X_n\}\).

Given \(w(e_z,i+1)=2\), we compare \(\mathbf{P}(h_m \ge Cm, \rho _0\le N)\) with \(\mathbf{P}(h_m^i(z) \ge Cm, \rho _0^i\le N)\). Note that there are only \(b\) edges at the root, but there are \(b+1\) edges at vertex \(z\), including \(e_z\) with \(w(e_z,n)=2\), so the two probabilities are not the same. We claim that

$$\begin{aligned} \mathbf{P}(h_m\ge m C, \rho _0\le N)\le 2^N \mathbf{P}(h_m^i(z) \ge Cm, \rho _0^i\le N\,\,|\,\, w(e_z, i+1)=2). \end{aligned}$$
(3.23)

To show (3.23), we consider a fixed path \((u_0=\mathbf{0}, u_1, u_2, \ldots )\) in \(\mathbf{T}\) with \(\{X_1=u_1, X_2=u_2, \ldots \}\in \{h_m \ge Cm, \rho _0\le N\} \). Note that \([u_j, u_{j+1}]\) is an edge in \(\mathbf{E}\). Since \(\mathbf{T}\) is homogeneous, mapping the root of \(\mathbf{T}\) to \(z\) identifies \(\mathbf{T}\) with \(\mathbf{T}_z\); under this identification, the path \((\mathbf{0}, u_1, u_2, \ldots )\) in \(\mathbf{T}\) becomes a path \((u_0(z)=z, u_1(z), u_2(z), \ldots )\) in \(\mathbf{T}_z\). Thus, if

$$\begin{aligned} \{X_0=\mathbf{0}, X_1=u_1,X_2=u_2, \ldots \}\in \{h_m \ge Cm, \rho _0\le N\} , \end{aligned}$$

then

$$\begin{aligned} \{X_{i+1}^i(z)=z, X_{i+2}^i(z)=u_1(z), \ldots \}\in \{h_m ^i(z)\ge Cm, \rho _0^i\le N\} . \end{aligned}$$

On the other hand, given a fixed path \(\{\mathbf{0}, u_1,\ldots , u_j, \ldots \}\), it follows from the definition of \(\{z, u_{1}(z), \ldots , u_{j}(z), \ldots \}\) that

$$\begin{aligned} w\left( [u_j ,u_{j+1}], k\right) = w\left( [u_{j}(z), u_{j+1}(z)], i+1+k\right) \end{aligned}$$
(3.24)

for any positive integers \(j\) and \(k\). We may focus on a finite initial segment \(\{\mathbf{0}, u_1, \ldots, u_l\}\) of \(\{\mathbf{0}, u_1, \ldots \}\). Now if we can show that for all large \(l\), and for each path \(\{\mathbf{0}, u_1, u_2, \ldots , u_l\}\),

$$\begin{aligned}&\mathbf{P} (X_1=u_1, X_2=u_2, \ldots , X_l=u_l)\nonumber \\&\quad \le 2^N \mathbf{P}\left( X_{i+2}^i(z)=u_1(z), X_{i+3}^i(z)=u_2(z),\ldots , X_{i+1+l}^i(z)=u_l(z)\,\,|\,\, w(e_z, i+1)=2\right) , \end{aligned}$$
(3.25)

then (3.23) follows by summing both sides of (3.25) over all possible paths \(\{\mathbf{0}, u_1, u_2, \ldots, u_l\}\) and letting \(l\rightarrow \infty \). Therefore, to show (3.23), we need to show (3.25).

Note that

$$\begin{aligned} \mathbf{P} (X_1\!=\!u_1, X_2=u_2, \ldots , X_l=u_l)\!=\!\prod _{j=1}^l \mathbf{P}(X_j\!=\!u_j\,\,|\,\, X_{j-1}\!=\!u_{j-1},\ldots , X_1=u_1)\nonumber \\ \end{aligned}$$
(3.26)

and

$$\begin{aligned}&\mathbf{P} (X_{i+2}^i(z)=u_1(z), X_{i+3}^i(z)=u_2(z), \ldots , X_{i+1+l}^i(z)=u_l(z))\nonumber \\&\quad =\prod _{j=1}^l \mathbf{P}(X_{i+1+j}^i(z)=u_j(z)\,\,|\,\, X_{i+j}^i(z)=u_{j-1}(z),\ldots , X_{i+2}^i(z)=u_1(z), w(e_z, i+1)=2). \end{aligned}$$
(3.27)

If \(u_{j-1}=\mathbf{0}\), then

$$\begin{aligned} \mathbf{P}(X_j=u_j\,\,|\,\, X_{j-1}=u_{j-1},\ldots ,X_1=u_1)={w([u_{j-1}, u_{j}], j-1) \over \sum _{e} w(e, j-1)}, \end{aligned}$$
(3.28)

where the sum in (3.28) ranges over all edges adjacent to the root in \(\mathbf{T}\). On the other hand, if \(u_{j-1}=\mathbf{0}\), then \(u_{j-1}(z)=z\), and by (3.22),

$$\begin{aligned}&\mathbf{P}(X_{i+1+j}^i(z)\!=\!u_j(z)\,\,|\,\, X_{i+j}^i(z)\!=\!u_{j-1}(z),\ldots , X_{i+2}^i(z)=u_1(z),w(e_z, i+1)=2)\nonumber \\&\quad = {w([u_{j-1}(z), u_{j}(z)], i+j) \over \sum _{e} w(e,i+ j)+w(e_z,i+ j)}\!=\!{w([u_{j-1}(z), u_{j}(z)], i+j) \over \sum _{e} w(e, i+j)+2} , \end{aligned}$$
(3.29)

where the sum in (3.29) ranges over all edges adjacent to \(z\) with vertices in \(\mathbf{T}_z\) (not including \(e_z\)). We compare the numerators on the right sides of (3.28) and (3.29). If \(X_1, \ldots, X_{j-1}\) never visit \(u_j\), then both \(w([u_{j-1}, u_j],j-1)=1\) and \(w([u_{j-1}(z), u_{j}(z)], i+j) =1\). Otherwise, by (3.24) the two numerators are also the same. Similarly, the two sums in the denominators on the right sides of (3.28) and (3.29) are the same. Therefore, if \(u_{j-1}=\mathbf{0}\), since \( \sum _{e} w(e, j-1)\ge b\ge 2\) for all \(j\), we have

$$\begin{aligned}&2 \mathbf{P}(X_{i+1+j}^i(z)=u_j(z)\,\,|\,\, X_{i+j}^i(z)=u_{j-1}(z),\ldots , X_{i+2}^i(z)=u_1(z),w(e_z, i+1)=2) \nonumber \\&\quad \ge \mathbf{P}(X_j=u_j\,\,|\,\, X_{j-1}=u_{j-1},\ldots , X_1=u_1). \end{aligned}$$
(3.30)

If \(u_{j-1}\ne \mathbf{0}\), the extra term \(w(e_z,i+ j)\) does not appear in the denominator on the right side of (3.29). So by the same argument as for (3.30), if \(u_{j-1}\ne \mathbf{0}\),

$$\begin{aligned}&\mathbf{P}(X_{i+1+j}^i(z)=u_j(z)\,\,|\,\, X_{i+j}^i(z)=u_{j-1}(z),\ldots , X_{i+2}^i(z)=u_1(z),w(e_z, i+1)=2) \nonumber \\&\quad = \mathbf{P}(X_j=u_j\,\,|\,\, X_{j-1}=u_{j-1},\ldots , X_1=u_1). \end{aligned}$$
(3.31)

Since we restrict to \(\rho _0\le N\) and \(\rho ^i_0\le N\), the walk \(\{X_1,X_2,\ldots \}\) visits the root no more than \(N\) times, and the walk \(\{X^i_{i+2}(z), X_{i+3}^i(z), \ldots \}\) visits \(z\) no more than \(N\) times. This indicates that there are at most \(N\) indices \(j\) with \(u_{j-1}=\mathbf{0}\) for \(1\le j\le l\) along the above path \(\{\mathbf{0}, u_1, \ldots , u_l \}\). Thus, (3.25) follows from (3.26)–(3.31), and so does (3.23).

With (3.23), we can now prove Lemma 3.3. Note that \(\{h_{m}^i(z)\ge m C, \rho _0^i\le N\}\) depends only on the weights of the edges with vertices inside \(\mathbf{T}_z\), on the weight \(w(e_z, i+1)\), and on the walk during the time interval \([i+2, \infty )\). In contrast, on \(\mathcal{G}(x, z, i,N, nC)\), the last move of the walk \(\{X_1, \ldots , X_{i}, X_{i+1}\}\) is from \(x\) to \(z\), while the other moves use the edges with vertices inside \(\{y: h(y)\le h(z)-1\}\). So by (3.23),

$$\begin{aligned}&\mathbf{P} (h_m\ge Cm, \rho _0 \le N)\nonumber \\&\quad \le 2^N \mathbf{P}\left( h_m^i(z) \ge Cm, \rho _0^i\le N\,\,|\,\, w(e_z, i+1)=2\right) \nonumber \\&\quad \le 2^N\mathbf{P}\left( h_m^i (z) \ge Cm, \rho _0^i\le N\,\,|\,\, \mathcal{G}(x,z,i,N, nC)\right) . \end{aligned}$$
(3.32)

By (3.21) and (3.32),

$$\begin{aligned}&\mathbf{P} ( h_n\ge n C,\rho _0\le N)\mathbf{P}(h_m\ge mC,\rho _0\le N)\nonumber \\&\quad \le \sum _{i\le n} \sum _x 2^{N}(b+1)\mathbf{P} \left( \mathcal{G}(x, z,i, N,nC), h_m^i(z)\ge mC, \rho ^i_0\le N\right) .\qquad \end{aligned}$$
(3.33)

If \(i\le n\), then

$$\begin{aligned} h_{m}^i (z)\le h_{m+n-i}^i(z). \end{aligned}$$
(3.34)

By (3.33) and (3.34),

$$\begin{aligned}&\mathbf{P} ( h_n\ge n C, \rho _0\le N)\mathbf{P}(h_m\ge mC,\rho _0\le N)\nonumber \\&\quad \le \sum _{i\le n}\sum _x 2^N(b+1)\mathbf{P} \left( \mathcal{G}(x,z, i, N,nC), h_m^i(z)\ge mC, \rho ^i_0\le N\right) \nonumber \\&\quad \le \sum _{i\le n} 2^N (b+1)\mathbf{P} \left( \bigcup _x \left\{ \mathcal{G}(x, z,i,N, nC),h_{m+n-i}^i(z)\ge m C\right\} \right) . \end{aligned}$$
(3.35)

Note that for each \(x\) and \(i\),

$$\begin{aligned} \left\{ \mathcal{G}(x,z,i, N,nC),h_{m+n-i}^i(z)\ge m C\right\} \end{aligned}$$

implies that the walk first moves to \(x\) at time \(i\) with \(h(x) \ge nC\), and \(\{X_1, \ldots , X_{i}\}\) returns to the root no more than \(N\) times. After that, the walk moves from \(x\) to \(z\), and after this move, the walk stays inside the subtree \(\mathbf{T}_z\). So \(i\) is a cut time and \(X_{i}\) is a cut vertex with \(h(X_i) \ge nC\). Therefore, together with \(h_{n+m-i}^i(z) \ge mC\), \( \left\{ \mathcal{G}(x, z, i,N, nC),h_{m+n-i}^i(z)\ge m C\right\} \) implies that \(\{h_{n+m+1} \ge (n+m)C+1, \rho _0\le N\}\) occurs. In other words,

$$\begin{aligned} \left\{ \mathcal{G}(x, z, i,N, nC),h_{m+n-i}^i(z)\ge m C\right\} \subset \{h_{n+m+1}\ge (n+m)C+1, \rho _0\le N\}.\nonumber \\ \end{aligned}$$
(3.36)

Therefore,

$$\begin{aligned} \bigcup _{x} \left\{ \mathcal{G}(x,z, i, N,nC),h_{m+n-i}^i(z)\ge m C\right\} \subset \{h_{n+m+1}\ge (n+m)C+1,\rho _0\le N\}.\nonumber \\ \end{aligned}$$
(3.37)

Finally, by (3.35) and (3.37),

$$\begin{aligned}&\mathbf{P} ( h_n\ge n C,\rho _0\le N)\mathbf{P} ( h_m\ge m C,\rho _0\le N)\nonumber \\&\quad \le 2^N (b+1)n \mathbf{P} ( h_{n+m+1}\ge (n+m) C+1,\rho _0\le N). \end{aligned}$$
(3.38)

Therefore, Lemma 3.3 follows from (3.38). \(\square \)

We let

$$\begin{aligned} a_n=-\log \mathbf{P}(h_n \ge (T+\epsilon ) n, \rho _0\le N). \end{aligned}$$
(3.39)

We may take \(\epsilon \) small such that \(T+\epsilon < 1\). By Lemma 3.3, for any \(n\) and \(m\)

$$\begin{aligned} a_{n+m+1}\le a_n+ a_m + \log n + N \log 2+\log (b+1). \end{aligned}$$
(3.40)

By (3.40) and a standard subadditive argument (see (II.6) in Grimmett [7]), we have the following lemma.

Lemma 3.4

For \(c=2\) and any \(N>0\) and \(b \ge 2\), there exists \(0\le \alpha (N) <\infty \) such that

$$\begin{aligned}&\lim _{n\rightarrow \infty } {-1\over n} \log \mathbf{P}(h_n \ge (T+\epsilon )n, \rho _0\le N) \nonumber \\&\quad =\inf _{n} \left\{ {-1\over n} \log \mathbf{P}(h_n \ge (T+\epsilon )n, \rho _0\le N)\right\} =\alpha (N). \end{aligned}$$

It follows from the definition and Lemma 3.4 that \(\alpha (N)\) is non-negative and non-increasing in \(N\). Thus, there exists a finite number \(\alpha \ge 0\) such that

$$\begin{aligned} \lim _{N\rightarrow \infty } \alpha (N)=\alpha . \end{aligned}$$
(3.41)

By (3.41) and Lemma 3.4, for each \(N\),

$$\begin{aligned} \alpha \le \alpha (N)\le {-1\over n} \log \mathbf{P}(h_n \ge (T+\epsilon )n, \rho _0\le N). \end{aligned}$$
(3.42)

On the other hand, note that the walk is transient, so \(\rho _0 < \infty \). Thus, for any fixed \(n\),

$$\begin{aligned} \lim _{N\rightarrow \infty }{-1\over n} \log \mathbf{P}(h_n \ge (T+\epsilon )n, \rho _0\le N)={-1\over n} \log \mathbf{P}(h_n \ge (T+\epsilon )n)\qquad \quad \end{aligned}$$
(3.43)

By (3.42) and (3.43),

$$\begin{aligned} \alpha \le \liminf _n{-1\over n} \log \mathbf{P}(h_n \ge (T+\epsilon )n) \end{aligned}$$
(3.44)

Note that for each \(N\),

$$\begin{aligned}&\limsup _{n} {-1\over n} \log \mathbf{P}(h_n \ge (T+\epsilon )n)\\&\quad \le \lim _{n\rightarrow \infty } {-1\over n} \log \mathbf{P}(h_n \ge (T+\epsilon )n, \rho _0\le N)=\alpha (N). \end{aligned}$$

So for each \(\delta >0\) we may take \(N\) large such that

$$\begin{aligned} \limsup _n{-1\over n} \log \mathbf{P}(h_n \ge (T+\epsilon )n)\le \alpha (N) \le \alpha +\delta . \end{aligned}$$
(3.45)

We summarize (3.44) and (3.45) as the following lemma.

Lemma 3.5

For \(c=2\) and any \(b \ge 2\), there exists a constant \(\alpha \ge 0\) such that

$$\begin{aligned} \lim _{n\rightarrow \infty }{-1\over n} \log \mathbf{P}(h_n \ge (T+\epsilon )n)= \alpha . \end{aligned}$$

4 Proof of Theorem 1

Note that, for \(\epsilon < 1- T\) and for all large \(n\), at every step the \(b\) fresh edges leading away from the root each have weight \(1\), while the single edge just traversed has weight \(2\), so each forward step has probability at least \(b/(b+2)\). Therefore,

$$\begin{aligned} \left( {b\over b+2}\right) ^n \le \mathbf{P}(h(X_{i+1})> h(X_i)\quad \text{ for } 0\le i\le n) \le \mathbf{P}(h(X_n)\ge n(T+\epsilon )). \end{aligned}$$
(4.1)

By (4.1),

$$\begin{aligned} \limsup _{n\rightarrow \infty } {-1\over n} \log \mathbf{P} ( h(X_n)\ge n(T+\epsilon ))< \infty . \end{aligned}$$
(4.2)

Note also that

$$\begin{aligned}&\mathbf{P} ( h(X_n)\ge n(T+\epsilon ))\nonumber \\&\quad \le \mathbf{P} ( h(X_n)\ge n(T+\epsilon ), h_n(N,M) \ge n(T+\epsilon /2))\nonumber \\&\qquad +\mathbf{P} ( h(X_n)- h_n(N,M) \ge n\epsilon /2). \end{aligned}$$
(4.3)

By Lemmas 2.3 and 2.4, for \(\epsilon >0\), we select \(N\) and \(M\) such that

$$\begin{aligned} \mathbf{P} ( h(X_n)\ge n(T+\epsilon ))\le \mathbf{P} ( h_n(N,M) \ge n(T+\epsilon /2))+C_1\exp (-C_2 n).\quad \end{aligned}$$
(4.4)

For \(N\) and \(M\) in (4.4), we may require that \(N \ge N_0\) and \(M \ge M_0\) for \(N_0\) and \(M_0\) in Lemma 3.2. By (4.4) and Lemma 3.2, there exist \(C_i=C_i(\epsilon , N,M)\) for \(i=3,4\) such that

$$\begin{aligned} \mathbf{P} ( h(X_n)\ge n(T+\epsilon ))\le C_3\exp (-C_4 n). \end{aligned}$$
(4.5)

By (4.5), for \(\epsilon >0\),

$$\begin{aligned} 0< \liminf _{n\rightarrow \infty } {-1\over n} \log \mathbf{P} ( h(X_n)\ge n(T+\epsilon )). \end{aligned}$$
(4.6)

It remains to show the existence of the limit in Theorem 1. We use an argument similar to the proof of Lemma 3.3. Let \(\mathcal{F}(x, k, n)\) be the event that \(h(X_i) < n(T+\epsilon )\) for \( i=1,\ldots ,k-1\), and \(X_{k}=x\) with \(h(x)\ge n(T+\epsilon )\), for \(k\le n\). Thus,

$$\begin{aligned} \mathbf{P}( h(X_n) \ge n(T+\epsilon ))\le \sum _{k\le n} \sum _{x\in \mathbf{T}} \mathbf{P} (\mathcal{F}(x, k, n)) \end{aligned}$$
(4.7)

Note that \(\mathcal{F}(x, k, n)\) depends on the finitely many steps \(\{X_0, \ldots , X_{k}\}\). We need to couple the remaining walk \(\{X_{k+1}, X_{k+2}, \ldots \}\) so that \(k\) is a cut time. Let \(\mathcal{Q}(x,k)\) be the event that \(X_{k}=x\) and \(\{X_t\}\) stays inside \( \mathbf{T}_x\) and never returns to \(x\) for \(t > k\). Since the walks are transient, we may let

$$\begin{aligned} \mathbf{P} (\mathcal{Q}(\mathbf{0}, 0))=\nu >0. \end{aligned}$$
(4.8)

Let \(e_x\) denote the edge with vertices \(x\) and \(w\), where \(h(w) =h(x)-1\). We know that \(\mathcal{Q}(x, k)\) depends only on the weight \(w(e_x, k)\) and on the weights of the edges with vertices in \(\mathbf{T}_x\).

Therefore, by the same discussion of (3.23) in Lemma 3.3,

$$\begin{aligned} 2\mathbf{P} (\mathcal{Q}(x,k)\,\,\, |\,\,\,\mathcal{F}(x, k, n))\ge \left( {b+2\over b}\right) \mathbf{P} (\mathcal{Q}(x,k)\,\,\, |\,\,\,\mathcal{F}(x, k, n)) \ge \nu . \end{aligned}$$
(4.9)

Thus, by (4.7) and (4.9),

$$\begin{aligned}&\mathbf{P} (h(X_n)\ge n(T+\epsilon ))\nonumber \\&\le \sum _{x\in \mathbf{T}}\sum _{k\le n} \mathbf{P} \left( \mathcal{F}(x, k,n) \right) \mathbf{P} (\mathcal{Q}(x,k)\,\,\, |\,\,\,\mathcal{F}(x,k,n))\left( {b+2\over b}\right) \nu ^{-1}\nonumber \\&\le 2\nu ^{-1}\sum _{k\le n}\mathbf{P} \left( \bigcup _{x\in \mathbf{T} }\mathcal{F}(x,k,n)\cap \mathcal{Q}(x,k)\right) . \end{aligned}$$
(4.10)

If \( \mathcal{F}(x,k,n) \cap \mathcal{Q}(x,k)\) occurs, it implies that the walks move to \(x\) at \(k \le n\) with \(h(x) \ge n(T+\epsilon )\). After that, the walks continue to move inside \(\mathbf{T}_x\) from \(x\) and never return to \(x\). This implies that \(k\) is a cut time and \(X_{k}\) is a cut vertex with \(h(X_{k}) \ge n(T+\epsilon )\). So for \(0\le k\le n\) and for each \(x\),

$$\begin{aligned} \mathcal{F}(x,k,n) \cap \mathcal{Q}(x,k)\subseteq \{h_{k}\ge n(T+\epsilon )\}. \end{aligned}$$
(4.11)

Thus,

$$\begin{aligned} \bigcup _{x\in \mathbf{T}}\mathcal{F}(x,k,n) \cap \mathcal{Q}(x,k)\subseteq \{h_{k}\ge n(T+\epsilon )\}. \end{aligned}$$
(4.12)

Note that for \(0\le k\le n\),

$$\begin{aligned} h_{k} \le h_n. \end{aligned}$$
(4.13)

By (4.10)–(4.13),

$$\begin{aligned} \mathbf{P} (h(X_n)\ge n(T+\epsilon ))\le 2\nu ^{-1} n \mathbf{P}(h_{n}\ge n(T+\epsilon )). \end{aligned}$$
(4.14)

On the other hand, if \(\tau _k\le n\le \tau _{k+1}\), then by (3.18),

$$\begin{aligned} h_n=h\left( X_{\tau _k}\right) \le h(X_n). \end{aligned}$$
(4.15)

By (4.15),

$$\begin{aligned} \mathbf{P} (h_{n}\ge n(T+\epsilon ))\le \mathbf{P} (h(X_n) \ge n(T+\epsilon )). \end{aligned}$$
(4.16)

Now we are ready to show Theorem 1.

Proof of Theorem 1.

Together with (4.14), (4.16), and Lemma 3.5,

$$\begin{aligned} \lim _{n\rightarrow \infty } {-1\over n} \log \mathbf{P} ( h(X_n)\ge n(T+\epsilon ))=\alpha . \end{aligned}$$
(4.17)

By (4.2) and (4.6),

$$\begin{aligned} 0< \alpha < \infty . \end{aligned}$$
(4.18)

Therefore, Theorem 1 follows from (4.17) and (4.18). \(\square \)

5 Proof of Theorem 2

We define cut times \(\tau _i\) for \(\{Y_n\}\) exactly as for the linearly reinforced random walk. Then \(\left( \tau _{k+1}-\tau _k, h(Y_{\tau _{k+1}})-h(Y_{\tau _k})\right) \) is an i.i.d. sequence. Following Lemmas 7 and 8 of Durrett et al. [5], one can show that there exist \(C_i\) for \(i=1,2\) such that, for each \(k\ge 1\),

$$\begin{aligned} \mathbf{P}( \tau _{k+1}-\tau _k\ge m)\le C_1\exp (-C_2m) \end{aligned}$$
(5.1)

and

$$\begin{aligned} \mathbf{P}( h(Y_{\tau _{k+1}})-h(Y_{\tau _k})\ge m)\le C_1\exp (-C_2m). \end{aligned}$$
(5.2)

By (5.1) and (5.2), and similarly to our approach for the linearly reinforced random walk, we set

$$\begin{aligned} S_n= \sum _{k=1}^n (\tau _k-\tau _{k-1})\quad \text{ and } \quad K_n= \sum _{k=1}^n \left( h(Y_{\tau _{k}})-h(Y_{\tau _{k-1}})\right) . \end{aligned}$$
(5.3)

By the law of large numbers,

$$\begin{aligned} \lim _{n\rightarrow \infty }{S_n \over n}=A\quad \text{ and } \quad \lim _{n\rightarrow \infty }{K_n \over n}=B. \end{aligned}$$
(5.4)

With these observations, Theorem 2 follows from the same proof as Theorem 1. In fact, we do not need to truncate \(\tau _i\) to \(\tau _i(N,M)\) as we did for Theorem 1, since we can use (5.1) and (5.2) directly.\(\square \)

6 Proof of Theorem 3

Now we need to estimate \(\mathbf{P}( h(Y_n)\le n(S-\epsilon ))\). Let

$$\begin{aligned} L_n=\max \{i: \tau _i \le n\} \end{aligned}$$

and let

$$\begin{aligned} h_n=\sum _{i=1}^{L_n} \left( h\left( Y_{\tau _{i}}\right) -h\left( Y_{\tau _{i-1}}\right) \right) \quad \text{ if } L_n\ge 1\quad \text{ and } \quad h_n=0 \text{ if } L_n=0. \end{aligned}$$
(6.1)

By (1.7), (5.3), and an argument similar to (3.6), we have

$$\begin{aligned} {B\over A}=S. \end{aligned}$$
(6.2)

Since \(h_n \le h(Y_n)\), by (5.1)

$$\begin{aligned}&\mathbf{P}( h(Y_n)\le n(S-\epsilon B))\nonumber \\&\quad \le \mathbf{P}( h_n \le n(S-\epsilon B))\nonumber \\&\quad \le \mathbf{P}\left( \sum _{i=1}^{L_n} \left( h\left( Y_{\tau _{i}}\right) -h\left( Y_{\tau _{i-1}}\right) \right) \le n(S-\epsilon B)\right) +\mathbf{P}( \tau _1 > n)\nonumber \\&\quad \le \mathbf{P}\left( \sum _{i=1}^{L_n} \left( h\left( Y_{\tau _{i}}\right) -h\left( Y_{\tau _{i-1}}\right) \right) \le n(S-\epsilon B)\right) +C_1\exp (-C_2 n). \end{aligned}$$
(6.3)

We split

$$\begin{aligned}&\mathbf{P}\left( \sum _{i=1}^{L_n} (h(Y_{\tau _{i}})-h(Y_{\tau _{i-1}})) \le n(S-\epsilon B)\right) \\&\quad \le \mathbf{P}\left( \sum _{i=1}^{L_n} (h(Y_{\tau _{i}})-h(Y_{\tau _{i-1}})) \le n(S-\epsilon B), L_n \ge n(SB^{-1} -\epsilon /2)\right) \\&\qquad +\mathbf{P}\left( L_n < n(SB^{-1} -\epsilon /2)\right) =I+II. \end{aligned}$$

We estimate \(I\) and \(II\) separately:

$$\begin{aligned} I&= \mathbf{P}\left( \sum _{i=1}^{L_n} (h(Y_{\tau _{i}})-h(Y_{\tau _{i-1}}) )\le n(S-\epsilon B), L_n \ge n(SB^{-1} -\epsilon /2)\right) \nonumber \\&\le \mathbf{P}\left( \sum _{i=1}^{n(SB^{-1} -\epsilon /2)} (h(Y_{\tau _{i}})-h(Y_{\tau _{i-1}})) \le n(S-\epsilon B)\right) . \end{aligned}$$
(6.4)

Note that

$$\begin{aligned} \mathbf{E}\left( \sum _{i=1}^{n(SB^{-1} -\epsilon /2)} (h(Y_{\tau _{i}})-h(Y_{\tau _{i-1}}))\right) = n(S -\epsilon B/2). \end{aligned}$$
(6.5)

Note also that by (5.2), \(\{h(Y_{\tau _{i}})-h(Y_{\tau _{i-1}})\}\) is an i.i.d. sequence with an exponential tail for \(i\ge 2\), so by Lemma 2.2 there exist \(C_i=C_i(\epsilon , B)\) for \(i=3,4\) such that

$$\begin{aligned} I\le C_3\exp (-C_4 n). \end{aligned}$$
(6.6)

Also, by (6.2),

$$\begin{aligned} II&= \mathbf{P}\left( L_n < n(SB^{-1} -\epsilon /2)\right) =\mathbf{P} \left( \sum _{i=1}^{n(SB^{-1} -\epsilon /2)} (\tau _i-\tau _{i-1})\ge n\right) \nonumber \\&= \mathbf{P} \left( \sum _{i=1}^{n(A^{-1} -\epsilon /2)} (\tau _i-\tau _{i-1})\ge n\right) . \end{aligned}$$

Note that

$$\begin{aligned} \mathbf{E}\sum _{i=1}^{n(A^{-1} -\epsilon /2)} (\tau _i-\tau _{i-1})= n(1-\epsilon A/2). \end{aligned}$$
(6.7)

Note also that by (5.1), \(\{\tau _i-\tau _{i-1}\}\) is an i.i.d. sequence with an exponential tail for \(i\ge 2\), so by Lemma 2.2, there exist \(C_i=C_i(\epsilon , B)\) for \(i=5,6\) such that

$$\begin{aligned} II \le C_5 \exp (-C_6 n). \end{aligned}$$
(6.8)

Together with (6.3), (6.4), (6.6), and (6.8), there exist \(C_i=C_i(c, \epsilon , B)\) for \(i=7,8\) such that

$$\begin{aligned} \mathbf{P}( h(Y_n)\le n(S-\epsilon ))\le C_7 \exp (-C_8 n). \end{aligned}$$
(6.9)

From (6.9),

$$\begin{aligned} 0< \liminf {-1\over n} \log \mathbf{P}( h(Y_n)\le n(S-\epsilon )). \end{aligned}$$
(6.10)

If the walk spends its first \(n\) steps moving back and forth along a single edge incident to the root, this event has probability at least \(C^n\) for a positive constant \(C=C(b,c)<1\). Thus, for \(\epsilon < S\) and for all large \(n\),

$$\begin{aligned} C^n\le \mathbf{P}(h(Y_n)\le 1) \le \mathbf{P}(h(Y_n)\le n(S-\epsilon )). \end{aligned}$$
(6.11)
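For instance, a crude but explicit choice of the constant: the first traversal of a fixed edge \(e\) at the root has probability \(1/b\); afterwards, under rule (1.6), \(e\) has weight \(c\) while every competing edge keeps weight \(1\), so each subsequent back-and-forth step along \(e\) has probability at least \(c/(c+b)\). Hence

$$\begin{aligned} \mathbf{P}(h(Y_n)\le 1)\ge {1\over b}\left( {c\over c+b}\right) ^{n-1} \ge C^n \quad \text{ with } C=\min \left\{ {1\over b},\, {c\over c+b}\right\} . \end{aligned}$$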

So for \(\epsilon < S\),

$$\begin{aligned} \limsup {-1\over n} \log \mathbf{P}( h(Y_n)\le n(S-\epsilon ))< \infty . \end{aligned}$$
(6.12)

Therefore, Theorem 3 follows from (6.10) and (6.12). \(\square \)