# Statistics in conjugacy classes in free groups


## Abstract

In this paper, we establish statistical results for a convex co-compact action of a free group on a CAT(\(-\,1\)) space where we restrict to a non-trivial conjugacy class in the group. In particular, we obtain a central limit theorem where the variance is twice the variance that appears when we do not make this restriction.

## Keywords

Free group, Conjugacy class, Convex co-compact, Central limit theorem, Subshift of finite type, Measure of maximal entropy, Thermodynamic formalism, Transfer operators, Generating functions

## Mathematics Subject Classification

20F65, 20F67, 20F69, 37C30, 37D35, 60B10

## 1 Introduction and results

Let \(\Gamma \) be a free group acting convex co-compactly on a CAT(\(-1\)) space (*X*, *d*) (i.e. the quotient of the intersection of *X* and the convex hull of the limit set of \(\Gamma \) is compact). There has been considerable work in trying to understand the statistics of such an action. For example, the following result (a particular case of the Švarc–Milnor lemma) is well-known. Fix a free generating set \({\mathcal {A}} = \{a_1,\ldots ,a_p\}\) and let \(|\cdot |\) denote word length on \(\Gamma \) with respect to \({\mathcal {A}}\). Then, for an arbitrary base point \(o\in X\), there exist constants \(C_1,C_2>0\) such that

\[C_1 |x| \le d(o,xo) \le C_2 |x| \quad \text{for all } x \in \Gamma .\]

Thus |*x*| and *d*(*o*, *xo*) are comparable quantities and it is natural to ask if more precise estimates hold, at least typically or on average.

In this paper, we shall consider the corresponding questions when we restrict our group elements to a non-trivial conjugacy class. Let \({\mathfrak {C}}\) be a non-trivial conjugacy class in \(\Gamma \) and let \(k = \min \{|x| \text{: } x \in {\mathfrak {C}}\}\). Let \({\mathfrak {C}}_n = \{x \in {\mathfrak {C}} \text{: } |x|=n\}\) and note that \({\mathfrak {C}}_n\) is non-empty if and only if \(n=k+2m\), \(m \in {\mathbb {Z}}^+\).

### Theorem 1.1

Subject to an additional condition, we also have a central limit theorem.

### Theorem 1.2

A noteworthy feature of this result is that the variance is twice the variance that appears in the unrestricted case. Theorems 1.1 and 1.2 will follow from more general results proved below.

### Remark 1.3

*X* corresponds to the universal cover of *M* and \(\Gamma \) to the fundamental group, acting as isometries on *X*. Given a point \(p \in M\) and a non-identity element \(x \in \Gamma \) (thought of as \(\pi _1(M,p)\)), the number *l*(*x*) is defined to be the length of the shortest geodesic arc from *p* to itself in the homotopy class determined by *x*. This can be reinterpreted as the number *d*(*o*, *xo*), where *o* is a lift of *p* to *X*, returning us to our original setting. Although the results of [15] are stated for manifolds of negative curvature, the arguments used there, in particular the key Lemma 1, only require that *X* be a CAT(\(-1\)) space. A consequence of this lemma is that *d*(*o*, *xo*) can be written as the Birkhoff sum of a Hölder continuous function on an associated subshift of finite type (Proposition 3 of [15]); this shows that *d*(*o*, *xo*) satisfies the assumption (A1) in the next section. (Of course, the assumption (A2) below is trivially satisfied.)

The existence of a limit in (1.2) continues to hold if \(\Gamma \) is a word hyperbolic group following an observation of Calegari and Fujiwara [1], using a result of Coornaert [3].

*d*(*o*, *xo*) as a Birkhoff sum of a Hölder continuous function on \(\Sigma \cup \Gamma \) and the ergodic theorem. (See, for example, Lemma 4.4 and Corollary 4.5 of [16].)

(iii) The fact that the variance in Theorem 1.2 is independent of the choice of conjugacy class is a consequence of the hypothesis that \(\{d(o,xo)-\lambda |x| \hbox { : } x \in \Gamma \}\) is unbounded, which is a condition on the behaviour of the displacement function *d*(*o*, *xo*) over the whole group \(\Gamma \). (The same may be said of the assumption (A3) in the next section.)

*d*(*o*, *xo*) and |*x*|:

*x*, so we may write \(\ell ({\mathfrak {C}})\) and \(\Vert {\mathfrak {C}}\Vert \). Furthermore, \(\ell ({\mathfrak {C}})\) is the length of the closed geodesic on the quotient \(\Gamma \backslash X\) in the free homotopy class determined by \({\mathfrak {C}}\). If *S* were bounded then we would have \(\ell ({\mathfrak {C}}) = \lambda \Vert {\mathfrak {C}}\Vert \) for all non-trivial conjugacy classes \({\mathfrak {C}}\). In particular, the length spectrum of \(\Gamma \backslash X\), i.e. the set of lengths of closed geodesics, would be contained in the set \(\lambda {\mathbb {Z}}\). However, it is known that the length spectrum is not contained in a discrete subgroup of the reals when *X* is the real hyperbolic space \({\mathbb {H}}^k\), \(k \ge 2\), or when *X* is a simply connected surface of pinched variable negative curvature [4], so the hypothesis holds in these cases. More generally, though the hypothesis may fail in particular cases, it will typically hold. For example, if *X* is a metric tree with quotient metric graph \(\Gamma \backslash X\) then, to ensure the hypothesis is satisfied, one only requires that \(\Gamma \backslash X\) has two closed paths whose lengths have irrational ratio.

(v) The above results still hold if *d*(*o*, *xo*) is replaced by a Hölder length function *L*(*x*) as defined in [7].

*z* and *s* are complex variables. In the geometric setting considered above, this generating function takes the form

*z* is associated to the word length and the variable *s* to the geometric length (or to a more general weighting below). This generating function is perhaps the main innovation of the paper, though its analysis is inspired by work on a somewhat similar function in [9]. This allows us to prove our first main result. We conclude the paper in Sect. 5 by proving a central limit theorem over a non-trivial conjugacy class. The results in this paper form part of the first author’s Ph.D. thesis at the University of Warwick.

## 2 Free groups and subshifts

As above, let \(\Gamma \) be a free group with free generating set \(\mathcal {A}=\{a_1,\ldots , a_p\}\), \(p \ge 2\). Write \({\mathcal {A}}^{-1} = \{a_1^{-1}, \ldots , a_p^{-1}\}\). A word \(x_0\cdots x_{n-1}\), with letters \(x_k \in \mathcal {A}\cup \mathcal {A}^{-1}\), is said to be *reduced* if \(x_{k+1} \ne x_k^{-1}\) for each \(k\in \{0,\ldots , n-2\}\) and *cyclically reduced* if, in addition, \(x_0 \ne x_{n-1}^{-1}\). Every non-identity element \(x \in \Gamma \) has a unique representation as a reduced word \(x = x_0 x_1 \cdots x_{n-1}\) and we define the *word length* |*x*| of \(x\), by \(|x|=n\). We associate to the identity element the empty word and set \(|1|=0\). Let \(\Gamma _n = \{x\in \Gamma :|x|=n\}\).

Let \(\mathfrak {C}\) be a non-trivial conjugacy class in \(\Gamma \) and let \(k = \inf \{|x| :x\in \mathfrak {C}\} >0\). The set of elements with shortest word length in the conjugacy class is precisely the set of elements with cyclically reduced word representations. In fact, if \(g=g_1 \cdots g_k\in \mathfrak {C}\) is cyclically reduced then all cyclically reduced words in \(\mathfrak {C}\) are given by cyclic permutations of the letters in \(g_1 \cdots g_k\). Let \(\mathfrak {C}_n = \{ x \in \mathfrak {C} :|x| = n\}\) and note that \(\mathfrak {C}_n\) is non-empty if and only if \(n = k+2m\). If \(x \in \mathfrak {C}_{k+2m}\) then its reduced word representation is of the form \(w_m^{-1} \cdots w_1^{-1} g_1 \cdots g_k w_1 \cdots w_m\), for some cyclically reduced \(g = g_1 \cdots g_k \in \mathfrak {C}_k\) and \(w= w_1 \cdots w_m \in \Gamma _m\) with \(w_1 \ne g_1, g_k^{-1}\). Hence it is convenient to introduce the notation \(\Gamma _m(g) = \{w\in \Gamma _m :w_1 \ne g_1, g_k^{-1}\}\). A simple calculation shows that the number of elements in \(\mathfrak {C}_{k+2m}\) is given by \(\# \mathfrak {C}_{k+2m} = (2p-2)(2p-1)^{m-1} \#\mathfrak {C}_k\).
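The counting formula can be checked by brute force for small lengths. The following Python sketch is an illustration only and not part of the paper's argument. It assumes \(p=2\), encodes the generators as the letters `a`, `b` with inverses `A`, `B` (an encoding chosen here for convenience), and uses the standard fact that two elements of a free group are conjugate exactly when their cyclic reductions are cyclic permutations of each other.

```python
LETTERS = "aAbB"  # a, a^{-1}, b, b^{-1}; upper case denotes the inverse (illustrative encoding)


def inverse(letter):
    """Inverse of a single generator letter."""
    return letter.swapcase()


def reduced_words(n):
    """Yield every reduced word of length n as a string."""
    def extend(word):
        if len(word) == n:
            yield word
            return
        for c in LETTERS:
            if not word or c != inverse(word[-1]):
                yield from extend(word + c)
    yield from extend("")


def cyclic_reduction(word):
    """Strip matching first/last letters until the word is cyclically reduced."""
    while len(word) >= 2 and word[0] == inverse(word[-1]):
        word = word[1:-1]
    return word


def is_conjugate(x, g):
    """Conjugacy test for two reduced words x and g."""
    cx, cg = cyclic_reduction(x), cyclic_reduction(g)
    return len(cx) == len(cg) and cx in cg + cg


def class_size(g, n):
    """Number of elements of word length n in the conjugacy class of g."""
    return sum(1 for x in reduced_words(n) if is_conjugate(x, g))


if __name__ == "__main__":
    p, g = 2, "ab"  # here k = 2 and the cyclically reduced elements are ab and ba
    k = len(g)
    for m in range(1, 4):
        predicted = (2 * p - 2) * (2 * p - 1) ** (m - 1) * class_size(g, k)
        print(m, class_size(g, k + 2 * m), predicted)
```

For each \(m \ge 1\) the enumerated size of \(\mathfrak {C}_{k+2m}\) should agree with the predicted value, and lengths of the wrong parity should give an empty set.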

We associate to \(\Gamma \) a dynamical system called a *subshift of finite type*. This subshift of finite type is formed from the space of infinite reduced words (with the obvious definition) adjoined to the elements of \(\Gamma \) together with the dynamics given by the action of the shift map. It will be convenient to describe this space by means of a transition matrix. Define a \(2p \times 2p\) matrix *A*, with rows and columns indexed by \(\mathcal {A}\cup \mathcal {A}^{-1}\), by \(A(a,b) = 0\) if \(b=a^{-1}\) and \(A(a,b) =1\) otherwise. We then define

\[\Sigma = \left\{ (x_n)_{n=0}^\infty \in (\mathcal {A}\cup \mathcal {A}^{-1})^{\mathbb {Z}^+} : A(x_n,x_{n+1}) = 1 \text{ for all } n \in \mathbb {Z}^+ \right\} \]

and the shift map \(\sigma :\Sigma \rightarrow \Sigma \) by \((\sigma x)_n = x_{n+1}\). Since *A* is *aperiodic* (i.e. there exists \(n\ge 1\) such that for each pair of indices \((s,t)\), \(A^n(s,t)>0\)), \(\sigma :\Sigma \rightarrow \Sigma \) is *mixing* (i.e. for every pair of non-empty open sets \(U,V\subset \Sigma \) there is an \(n\in \mathbb {Z}^+\) such that \(\sigma ^{-k} U \cap V \ne \emptyset \) for \(k\ge n\)).
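The transition matrix is easy to build explicitly. The sketch below is an illustration under our own encoding: the symbol `+i` stands for \(a_i\) and `-i` for \(a_i^{-1}\), and the helper names are hypothetical. It constructs the \(2p \times 2p\) matrix *A*, checks that \(A^2\) is strictly positive (so *A* is aperiodic with \(n=2\)), and counts reduced words of length \(n\) as the sum of the entries of \(A^{n-1}\).

```python
def transition_matrix(p):
    """Return (symbols, A): the 2p letters and the 0-1 transition matrix."""
    # +i encodes a_i and -i encodes its inverse (an illustrative encoding).
    symbols = [s * i for i in range(1, p + 1) for s in (1, -1)]
    A = [[0 if b == -a else 1 for b in symbols] for a in symbols]
    return symbols, A


def mat_mul(X, Y):
    """Product of two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(A, n):
    """A raised to the n-th power, n >= 0."""
    size = len(A)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, A)
    return result


def count_reduced_words(p, n):
    """Number of reduced words of length n >= 1: the sum of the entries of A^(n-1)."""
    _, A = transition_matrix(p)
    return sum(map(sum, mat_power(A, n - 1)))


if __name__ == "__main__":
    p = 2
    _, A = transition_matrix(p)
    print(all(sum(row) == 2 * p - 1 for row in A))              # constant row sums
    print(all(e > 0 for row in mat_power(A, 2) for e in row))   # A^2 is positive
    print(count_reduced_words(p, 3), 2 * p * (2 * p - 1) ** 2)
```

Every row of *A* sums to \(2p-1\), so the constant vector is a positive eigenvector with eigenvalue \(2p-1\), which is therefore the largest eigenvalue.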

We augment \(\Sigma \) by defining \(\Sigma ^* = \Sigma \cup \Gamma \), where the elements of \(\Gamma \) are identified with finite reduced words in the obvious way. The shift map naturally extends to a map \(\sigma : \Sigma ^*\rightarrow \Sigma ^*\), where, for the finite reduced word \( x_0 x_1\cdots x_{n-1} \in \Gamma \), we set \(\sigma (x_0 x_1\cdots x_{n-1}) = x_1\cdots x_{n-1}\); and for the empty word \(\sigma 1 =1\). It is sometimes useful to think of an element of \(\Gamma \) as an infinite sequence ending in an infinite string of 1s.

We endow \(\Sigma ^*\) with the following metric, consistent with the topology on \(\Sigma \). Fix \(0<\theta <1\) and let \(d_\theta (x,x)=0\) and, for \(x\ne y\), let \(d_\theta (x,y) = \theta ^{k}\), where \(k = \min \{n\in \mathbb {Z}^+ :x_n \ne y_n\}\). For a finite word \(x=x_0 x_1\cdots x_{m-1}\in \Gamma _m\) we take \(x_n=1\) (the empty symbol) for each \(n\ge m\). Then \(\sigma :\Sigma ^*\rightarrow \Sigma ^*\) is continuous and \(\Gamma \) is a dense subset of \(\Sigma ^*\).
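The metric can be made concrete on finite words. This is a minimal sketch that handles only elements of \(\Gamma \) (finite strings), with the padding symbol represented by a hypothetical sentinel `EMPTY`; infinite sequences would need a different representation.

```python
EMPTY = None  # stands for the empty symbol 1 that pads a finite word


def d_theta(x, y, theta=0.5):
    """Distance theta^k, where k is the first index at which x and y differ."""
    if x == y:
        return 0.0
    k = 0
    while True:
        xk = x[k] if k < len(x) else EMPTY
        yk = y[k] if k < len(y) else EMPTY
        if xk != yk:
            return theta ** k
        k += 1


if __name__ == "__main__":
    # Longer common prefixes give smaller distances.
    for prefix_length in range(1, 5):
        print(prefix_length, d_theta("abab"[:prefix_length], "ababab"))
```

Truncations of a fixed word approach it in this metric as the common prefix grows, in line with the density of \(\Gamma \) in \(\Sigma ^*\).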

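Let \({\mathcal {M}}\) denote the set of \(\sigma \)-invariant Borel probability measures on \(\Sigma \). For a continuous function \(f : \Sigma \rightarrow {\mathbb {R}}\), the pressure \(P(f)\) is defined by

\[P(f) = \sup _{\mu \in {\mathcal {M}}} \left( h(\mu ) + \int f \, d\mu \right) ,\]

where \(h(\mu )\) denotes the measure-theoretic entropy of \(\sigma \) with respect to \(\mu \). If *f* is Hölder continuous then the supremum is attained at a unique \(\mu _f \in {\mathcal {M}}\), called the equilibrium state of *f*. (If \(f : \Sigma ^* \rightarrow {\mathbb {R}}\) then we write \(P(f) := P(f|_\Sigma )\).) The equilibrium state of zero \(\mu _0\) is also called the measure of maximal entropy and *P*(0) is equal to the topological entropy *h* of \(\sigma : \Sigma \rightarrow \Sigma \). It is easy to calculate that \(h = \log (2p-1)\) (the logarithm of the largest eigenvalue of *A*) and that \(\mu _0\) is characterised by

\[\mu _0([w]) = \frac{1}{2p(2p-1)^{n-1}},\]

where, for a reduced word \(w = w_0 \cdots w_{n-1} \in \Gamma _n\), [*w*] is the associated cylinder set \([w] \subset \Sigma ^*\), defined by \([w] =\{(x_j)_{j=0}^\infty \text{: } x_j=w_j, \, j=0,\ldots ,n-1\}\). (Technically, this defines \(\mu _0\) as a measure on \(\Sigma ^*\) with support equal to \(\Sigma \).)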

Two functions \(f,g:\Sigma ^*\rightarrow \mathbb {R}\) are *cohomologous* if there exists a continuous function \(u:\Sigma ^*\rightarrow \mathbb {R}\) such that \(f=g + u\circ \sigma - u\). Two Hölder continuous functions have the same equilibrium state if and only if they differ by the sum of a coboundary and a constant. A function \(f: \Sigma ^*\rightarrow \mathbb {R}\) is *locally constant* if there exists \(n\ge 1\) such that for all pairs \(x,y\in \Sigma \) with \(x_k = y_k\) for \(0\le k \le n\), \(f(x)=f(y)\). Locally constant functions are automatically Hölder continuous for any choice of Hölder exponent. For a function \(f:\Sigma ^*\rightarrow \mathbb {R}\) we denote by \(f^n(x)\) the Birkhoff sum

\[f^n(x) = f(x) + f(\sigma x) + \cdots + f(\sigma ^{n-1}x).\]

### Proposition 2.1

*f* is cohomologous to a constant.

- (A1)
There exists a Hölder continuous function \(f:\Sigma ^*\rightarrow \mathbb {R}\) so that \(F(x) = f^n(x)\) for each \(x\in \Gamma _n\) with \(n\ge 0\), and

- (A2)
\(F(x) = F(x^{-1})\).
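Assumptions (A1) and (A2) can be illustrated in the simplest geometric case. The sketch below is a toy example and not the general construction of [15]. It assumes \(p=2\), a metric tree in which each letter has a fixed length shared with its inverse (the numbers in `LENGTHS` are hypothetical), and the locally constant function \(f\) that returns the length of the first letter. Then \(F(x)=f^n(x)\) for \(x \in \Gamma _n\) is the total length of the reduced word.

```python
import math

# Hypothetical edge lengths for p = 2; a letter and its inverse share a length.
LENGTHS = {"a": 1.0, "A": 1.0, "b": math.sqrt(2), "B": math.sqrt(2)}


def shift(word):
    """Shift map on finite reduced words: delete the first letter (the empty word is fixed)."""
    return word[1:]


def f(word):
    """Locally constant function: length of the first letter, and 0 on the empty word."""
    return LENGTHS[word[0]] if word else 0.0


def birkhoff_sum(func, word, n):
    """f^n(x) = f(x) + f(sigma x) + ... + f(sigma^(n-1) x)."""
    total = 0.0
    for _ in range(n):
        total += func(word)
        word = shift(word)
    return total


def F(word):
    """F(x) = f^n(x) for x of word length n, as in (A1)."""
    return birkhoff_sum(f, word, len(word))


def inverse(word):
    """Inverse of a reduced word: reverse it and invert each letter."""
    return word[::-1].swapcase()


if __name__ == "__main__":
    x = "abbA"
    print(F(x), F(inverse(x)))  # the two values agree up to rounding, as in (A2)
```

Here (A2) holds because inverting a word only reverses the order of the letter lengths being summed.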

### Theorem 2.2

- (A3)
\(F(\cdot ) - {\overline{F}}|\cdot |\) is unbounded as a function from \(\Gamma \) to \({\mathbb {R}}\).

### Lemma 2.3

Let *F* and *f* be as in (A1). Then \(F(\cdot ) -{\overline{F}}|\cdot |\) is bounded if and only if \(f|_\Sigma \) is cohomologous to a constant.

### Proof

*f* is Hölder continuous, this implies that *f* is cohomologous to a constant.

On the other hand, if *f* is cohomologous to a constant then, again by Hölder continuity, \(\{F(x)-{\overline{F}}|x| \text{: } x \in \Gamma \} = \left\{ f^n(x) - n\int f \, d\mu _0 \text{: } x \in \Gamma _n, \ n \ge 1\right\} \) is bounded. \(\square \)

*F* over \(\Gamma _n\) (without the assumption (A2)). Particular cases of this have appeared in articles by Rivin [17] for homomorphisms, and Horsham and Sharp [7] (see also [6]) for quasimorphisms. Calegari and Fujiwara [1] prove a central limit theorem for quasimorphisms on Gromov hyperbolic groups, but have more restrictions on the regularity of the quasimorphism. Restricting to a non-trivial conjugacy class, we have the following theorem.

### Theorem 2.4

We note the limiting distribution function is independent of the choice of non-trivial conjugacy class. Further, it is interesting that the variance in Theorem 2.4 is twice the variance when we do not restrict elements \(x\in \Gamma \) to a non-trivial conjugacy class.
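The doubling of the variance can be seen numerically in a toy case. The sketch below is a heuristic illustration under our own assumptions, not the proof given in Sect. 5. Take \(p=2\) and let \(F\) be the sum of letter lengths, with hypothetical lengths 1 and \(\sqrt{2}\). An element of the conjugacy class of \(a\) with length \(2m+1\) has the form \(w^{-1} a w\) with \(w\) starting in \(b^{\pm 1}\), so \(F(w^{-1}aw) = F(a) + 2F(w)\). The code counts words by their number of \(b\)-type letters, using exact integer dynamic programming, and compares the two variances at the same word length.

```python
import math

LEN_A, LEN_B = 1.0, math.sqrt(2)  # hypothetical lengths of the letters a^{±1} and b^{±1}


def type_distribution(length, first_letters):
    """Count reduced words of the given length by their number of b-type letters.

    first_letters maps a type ("a" or "b") to the number of admissible first
    letters of that type. Returns {j: number of words with j letters in {b, B}}.
    """
    state = {}
    for t, ways in first_letters.items():
        if ways:
            state[(t, int(t == "b"))] = ways
    for _ in range(length - 1):
        nxt = {}
        for (t, j), ways in state.items():
            for u in ("a", "b"):
                # Same type: only the same letter may follow (its inverse is excluded).
                # Other type: both letters of that type may follow.
                mult = 1 if u == t else 2
                key = (u, j + int(u == "b"))
                nxt[key] = nxt.get(key, 0) + ways * mult
        state = nxt
    dist = {}
    for (_, j), ways in state.items():
        dist[j] = dist.get(j, 0) + ways
    return dist


def variance(dist, value):
    """Variance of value(j) when j is drawn with weights proportional to dist[j]."""
    total = sum(dist.values())
    mean = sum(value(j) * (c / total) for j, c in dist.items())
    return sum((value(j) - mean) ** 2 * (c / total) for j, c in dist.items())


def var_all(n):
    """Variance of F over all reduced words of length n."""
    dist = type_distribution(n, {"a": 2, "b": 2})
    return variance(dist, lambda j: (n - j) * LEN_A + j * LEN_B)


def var_class(m):
    """Variance of F over the elements w^{-1} a w of length 2m + 1 in the class of a."""
    dist = type_distribution(m, {"a": 0, "b": 2})  # w must start with b or B
    return variance(dist, lambda j: LEN_A + 2 * ((m - j) * LEN_A + j * LEN_B))


if __name__ == "__main__":
    for m in (10, 50, 200):
        n = 2 * m + 1
        print(n, var_class(m) / var_all(n))
```

The printed ratio should approach 2 as \(m\) grows. The factor comes from squaring the 2 in \(2F(w)\) while \(w\) has only about half the length of \(w^{-1}aw\).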

### Proof of Theorems 1.1 and 1.2

As in the introduction, let the free group \(\Gamma \) act convex co-compactly on a CAT\((-1)\) space (*X*, *d*). Then it was shown in [15] that \(F(x) := d(o,xo)\) satisfies (A1). (In fact, the result in [15] is stated when *X* is a simply connected manifold with bounded negative curvatures but the proof only requires the CAT\((-1)\) property.) Assumption (A2) is clearly satisfied. Therefore, Theorem 1.1 follows from Theorem 2.2. Furthermore, the additional assumption on *d*(*o*, *xo*) in Theorem 1.2 matches (A3) and so Theorem 1.2 also follows. \(\square \)

## 3 Transfer operators

*f* has Hölder exponent \(\alpha \) with respect to \(d_\theta \) then \(f \in \mathcal {F}_{\theta ^\alpha }(\Sigma ,{\mathbb {C}})\)), so there is no loss of generality in restricting to these spaces. Given \(g \in \mathcal {F}_\theta (\Sigma ,\mathbb {C})\), the *transfer operator* \(L_g: \mathcal {F}_\theta (\Sigma ,\mathbb {C}) \rightarrow \mathcal {F}_\theta (\Sigma ,\mathbb {C})\) is defined pointwise by

\[(L_g w)(x) = \sum _{\sigma y = x} e^{g(y)} w(y).\]

### Proposition 3.1

(Ruelle–Perron–Frobenius Theorem) Suppose that \(g \in \mathcal {F}_\theta (\Sigma ,\mathbb {C})\) is real-valued. Then \(L_g: \mathcal {F}_\theta (\Sigma ,\mathbb {C}) \rightarrow \mathcal {F}_\theta (\Sigma ,\mathbb {C})\) has a simple eigenvalue equal to \(e^{P(g)}\), associated strictly positive eigenfunction \(\psi \) and eigenmeasure \(\nu \) (i.e. \(L_g\psi = e^{P(g)}\psi \) and \(L_g^*\nu =e^{P(g)}\nu \)), normalised so that \(\nu \) is a probability measure and \(\int \psi \, d\nu =1\). Furthermore, the rest of the spectrum of \(L_g\) is contained in a disk of radius strictly smaller than \(e^{P(g)}\).

The equilibrium state \(\mu _g\) is given by \(d\mu _g = \psi d\nu \). We say that *g* is *normalised* if \(L_g1=1\) (which in particular implies \(P(g)=0\)). If we replace \(g\) by \(g' = g - P(g) + u - u\circ \sigma \) where \(u = \log \psi \) then \(g'\) is normalised and \(g\) and \(g'\) have the same equilibrium state.
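In the simplest case the transfer operator reduces to a finite matrix. The following sketch is an illustration under our own assumptions: \(p=2\), and \(g\) depends only on the first letter of a sequence. Then \(L_g\) preserves functions of the first letter and acts on them by a \(4 \times 4\) matrix, whose leading eigenvalue is \(e^{P(g)}\). The eigenvalue is approximated by power iteration, and the function names are hypothetical.

```python
import math

LETTERS = "aAbB"  # illustrative encoding for p = 2; upper case denotes the inverse


def transfer_matrix(g):
    """Matrix of L_g on functions of the first coordinate.

    g maps each letter to a real number. Entry [b][a] is e^{g(a)} when the
    letter a may precede b in a reduced word, and 0 otherwise.
    """
    return [[math.exp(g[a]) if a != b.swapcase() else 0.0 for a in LETTERS]
            for b in LETTERS]


def leading_eigenvalue(M, iterations=500):
    """Power iteration: approximate the largest eigenvalue of a positive-type matrix."""
    v = [1.0] * len(M)
    lam = 0.0
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(len(v))) for row in M]
        lam = max(w)
        v = [x / lam for x in w]
    return lam


def pressure(g):
    """P(g) as the logarithm of the leading eigenvalue (g depends on the first letter only)."""
    return math.log(leading_eigenvalue(transfer_matrix(g)))


if __name__ == "__main__":
    zero = {c: 0.0 for c in LETTERS}
    normalised = {c: -math.log(3) for c in LETTERS}  # zero minus P(0)
    print(pressure(zero), math.log(3))
    print(pressure(normalised))
```

For \(g=0\) this recovers \(P(0)=\log (2p-1)\), and subtracting \(P(0)\) gives a normalised function with pressure zero, in line with the normalisation described above.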

Suppose that \(f,g \in \mathcal {F}_\theta (\Sigma ,{\mathbb {C}})\) are real-valued functions. We consider small perturbations of the operator \(L_{g}\) of the form \(L_{g+sf}\) for values of \(s\in \mathbb {C}\) in a neighbourhood of the origin. Since \(e^{P(g)}\) is a simple isolated eigenvalue of \(L_g\), this eigenvalue persists for \(s\) close to the origin, so that the operator \(L_{g+sf}\) has a simple eigenvalue \(\beta (s)\) and corresponding eigenfunction \(\psi _s\) that vary analytically with \(s\) and satisfy \(\beta (0) = e^{P(g)}\) and \(\psi _0=\psi \) [8]. Furthermore, by the upper semi-continuity of the spectral radius, there exists \(\varepsilon >0\) such that, for \(s\) close to the origin, the remainder of the spectrum of \(L_{g+sf}\) lies in a disk of radius \(e^{P(g) -\varepsilon }\). We extend the definition of pressure by setting \(e^{P(g+sf)}= \beta (s)\).

## 4 Proof of Theorem 2.2

### Lemma 4.1

### Proof

### Lemma 4.2

The coefficient of \(z^{k+2m}\) in the power series \(\sum _{m=0}^\infty z^{k+2m} (L_{0}^m \chi _g)(1)\) grows with order \(O(e^{mh})\).

The coefficient in the next lemma grows with the same order.

### Lemma 4.3

The coefficient of \(z^{k+2m}\) in the power series \(\left. \frac{\partial }{\partial s} \delta (z,s) \right|_{s=0}\) grows with order \(O(e^{mh})\).

### Proof

We decompose the transfer operator \(L_{sf}\) as \(L_{sf} = e^{P(sf)}R_s + Q_s\), where \(R_s\) is the projection onto the eigenspace associated to the eigenvalue \(e^{P(sf)}\) and \(Q_s = L_{sf} - e^{P(sf)}R_s\). For \(s\in \mathbb {C}\) in a neighbourhood of \(s=0\), the operators \(R_s\) and \(Q_s\) are analytic. We use this operator decomposition to obtain the estimates in the next two lemmas.

### Lemma 4.4

### Proof

There is one power series left to study.

### Lemma 4.5

### Proof

## 5 Proof of Theorem 2.4

*F* satisfies (A1), (A2) and (A3). By replacing *F* with \(F - {\overline{F}}|\cdot |\) (which still satisfies the three assumptions) or, equivalently, *f* with \(f - \int f \, d\mu _0\), we may assume without loss of generality that \(\int f \, d\mu _0=0\). This reduction does not change the variance. We may then write

### Proposition 5.1

### Corollary 5.2

We use the notation \(\beta (\tau ) = e^{P(\tau f)}\) and \(\beta (0)=e^h\) in the proof of Proposition 5.3.

### Proposition 5.3

The limit of \(\varphi _m(t)\) as \(m\rightarrow \infty \) is \(e^{-\sigma _f^2 t^2}\).
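As a consistency check on the constant (assuming, as the notation suggests, that \(\varphi _m\) denotes a characteristic function; its definition is not reproduced above), the limit can be rewritten as

\[e^{-\sigma _f^2 t^2} = \exp \left( -\tfrac{1}{2}\,(2\sigma _f^2)\, t^2 \right) ,\]

which is the characteristic function of a centred normal distribution with variance \(2\sigma _f^2\). This matches the statement that the variance over a conjugacy class is twice the variance \(\sigma _f^2\) of the unrestricted case.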

### Proof

## Notes

### Acknowledgements

George Kenison was funded by the Engineering and Physical Sciences Research Council (DTA Award Number 1359001).

## References

- 1. Calegari, D., Fujiwara, K.: Combable functions, quasimorphisms, and the central limit theorem. Ergod. Theory Dyn. Syst. **30**, 1343–1369 (2010)
- 2. Coelho, Z., Parry, W.: Central limit asymptotics for shifts of finite type. Israel J. Math. **69**(2), 235–249 (1990)
- 3. Coornaert, M.: Mesures de Patterson-Sullivan sur le bord d’un espace hyperbolique au sens de Gromov. Pac. J. Math. **159**, 241–270 (1993)
- 4. Dal’bo, F.: Remarques sur le spectre des longueurs d’une surface et comptages. Bol. Soc. Bras. Mat. **30**, 199–221 (1999)
- 5. Feller, W.: An Introduction to Probability Theory and Its Applications, vol. II, 2nd edn. Wiley, New York (1971)
- 6. Horsham, M.: Central limit theorems for quasi-morphisms of surface groups. Ph.D. thesis, Manchester (2008)
- 7. Horsham, M., Sharp, R.: Lengths, quasi-morphisms and statistics for free groups. In: Kotani, M., Naito, H., Tate, T. (eds.) Spectral Analysis in Geometry and Number Theory, volume 484 of Contemporary Mathematics, pp. 219–237. American Mathematical Society, Providence, RI (2009)
- 8. Kato, T.: Perturbation Theory for Linear Operators. Classics in Mathematics. Springer, Berlin (1995). (Reprint of the 1980 edition)
- 9. Kenison, G., Sharp, R.: Orbit counting in conjugacy classes for free groups acting on trees. J. Topol. Anal. **9**, 631–647 (2017)
- 10. Kifer, Y.: Large deviations in dynamical systems and stochastic processes. Trans. Am. Math. Soc. **321**, 505–524 (1990)
- 11. Krantz, S.: Function theory of several complex variables. AMS Chelsea Publishing, Providence (2001). (Reprint of the 1992 edition)
- 12. Parry, W., Pollicott, M.: Zeta functions and the periodic orbit structure of hyperbolic dynamics. Astérisque **187**–**188** (1990)
- 13. Pollicott, M., Sharp, R.: Large deviations and the distribution of pre-images of rational maps. Commun. Math. Phys. **181**, 733–739 (1996)
- 14. Pollicott, M., Sharp, R.: Comparison theorems and orbit counting in hyperbolic geometry. Trans. Am. Math. Soc. **350**(2), 473–499 (1998)
- 15. Pollicott, M., Sharp, R.: Poincaré series and comparison theorems for variable negative curvature. In: Turaev, V., Vershik, A. (eds.) Topology, Ergodic Theory, Real Algebraic Geometry, volume 202 of American Mathematical Society Translation Series 2, pp. 229–240. American Mathematical Society, Providence (2001)
- 16. Pollicott, M., Sharp, R.: Statistics of matrix products in hyperbolic geometry. In: Kolyada, S., Manin, Y., Möller, M., Moree, P., Ward, T. (eds.) Dynamical Numbers: Interplay Between Dynamical Systems and Number Theory, Contemporary Mathematics, vol. 532, pp. 213–230 (2011)
- 17. Rivin, I.: Growth in free groups (and other stories)—twelve years later. Ill. J. Math. **54**(1), 327–370 (2010)
- 18. Ruelle, D.: Thermodynamic Formalism, volume 5 of Encyclopedia of Mathematics and Its Applications. Addison-Wesley, Reading (1978)

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.