1 Introduction

In this paper we refer to the following well-known theorems of affine differential geometry.

Theorem 1.1

(W. Blaschke, A. Deicke, E. Calabi) Let \(f:M\rightarrow \mathbf {R}^{n+1}\) be an elliptic affine sphere whose Blaschke metric is complete. Then the induced structure on M is trivial, that is, the induced affine connection is the Levi–Civita connection of the Blaschke metric. Consequently, the affine sphere is an ellipsoid.

Theorem 1.2

(E. Calabi) Let \(f:M\rightarrow \mathbf {R}^{n+1}\) be a hyperbolic or parabolic affine sphere whose Blaschke metric is complete. Then the Ricci tensor of the metric is negative semi-definite.

The above theorems deal with affine spheres, which constitute one of the most important categories studied in classical affine differential geometry. The mystery of affine spheres is that although they are defined very naturally and analogously to the Riemannian case (i.e. the affine lines determined by the affine normal vector field meet at one point or are parallel), they remain, as a whole class, unknown. On the other hand, they have exceptionally nice properties. Many particular examples of affine spheres are known, and the whole class is divided into subclasses (for instance, elliptic, hyperbolic and parabolic), but it is not clear what a satisfactory description of the whole class might look like. One way of studying the class is therefore to impose additional geometric conditions on a sphere and to prove that it then lies in a better-known class of manifolds equipped with some geometric structure. The underlying Riemannian geometry is the first candidate here. In the above theorems the additional condition is the completeness of the Blaschke metric.

The aim of this paper is to generalize Theorems 1.1 and 1.2 to the case of statistical manifolds and to the case where a curvature, which is constant on affine spheres, is only bounded. Statistical manifolds generalize affine hypersurfaces in the sense that the structure induced on a so-called equiaffine hypersurface is a statistical structure, whereas statistical structures are not, in general, realizable on affine hypersurfaces, even locally. Conjugate symmetric statistical structures are as important in the geometry of statistical structures as affine spheres are in the theory of affine hypersurfaces. Within the two geometries they can be characterized by the same condition, namely that the curvature tensor of the affine connection of these structures has the same symmetries as the Riemannian curvature tensor, see Sect. 2 or [6].

We shall prove, in particular, the following result.

Theorem 1.3

Let \((g,\nabla ) \) be a trace-free conjugate symmetric statistical structure on a manifold M. Assume that g is complete on M. If the sectional \(\nabla \)-curvature is bounded from below and above on M, then the Ricci tensor of g is bounded from below and above on M. If the sectional \(\nabla \)-curvature is non-negative everywhere, then the statistical structure is trivial, that is, \(\nabla ={\hat{\nabla }}\). If the sectional \(\nabla \)-curvature is bounded from below by a positive constant, then, additionally, M is compact and its fundamental group is finite.

More precise and more general formulations of this theorem are given in Theorems 3.1 and 4.1. The meaning of the generalization can be explained as follows. The structure induced on an affine sphere is a conjugate symmetric trace-free statistical structure, but the statistical connection on an affine sphere is, moreover, projectively flat and its sectional \(\nabla \)-curvature is constant. In the theorems proposed here projective flatness is not needed, which means that the statistical structure may be non-realizable on any Blaschke hypersurface, even locally. Moreover, the assumption of constant curvature is replaced by the assumption that the curvature satisfies suitable inequalities. Since the notion of the sectional \(\nabla \)-curvature is relatively new, see [1, 6], the theorems proved in this paper show that the notion is meaningful.

In the proof of the first part of Theorem 1.3 we use the same main tool as in Calabi’s theorems, that is, a theorem on weak solutions of differential inequalities for the Laplacian of non-negative functions on complete Riemannian manifolds. In fact, we shall use only a particular version of this theorem. Note that the crucial step in the proof of Theorem 3.1 is an estimate obtained in Lemma 3.2. In the case of affine spheres (Theorem 1.2) the corresponding part of the proof is trivial.

An inspiration for the study of the problems dealt with in this paper was [4], where the first attempt at generalizing Theorem 1.1 was made. Let us quote one of Noguchi’s results, which in the language of this paper can be stated as follows.

Theorem 1.4

([4, Theorem 4.1]) Let \((g,\nabla )\) be a trace-free conjugate symmetric statistical structure on a manifold M. Assume that the sectional \(\nabla \)-curvature is point-wise constant and non-negative on M and g is complete. Then the structure is trivial, that is, \(\nabla =\hat{\nabla }\).

We now know that if a statistical structure is conjugate symmetric, then Schur’s lemma holds for the sectional \(\nabla \)-curvature, see Sect. 2.2 or [6]. It implies that in the above theorem the statistical structure can be locally realized on an affine sphere if \(n\ge 3\). But the theorems we discuss here are of a global nature, which means that Theorem 1.4 is nevertheless more general than Theorem 1.1.

2 Preliminaries

2.1 Definitions of statistical structures

We shall briefly recall basic notions of statistical geometry. For details we refer to [6]. Let g be a positive definite Riemannian metric tensor field on a manifold M. Denote by \(\hat{\nabla }\) the Levi–Civita connection of g. A statistical structure is a pair \((g,\nabla ) \), where \(\nabla \) is a torsion-free connection such that the following Codazzi condition is satisfied

$$\begin{aligned} (\nabla _X g)(Y,Z)=(\nabla _Yg)(X,Z) \end{aligned}$$
(1)

for all \(X,Y,Z\in T_x M\), \(x\in M\). A connection \(\nabla \) satisfying (1) is called a statistical connection for g.
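Since \((\nabla _Xg)(Y,Z)\) is automatically symmetric in Y and Z, condition (1) amounts to the total symmetry of the cubic form \(\nabla g\):

$$\begin{aligned} (\nabla g)(X,Y,Z):=(\nabla _Xg)(Y,Z)\quad \text {is totally symmetric in}\ X,Y,Z. \end{aligned}$$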

For any connection \(\nabla \) one defines its conjugate (dual) connection \(\overline{\nabla }\) relative to g by the formula

$$\begin{aligned} g(\nabla _XY,Z)+g(Y,\overline{\nabla }_XZ)=Xg(Y,Z). \end{aligned}$$
(2)
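The following short computation, using (2) twice, lies behind the next statement:

$$\begin{aligned} (\overline{\nabla }_Xg)(Y,Z)=g(\nabla _XY,Z)+g(\nabla _XZ,Y)-Xg(Y,Z)=-(\nabla _Xg)(Y,Z), \end{aligned}$$

so \(\overline{\nabla }g=-\nabla g\), and \(\overline{\nabla }g\) satisfies the Codazzi condition (1) if and only if \(\nabla g\) does; the torsion-freeness of \(\overline{\nabla }\) follows similarly from (1) and (2).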

It is known that the pairs \((g,\nabla )\) and \((g,\overline{\nabla })\) are simultaneously statistical structures. From now on we assume that \(\nabla \) is a statistical connection for g. We have

$$\begin{aligned} g(R(X,Y)Z,W)=-g(\overline{R}(X,Y)W,Z), \end{aligned}$$
(3)

where R and \({\overline{R}}\) are the curvature tensors for \(\nabla \) and \({\overline{\nabla }}\), respectively. Denote by \(\mathrm{{Ric}}\) and \(\overline{\mathrm{{Ric}}}\) the corresponding Ricci tensors. Note that in general, these Ricci tensors are not necessarily symmetric. The curvature and the Ricci tensor of \(\hat{\nabla }\) will be denoted by \(\hat{R}\) and \(\widehat{\mathrm{{Ric}}}\), respectively. The function

$$\begin{aligned} \rho =\mathrm{tr}\, _g\mathrm{{Ric}}(\cdot ,\cdot ) \end{aligned}$$
(4)

is the scalar curvature of \((g,\nabla )\). Similarly, one can define the scalar curvature \({\overline{\rho }}\) for \((g,\overline{\nabla })\) but, by (3), \(\rho ={\overline{\rho }}\). We also have the usual scalar curvature \({\hat{\rho }}\) for g.

Denote by K the difference tensor between \(\nabla \) and \(\hat{\nabla }\), that is,

$$\begin{aligned} \nabla _XY=\hat{\nabla }_XY+K_XY. \end{aligned}$$
(5)

Then

$$\begin{aligned} \overline{\nabla }_XY=\hat{\nabla }_XY-K_XY. \end{aligned}$$
(6)

\(K(X,Y)\) will stand for \(K_XY\). Since \(\nabla \) and \(\hat{\nabla }\) are torsion-free, K as a (1, 2)-tensor is symmetric. We have \( (\nabla _Xg)(Y,Z)=(K _Xg)(Y,Z)=-g(K_XY,Z)-g(Y,K_XZ)\). It is now clear that the symmetry of \(\nabla g\) and of K implies the symmetry of \(K_X\) relative to g for each X. The converse also holds. Namely, if \(K_X\) is symmetric relative to g, then we have \(( \nabla _Xg)(Y,Z) = -2g(K_XY,Z)\).

We define the statistical cubic form A by

$$\begin{aligned} A(X,Y,Z)=g(K_XY,Z). \end{aligned}$$
(7)

It is clear that a statistical structure can be defined equivalently as a pair (g, K), where K is a symmetric tensor field of type (1, 2) which is also symmetric relative to g, or as a pair (g, A), where A is a symmetric cubic form.

A statistical structure is trace-free if \(\mathrm{tr}\, _gK(\cdot ,\cdot )=0\) (equivalently, \(\mathrm{tr}\, _gA(X,\cdot ,\cdot )=0\) for every X; equivalently, \(\mathrm{tr}\, K_X=0\) for every X). Trace-freeness is also equivalent to the condition that \(\nabla \nu _g=0\), where \(\nu _g\) is the volume form determined by g. In affine differential geometry trace-freeness is called apolarity. The assumption of trace-freeness of a statistical structure is essential in all the theorems mentioned in the Introduction.
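The equivalence of trace-freeness and \(\nabla \nu _g=0\) can be checked directly: since \(\hat{\nabla }\nu _g=0\), for any orthonormal frame \(e_1,\ldots ,e_n\) we have

$$\begin{aligned} (\nabla _X\nu _g)(e_1,\ldots ,e_n)=-\sum _{i=1}^n\nu _g(e_1,\ldots ,K_Xe_i,\ldots ,e_n)=-(\mathrm{tr}\, K_X)\,\nu _g(e_1,\ldots ,e_n), \end{aligned}$$

so \(\nabla \nu _g=0\) if and only if \(\mathrm{tr}\, K_X=0\) for every X.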

2.2 Relations between curvature tensors of statistical structures

It is known that

$$\begin{aligned} R(X,Y)={\hat{R}}(X,Y) +(\hat{\nabla }_XK)_Y-(\hat{\nabla }_YK)_X+[K_X,K_Y]. \end{aligned}$$
(8)

Writing the same equality for \(\overline{\nabla }\) and adding both equalities, we get

$$\begin{aligned} R(X,Y)+{\overline{R}}(X,Y) =2{\hat{R}}(X,Y) +2[K_X,K_Y]. \end{aligned}$$
(9)
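Subtracting the two equalities instead of adding them yields

$$\begin{aligned} R(X,Y)-{\overline{R}}(X,Y) =2\left[ (\hat{\nabla }_XK)_Y-(\hat{\nabla }_YK)_X\right] , \end{aligned}$$

which makes the equivalence of conditions (1) and (2) in Lemma 2.1 below immediate.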

In particular, if \(R={\overline{R}}\) then

$$\begin{aligned} R(X,Y)={\hat{R}}(X,Y) +[K_X,K_Y], \end{aligned}$$
(10)

which can be written briefly as

$$\begin{aligned} R={\hat{R}} +[K,K]. \end{aligned}$$
(11)

Using (9) and assuming that a given statistical structure is trace-free, one gets, see [6],

$$\begin{aligned} \mathrm{{Ric}}(Y,Z)+\overline{\mathrm{{Ric}}} (Y,Z)=2\widehat{\mathrm{{Ric}}}(Y,Z) -2g(K_Y,K_Z). \end{aligned}$$
(12)
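The term \(g(K_Y,K_Z)=\mathrm{tr}\, (K_YK_Z)\) comes from tracing the bracket in (9); indeed, for an orthonormal frame \(e_1,\ldots ,e_n\),

$$\begin{aligned} \sum _{i=1}^ng([K_{e_i},K_Y]Z,e_i)=\mathrm{tr}\, K_{K_YZ}-\mathrm{tr}\, (K_YK_Z)=-g(K_Y,K_Z), \end{aligned}$$

the first trace vanishing by the trace-freeness of the structure.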

In particular, if \((g, \nabla )\) is trace-free then

$$\begin{aligned} 2\widehat{\mathrm{{Ric}}}(X,X)\ge \mathrm{{Ric}}(X,X)+\overline{\mathrm{{Ric}}}(X,X). \end{aligned}$$
(13)

If, moreover, \(R={\overline{R}}\) then

$$\begin{aligned} \widehat{\mathrm{{Ric}}}\ge \mathrm{{Ric}}. \end{aligned}$$
(14)

The following lemma follows from formulas (3) and (8).

Lemma 2.1

Let \((g,\nabla ) \) be a statistical structure. The following conditions are equivalent:

(1) \(R={\overline{R}}\);

(2) \(\hat{\nabla }K\) is symmetric (equivalently, \({\hat{\nabla }} A\) is symmetric);

(3) \(g(R(X,Y)Z,W)\) is skew-symmetric relative to Z, W.

Note that \(\hat{\nabla }K\) in (2) stands for the (1, 3)-tensor field defined by the formula \(\hat{\nabla }K(X,Y,Z)=(\hat{\nabla }_XK)(Y,Z)\). Of course, the same applies to \(\hat{\nabla }A\). A statistical structure satisfying condition (2) of the above lemma was called conjugate symmetric in [2]. We shall adopt this definition.

Note that the condition \(R={\overline{R}}\) implies the symmetry of \(\mathrm{{Ric}}\).

Taking now the trace relative to g on both sides of (12) and taking into account that \(\rho ={\overline{\rho }}\), we get

$$\begin{aligned} {\hat{\rho }} =\rho +\Vert K\Vert ^2=\rho +\Vert A\Vert ^2 \end{aligned}$$
(15)

for a trace-free statistical structure.

2.3 Sectional \(\nabla \)-curvature

The notion of a sectional \(\nabla \)-curvature was first introduced in [6]. Namely, the tensor field

$$\begin{aligned} {\mathcal {R}}= \frac{1}{2}(R+{\overline{R}}) \end{aligned}$$
(16)

is a Riemannian curvature tensor. In particular, it satisfies the condition

$$\begin{aligned} g({\mathcal {R}}(X,Y)Z,W)=-g({\mathcal {R}}(X,Y)W,Z). \end{aligned}$$

In general, this condition is not satisfied by the curvature tensor R. In the case where a given statistical structure is conjugate symmetric the curvature tensor R satisfies this condition. In [6] we defined the sectional \(\nabla \)-curvature by

$$\begin{aligned} k(\pi ) =g({\mathcal {R}}(e_1,e_2)e_2,e_1) \end{aligned}$$
(17)

for a vector plane \(\pi \subset T_xM\), \(x\in M\), where \(e_1,e_2\) is any orthonormal basis of \(\pi \). The notion is well defined, that is, independent of the choice of such a basis.
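By (16) and (9), the sectional \(\nabla \)-curvature is related to the ordinary sectional curvature \({\hat{k}}\) of g by

$$\begin{aligned} k(\pi )={\hat{k}}(\pi )+g(K_{e_1}e_1,K_{e_2}e_2)-\Vert K(e_1,e_2)\Vert ^2; \end{aligned}$$

in particular, the two curvatures coincide for a trivial structure.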

In general, Schur’s lemma does not hold for the sectional \(\nabla \)-curvature. But, if a statistical structure is conjugate symmetric (in this case \({\mathcal {R}}=R\)), then some type of the second Bianchi identity holds and, consequently, Schur’s lemma holds, see [6].

2.4 Statistical structures on affine hypersurfaces

The theory of affine hypersurfaces in \(\mathbf {R}^{n+1}\) is a natural source of statistical structures. For the theory we refer to [3] or [5]. We recall here only some selected facts.

Let \(\mathbf {f} :M\rightarrow \mathbf {R}^{n+1}\) be a locally strongly convex hypersurface. For simplicity assume that M is connected and orientable. Let \(\xi \) be a transversal vector field on M. We have the induced volume form \(\nu _\xi \) on M defined as follows

$$\begin{aligned} \nu _\xi (X_1,\ldots ,X_n)=\det (\mathbf {f}_*X_1,\ldots ,\mathbf {f}_*X_n,\xi ). \end{aligned}$$

We also have the induced connection \(\nabla \) and the second fundamental form g defined by the Gauss formula

$$\begin{aligned} D_X\mathbf {f}_*Y=\mathbf {f}_*\nabla _XY +g(X,Y)\xi , \end{aligned}$$

where D is the standard flat connection on \(\mathbf {R}^{n+1}\). Since the hypersurface is locally strongly convex, the second fundamental form g is definite. By multiplying \(\xi \) by \(-1\), if necessary, we can assume that g is positive definite. A transversal vector field is called equiaffine if \(\nabla \nu _\xi =0\). This condition is equivalent to the symmetry of \(\nabla g\), i.e. to \((g,\nabla )\) being a statistical structure. It means, in particular, that for a statistical structure obtained on a hypersurface by a choice of an equiaffine transversal vector field, the Ricci tensor of \(\nabla \) is automatically symmetric. A hypersurface equipped with an equiaffine transversal vector field and the induced structure is called an equiaffine hypersurface.

Recall now the notion of the shape operator. Having fixed an equiaffine transversal vector field \(\xi \), by differentiating it we get the Weingarten formula

$$\begin{aligned} D_X\xi = -\mathbf {f}_*\mathcal SX. \end{aligned}$$

The tensor field \({\mathcal {S}}\) is called the shape operator for \(\xi \). If R is the curvature tensor for the induced connection \(\nabla \), then

$$\begin{aligned} R(X,Y)Z=g(Y,Z)\mathcal SX-g(X,Z)\mathcal SY. \end{aligned}$$
(18)

This is the Gauss equation for R. The Gauss equation for the dual structure is the following

$$\begin{aligned} {\overline{R}}(X,Y)Z=g(Y,\mathcal SZ)X-g(X,\mathcal SZ)Y. \end{aligned}$$
(19)

It follows that the dual connection is projectively flat if \(n>2\). The dual connection is also projectively flat for two-dimensional surfaces equipped with an equiaffine transversal vector field, that is, \({\overline{\nabla }} \overline{\mathrm{{Ric}}}\) is symmetric. The form \(g(\mathcal SX,Y)\) is symmetric for any equiaffine transversal vector field.

We have the volume form \(\nu _g\) determined by g on M. In general, this volume form is not covariant constant relative to \(\nabla \). The starting point of classical affine differential geometry is the theorem saying that there is a unique equiaffine transversal vector field \(\xi \) such that \(\nu _\xi =\nu _g\). This unique transversal vector field is called the affine normal vector field or the Blaschke affine normal. The second fundamental form for the affine normal is called the Blaschke metric. A non-degenerate hypersurface endowed with the affine Blaschke normal is called a Blaschke hypersurface. The statistical structure induced on a Blaschke hypersurface is trace-free. If the affine lines determined by the affine normal vector field meet at one point or are parallel, then the hypersurface is called an affine sphere. In the first case the sphere is called proper, in the second improper. The class of affine spheres is very large. There exist many conditions characterizing affine spheres. For instance, a Blaschke hypersurface is an affine sphere if and only if \(R={\overline{R}}\). Therefore, conjugate symmetric statistical manifolds can be regarded as generalizations of affine spheres. For a connected affine sphere the shape operator \({\mathcal {S}}\) is a constant multiple of the identity, i.e. \({\mathcal {S}}=\kappa \, \mathrm{id}\,\) for some constant \(\kappa \).

If we choose a positive definite Blaschke metric on a connected locally strongly convex affine sphere, then we call the sphere elliptic if \(\kappa >0\), parabolic if \(\kappa =0\) and hyperbolic if \(\kappa <0\).
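For a connected affine sphere, substituting \({\mathcal {S}}=\kappa \, \mathrm{id}\,\) into the Gauss equations (18) and (19) gives

$$\begin{aligned} R(X,Y)Z={\overline{R}}(X,Y)Z=\kappa \left( g(Y,Z)X-g(X,Z)Y\right) , \end{aligned}$$

so \(R={\overline{R}}\) and the sectional \(\nabla \)-curvature is constant, equal to \(\kappa \); it is this situation that Theorems 3.1 and 4.1 generalize.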

2.5 Conjugate symmetric statistical structures non-realizable on affine spheres

As we have already mentioned, if \(\nabla \) is a connection on a hypersurface induced by an equiaffine transversal vector field, then the conjugate connection \(\overline{\nabla }\) is projectively flat. Therefore, the projective flatness of the conjugate connection is a necessary condition for \((g,\nabla )\) to be realizable as the induced structure on a hypersurface equipped with an equiaffine transversal vector field. In fact, one of the fundamental theorems in affine differential geometry (see, e.g. [5]) says, roughly speaking, that it is also a sufficient condition for the local realizability of a Ricci symmetric statistical structure; we shall need it below only in the case \(n=2\). Note also that if \((g,\nabla )\) is a conjugate symmetric statistical structure, then \(\nabla \) and \({\overline{\nabla }}\) are simultaneously projectively flat. Indeed, this is obvious for \(n>2\). If \(n=2\), we can argue as follows. It suffices to prove that if \({\overline{\nabla }}\) is projectively flat then so is \(\nabla \). Since \(R={\overline{R}}\), \(\nabla \) is Ricci symmetric. By the fundamental theorem mentioned above, \((g,\nabla )\) can be locally realized on an equiaffine surface in \(\mathbf {R}^3\). By Lemma 12.5 of [6], the surface is an equiaffine sphere, that is, the shape operator is locally a constant multiple of the identity, and hence \(\nabla \) is projectively flat. It follows that if \((g,\nabla )\) is conjugate symmetric, then it is locally realizable on an equiaffine hypersurface if and only if \(\nabla \) or \({\overline{\nabla }}\) is projectively flat.

We shall now consider trace-free conjugate symmetric statistical structures. The following fact was observed in [6], see Proposition 4.1 there. If \((g,\nabla )\) is the statistical structure induced on an affine sphere, the metric g is not of constant sectional curvature, and \(\alpha \ne 1, -1\) is a real number, then \(\nabla ^\alpha :=\hat{\nabla }+\alpha K\) is not projectively flat and therefore it cannot be realized (even locally) on any affine sphere. Of course, \((g,\nabla ^\alpha )\) is again a conjugate symmetric statistical structure (by (2) of Lemma 2.1), and since the initial structure was trace-free (because an affine sphere is endowed with the Blaschke structure), \((g,\nabla ^\alpha )\) is trace-free as well. Note also that there are very few affine spheres whose Blaschke metric has constant sectional curvature, see [3], which means that the assumption that g is not of constant sectional curvature is not restrictive.

The following example shows another easy way of producing conjugate symmetric trace-free statistical structures which are non-realizable (even locally) on affine spheres.

Let \(M=\mathbf {R}^n\), where \(n\ge 4\), be equipped with the standard flat metric tensor field g. Let \(x^1,\ldots ,x^n\) be the canonical coordinate system and \(e_1,\ldots ,e_n\) be the canonical orthonormal frame. Define the cubic form \(A=(A_{ijk})\) on M, where \(A_{ijk}=A(e_i,e_j,e_k)\), by

$$\begin{aligned} \begin{array}{l} A_{ijk}=0 \quad \text {if at least two of the indices}\ i,j,k\ \text {are equal},\\ A_{ijk}\in \mathbf {R}^+ \quad \text {if the indices}\ i,j,k\ \text {are mutually distinct}. \end{array} \end{aligned}$$
(20)

Then \({\hat{\nabla }} K=0\) and, consequently, \(R={\overline{R}}\). Observe now that the connection \({\overline{\nabla }}=\hat{\nabla }-K\), where \(g(K(X,Y),Z)=A(X,Y,Z)\), is not projectively flat, and therefore, \((g,\nabla )\) cannot be realized on any Blaschke hypersurface, even locally. Indeed, suppose that \({\overline{\nabla }}\) is projectively flat. Then we must have \(g(R(e_i,e_j)e_j, e_l)=0\) for \(i\ne j\) and \(l\ne i,j\). On the other hand, by (8), we have

$$\begin{aligned} g(R(e_i,e_j)e_j,e_l)&=g([K_{e_i},K_{e_j}]e_j,e_l)=-g(K_{e_j}K_{e_i}e_j,e_l)\\ &=-g(K(e_i,e_j),K(e_j,e_l))=-\sum _{s=1}^nA_{ijs}A_{jls}. \end{aligned}$$

By (20) it is clear that the number \(-\sum _{s=1}^nA_{ijs}A_{jls}\) is negative if \(n\ge 4\).
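For instance, if \(n=4\) and \(A_{ijk}=1\) for all mutually distinct i, j, k, then

$$\begin{aligned} g(R(e_1,e_2)e_2,e_3)=-\sum _{s=1}^4A_{12s}A_{23s}=-A_{124}A_{234}=-1\ne 0. \end{aligned}$$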

Another version of this example (with \({\hat{\nabla }} K\ne 0\)) is given by the symmetric \(A_{ijk}\), where

$$\begin{aligned} \begin{array}{l} A_{ijk}=0 \quad \text {if at least two of the indices}\ i,j,k\ \text {are equal},\\ (A_{ijk})_x= x^1+\cdots +\widehat{(x^i)}+\cdots +\widehat{(x^j)}+\cdots +\widehat{(x^k)}+\cdots +x^n \quad \text {for}\ i<j<k, \end{array} \end{aligned}$$
(21)

where \(\widehat{(x^l)}\) means that the coordinate \(x^l\) is removed from the sum. One can easily check that \({\hat{\nabla }} A\) is symmetric. Indeed, we want to check that \(\partial _lA_{ijk}=\partial _iA_{ljk}\) for \(l\ne i\). It is sufficient to assume that \(j\ne k\). Consider the cases: (a) \(l=j\) or \(l=k\); (b) \(i=j\) or \(i=k\), \(l \ne j\), \(l\ne k\); (c) \(i\ne j\), \(i\ne k\), \(l\ne j\), \(l\ne k\). In cases (a) and (b) both sides of the required equality vanish. In the last case, where all the indices are mutually distinct, both sides of the required equality are equal to 1.
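For example, for \(n=5\) and \(l=4\), \(i=1\), \(j=2\), \(k=3\) (case (c)) we have

$$\begin{aligned} (A_{123})_x=x^4+x^5,\qquad (A_{234})_x=x^1+x^5,\qquad \partial _4A_{123}=1=\partial _1A_{234}. \end{aligned}$$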

In the same manner as in the previous example, one sees that \(\overline{\nabla }\) is not projectively flat on \((\mathbf {R}^+)^n\) if \(n\ge 4\).

The considerations of this subsection show that the class of conjugate symmetric trace-free statistical manifolds is much larger than the class of affine spheres, even in the local setting.

3 Curvature bounded conjugate symmetric trace-free statistical structures

Let \(n=\dim M\) and \((g,\nabla )\) be a statistical structure on M. From now on we assume that the structure is trace-free and conjugate symmetric. Assume moreover that

$$\begin{aligned} H_2\le k(\pi )\le H_1, \end{aligned}$$
(22)

for every vector plane \(\pi \subset T_xM\) and \(x\in M\). Denote by \(\varepsilon \) the difference \(H_1-H_2\) and set

$$\begin{aligned} H_3=H_2-\frac{n-2}{2}\varepsilon . \end{aligned}$$

The quantities \(H_1\), \(H_2\) and \(\varepsilon \) can be functions on M (no smoothness assumptions are needed), but in the main theorem of this section, that is, in Theorem 3.1, \(H_3\) must be a real number. Condition (22) can be written as

$$\begin{aligned} H_1-\varepsilon \le k(\pi )\le H_1 \end{aligned}$$
(23)

or

$$\begin{aligned} H_3+\frac{n-2}{2}\varepsilon \le k(\pi )\le H_3+\frac{n}{2}\varepsilon . \end{aligned}$$
(24)

Theorem 3.1

Let \((g,\nabla )\) be a trace-free conjugate symmetric statistical structure on an n-dimensional manifold M. Assume that (M, g) is complete and the sectional \(\nabla \)-curvature k satisfies the inequalities (24) on M, where \(H_3\) is a non-positive number and \(\varepsilon \) is a non-negative function on M. Then the Ricci tensor \({\widehat{\mathrm{{Ric}}}}\) of g satisfies the inequalities

$$\begin{aligned} (n-1)H_3+\frac{(n-1)(n-2)}{2}\varepsilon \le {\widehat{\mathrm{{Ric}}}}\le -(n-1)^2H_3 +\frac{(n-1)n}{2}\varepsilon . \end{aligned}$$
(25)

The scalar curvature \({\hat{\rho }}\) of g satisfies the inequalities

$$\begin{aligned} n(n-1)H_3+\frac{n(n-1)(n-2)}{2}\varepsilon \le \hat{\rho }\le \frac{n^2(n-1)}{2}\varepsilon . \end{aligned}$$
(26)

Proof

In what follows, the scalar product g will also be denoted by \(\langle \ , \ \rangle \). The following lemma is crucial for the proof.

Lemma 3.2

Let V be any unit vector of \(T_pM\). Denote by \(T_V\) the (0, 4)-tensor given by

$$\begin{aligned} T_V(X,Y,Z,W)=-\langle K_VX,R(Y,Z)W\rangle -2\langle K_VW,R(Y,Z)X\rangle . \end{aligned}$$
(27)

Assume that

$$\begin{aligned} H_3+\frac{n-2}{2}\varepsilon \le k( \pi )\le H_3+\frac{n}{2}\varepsilon \end{aligned}$$
(28)

for some \(H_3\in \mathbf {R}\), \(\varepsilon \in \mathbf {R}^+\) and for all vector planes \(\pi \subset T_pM\). Then

$$\begin{aligned} \langle T'_V,A_V\rangle \ge (n+1)H_3 \psi _V, \end{aligned}$$
(29)

where

$$\begin{aligned}&A_V(X,Z)=A(V,X,Z), \end{aligned}$$
(30)
$$\begin{aligned}&T'_V(X,Z)=\mathrm{tr}\, _gT_V(X,\cdot ,Z,\cdot ) \end{aligned}$$
(31)

and

$$\begin{aligned} \psi _V=\langle A_V,A_V\rangle . \end{aligned}$$
(32)

Proof of Lemma 3.2

Let \(e_1,\ldots ,e_n\) be an orthonormal eigenbasis of \(K_V\) and \(K_V e_i=\lambda _ie_i\) for \(i=1,\ldots , n\). Then \(\psi _V=\lambda _1^2+\cdots +\lambda _n^2\). We have

$$\begin{aligned} \langle {T'} _V, A_V\rangle &=-\sum _{i,j,k}\left[ \langle K_Ve_j,R(e_i,e_k)e_i\rangle \langle K_Ve_j,e_k\rangle +2\langle K_Ve_i, R(e_i,e_k)e_j\rangle \langle K_Ve_j,e_k\rangle \right] \\ &=\sum _{i,j}(\lambda _j^2-2\lambda _i\lambda _j) k_{ij}, \end{aligned}$$

where \(k_{ij}=k(e_i\wedge e_j)\). Since \(k_{ij}=k_{ji}\) and \(k_{ii}=0\), we obtain

$$\begin{aligned} \langle T'_V,A_V \rangle &=(\lambda _1^2k_{11}+\cdots +\lambda _1^2k_{1n})+\cdots +(\lambda _n^2k_{n1}+\cdots +\lambda _n^2k_{nn}) -4\sum _{i<j}\lambda _i\lambda _jk_{ij}\nonumber \\ &=\sum _{i<j}(\lambda _j-\lambda _i)^2k_{ij}-2\sum _{i<j}\lambda _i\lambda _j k_{ij}. \end{aligned}$$
(33)

In the last term we now replace \(\lambda _n\) by \(-\lambda _1-\cdots -\lambda _{n-1}\). We get

$$\begin{aligned} -\sum _{i<j}\lambda _i\lambda _jk_{ij}&=-\lambda _1\lambda _2k_{12}-\cdots -\lambda _1(-\lambda _1-\cdots -\lambda _{n-1})k_{1n}\\ &\quad -\lambda _2\lambda _3k_{23}-\cdots -\lambda _2(-\lambda _1-\cdots -\lambda _{n-1})k_{2n}\\ &\quad \cdots \\ &\quad -\lambda _{n-1}(-\lambda _1-\cdots -\lambda _{n-1})k_{n-1,n}\\ &=\lambda _1\lambda _2(-k_{12}+k_{1n})+\cdots +\lambda _1\lambda _{n-1}(-k_{1,n-1}+k_{1n})+\lambda _1^2k_{1n}\\ &\quad +\lambda _1\lambda _2k_{2n}+\lambda _2\lambda _3(-k_{23}+k_{2n})+\cdots +\lambda _2\lambda _{n-1} (-k_{2,n-1}+k_{2n})+\lambda _2^2k_{2n}\\ &\quad \cdots \\ &\quad +\lambda _{n-1}\lambda _1k_{n-1,n}+\cdots +\lambda _{n-1}\lambda _{n-2}k_{n-1,n}+ \lambda _{n-1}^2k_{n-1,n}\\ &=\sum _{i<j\le n-1}\lambda _i\lambda _j(k_{in}+k_{jn}-k_{ij})+\sum _{i=1}^{n-1}\lambda _i^2k_{in}. \end{aligned}$$

Thus, using the lower bound in (28) (write \(H_2:=H_3+\frac{n-2}{2}\varepsilon \) and \(H_1:=H_3+\frac{n}{2}\varepsilon \), so that (28) reads \(H_2\le k(\pi )\le H_1\) and, in particular, \(k_{ij}\ge H_2\)) and the condition \(\lambda _n=-\lambda _1-\cdots -\lambda _{n-1}\), we get

$$\begin{aligned} \langle T'_V,A_V\rangle &\ge \sum _{i<j\le n}(\lambda _i-\lambda _j)^2H_2 +2\sum _{i=1}^{n-1}\lambda _i^2H_2+2\sum _{i<j\le n-1}\lambda _i\lambda _j(k_{in}+k_{jn}-k_{ij})\\ &=\sum _{i<j\le n-1}\left( \lambda _i^2+\lambda _j^2-2\lambda _i\lambda _j\right) H_2+\sum _{i=1}^{n-1}(\lambda _i-\lambda _n)^2H_2\\ &\quad +2\sum _{i=1}^{n-1}\lambda _i^2H_2+2\sum _{i<j\le n-1}\lambda _i\lambda _j(k_{in}+k_{jn}-k_{ij})\\ &=\sum _{i<j\le n-1}\left( \lambda _i^2+\lambda _j^2\right) H_2 +2\sum _{i<j\le n-1}\lambda _i\lambda _j (k_{in} +k_{jn}-k_{ij}-H_2)\\ &\quad +\sum _{i=1}^{n-1}\lambda _i^2H_2 +(n-1)\lambda _n^2H_2-2\sum _{i=1}^{n-1} \lambda _i\lambda _nH_2 +2\sum _{i=1}^{n-1}\lambda _i^2H_2\\ &=(n-2)\sum _{i=1}^{n-1}\lambda _i^2H_2 +3\sum _{i=1}^{n-1}\lambda _i^2H_2+2\sum _{i<j\le n-1}\lambda _i\lambda _j (k_{in} +k_{jn}-k_{ij}-H_2)\\ &\quad +(n-1)\lambda _n^2H_2+2\lambda _n^2H_2\\ &=(n+1)\sum _{i=1}^{n-1}\lambda _i^2H_2 +2\sum _{i<j\le n-1}\lambda _i\lambda _j (k_{in} +k_{jn}-k_{ij}-H_2)+(n+1)\lambda _n^2H_2\\ &=(n+1)\psi _VH_2 +2\sum _{i<j\le n-1}\lambda _i\lambda _j (k_{in} +k_{jn}-k_{ij}-H_2). \end{aligned}$$

Therefore, it is sufficient to prove

$$\begin{aligned} (n+1)\psi _V(H_2-H_3) +2\sum _{i<j\le n-1}\lambda _i\lambda _j (k_{in} +k_{jn}-k_{ij}-H_2)\ge 0. \end{aligned}$$
(34)

The left-hand side of this inequality can be written and then estimated as follows

$$\begin{aligned}&(n+1)(\lambda _1^2+\cdots +\lambda _{n-1}^2)(H_2-H_3)+n\lambda _n^2(H_2-H_3)\nonumber \\&\qquad +\,(\lambda _1+\cdots +\lambda _{n-1})^2(H_2-H_3)\nonumber \\&\qquad +\,2\sum _{1\le i<j\le n-1}\lambda _i\lambda _j (k_{in} +k_{jn}-k_{ij}-H_2)\nonumber \\&\quad \ge (n+1)(\lambda _1^2+\cdots +\lambda _{n-1}^2)(H_2-H_3)+ (\lambda _1^2+\cdots +\lambda _{n-1}^2)(H_2-H_3)\nonumber \\&\qquad +\,2\sum _{1\le i<j\le n-1}\lambda _i\lambda _j(H_2-H_3)\nonumber \\&\qquad +\,2\sum _{1\le i<j\le n-1}\lambda _i\lambda _j (k_{in} +k_{jn}-k_{ij}-H_2)\nonumber \\&\quad \ge \frac{n+2}{n-2}(H_2-H_3)(n-2)(\lambda _1^2+\cdots +\lambda _{n-1}^2)\nonumber \\&\qquad +\,2\sum _{1\le i<j\le n-1}\lambda _i\lambda _j(k_{in}+k_{jn}-k_{ij}-H_3). \end{aligned}$$
(35)

In the last computations we have again used the fact that \(\lambda _n^2=(\lambda _1+\cdots +\lambda _{n-1})^2\) as well as the assumption that \(H_2-H_3\ge 0\).

Assume now that \(n\ge 4\). Observe that for \(i<j\le n-1\) we have

$$\begin{aligned} k_{in}+k_{jn}-k_{ij}-H_3\ge 0. \end{aligned}$$

Indeed, we have \( k_{in}+k_{jn}-k_{ij}-H_3\ge 2H_2-H_1-H_3=(\frac{n}{2}-2)\varepsilon \ge 0\) for \(n\ge 4\). Moreover,

$$\begin{aligned} \frac{n+2}{n-2}(H_2-H_3)\ge k_{in}+k_{jn}-k_{ij}-H_3. \end{aligned}$$

Namely, since \(H_1=H_3+\frac{n}{2}\varepsilon \) and \(H_2=H_3+\frac{n-2}{2}\varepsilon \), we have

$$\begin{aligned} k_{in}+k_{jn}-k_{ij}-H_3&\le 2H_1-H_2-H_3=\frac{n+2}{2}\varepsilon \\ &=\left( \frac{n+2}{n-2}\right) \left( \frac{n-2}{2}\varepsilon \right) =\frac{n+2}{n-2}(H_2-H_3). \end{aligned}$$

We can now make further estimates in (35) as follows

$$\begin{aligned}&\frac{n+2}{n-2}(H_2-H_3)(n-2)(\lambda _1^2+\cdots +\lambda _{n-1}^2)+2\sum _{i<j\le n-1}\lambda _i\lambda _j(k_{in}+k_{jn}-k_{ij}-H_3)\\&\quad \ge (n-2)(\lambda _1^2+\cdots +\lambda _{n-1}^2)(k_{in}+k_{jn}-k_{ij}-H_3)\\&\qquad +2\sum _{i<j\le n-1}\lambda _i\lambda _j(k_{in}+k_{jn}-k_{ij}-H_3)\\&\quad =\sum _{i<j\le n-1}(\lambda _i+\lambda _j)^2(k_{in}+k_{jn}-k_{ij}-H_3)\ge 0. \end{aligned}$$

The lemma is proved for \(n\ge 4\). Consider now the case \(n=3\). By the trace-freeness we can assume that \(\lambda _1\lambda _2\ge 0\). We compute and estimate the left-hand side of (34) as follows

$$\begin{aligned}&2(\lambda _1^2+\lambda _2^2+\lambda _3^2)\varepsilon +2\lambda _1\lambda _2(k_{13}+k_{23}-k_{12}-H_2)\\&\quad \ge 2(\lambda _1^2+\lambda _2^2+\lambda _3^2)\varepsilon +2\lambda _1\lambda _2(2H_2-H_1-H_2)\\&\quad =2(\lambda _1^2+\lambda _2^2+\lambda _3^2)\varepsilon -2\lambda _1\lambda _2\varepsilon \\&\quad =(\lambda _1-\lambda _2)^2\varepsilon +(\lambda _1^2 +\lambda _2^2+2\lambda _3^2)\varepsilon \ge 0. \end{aligned}$$

Finally consider the case \(n=2\). In this case we have \(H_2=H_3\), \(\lambda _2=-\lambda _1\) and \(\psi _V=2\lambda _1^2\). Going back to (33) we get

$$\begin{aligned} \langle T'_V,A_V\rangle =6\lambda _1^2k_{12}\ge 3\psi _VH_3. \end{aligned}$$

The proof of Lemma 3.2 is completed. \(\square \)

It is well known that for any tensor field s the following formula holds

$$\begin{aligned} \Delta (g(s,s))=2g(\Delta s,s) +2g({\hat{\nabla }} s,{\hat{\nabla }} s), \end{aligned}$$
(36)

where \(\Delta s\) is defined by

$$\begin{aligned} \Delta s =\sum _{i=1}^n {\hat{\nabla }}^2_{e_ie_i}s, \end{aligned}$$
(37)

for any orthonormal frame \(e_i\). More precisely, if, in particular, s is a tensor field of type (0, k), then

$$\begin{aligned} \Delta s(X_1,\ldots ,X_k)=\sum _{i=1}^n(\hat{\nabla }_{e_i} (\hat{\nabla }s))(e_i, X_1,\ldots ,X_k), \end{aligned}$$

where \(\hat{\nabla }s\) is a \((0,k+1)\)-tensor field given by \(\hat{\nabla }s(X_0,X_1,\ldots ,X_k)=(\hat{\nabla }_{X_0}s)(X_1,\ldots ,X_k)\).
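For completeness, (36) follows by computing at a point p in a geodesic orthonormal frame, so that \({\hat{\nabla }} e_i=0\) at p:

$$\begin{aligned} \Delta (g(s,s))=\sum _{i=1}^ne_i(e_i\, g(s,s))=2\sum _{i=1}^n\left[ g({\hat{\nabla }}^2_{e_ie_i}s,s)+g({\hat{\nabla }}_{e_i}s,{\hat{\nabla }}_{e_i}s)\right] . \end{aligned}$$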

We shall now compute \(\Delta \psi \) for

$$\begin{aligned} \psi =g(A,A). \end{aligned}$$
(38)

Let \(p\in M\), \(X,Y,Z\in T_pM\) and let \(e_1,\ldots ,e_n\) be an orthonormal basis of \(T_pM\). Extend all these vectors along \({\hat{\nabla }}\)-geodesics starting at p and denote the obtained vector fields by the same letters X, Y, Z, \(e_1,\ldots ,e_n\), respectively. Of course, \({\hat{\nabla }} X={\hat{\nabla }} Y={\hat{\nabla }} Z=0\), \({\hat{\nabla }} e_1=0\),..., \({\hat{\nabla }} e_n=0\) at p. The frame field \(e_1,\ldots , e_n\) is orthonormal. Since \(\hat{\nabla }A\) is symmetric, one gets at p

$$\begin{aligned} \sum _{i=1}^n({\hat{\nabla }}_{e_ie_i}^2A)(X,Y,Z)&=\sum _{i=1}^n(\hat{\nabla }_{e_i}(\hat{\nabla }A))(e_i,X, Y,Z)= \sum _{i=1}^n\hat{\nabla }_{e_i}((\hat{\nabla }_{e_i}A)(X,Y,Z))\\ &=\sum _{i=1}^n\hat{\nabla }_{e_i} ((\hat{\nabla }_{X}A)(e_i,Y,Z))=\sum _{i=1}^n(\hat{\nabla }_{e_i}(\hat{\nabla }_{X}A))(e_i,Y,Z)\\ &=\sum _{i=1}^n({\hat{R}} (e_i,X) A)(e_i,Y,Z)+ \sum _{i=1}^n(\hat{\nabla }_{X}(\hat{\nabla }_{e_i}A))(e_i,Y,Z)\\ &=\sum _{i=1}^n({\hat{R}} (e_i,X) A)(e_i,Y,Z)+ \sum _{i=1}^n\hat{\nabla }_{X}((\hat{\nabla }A)(Y,Z, e_i,e_i)). \end{aligned}$$

Thus, since the last sum vanishes by the trace-freeness of A (we have \(\sum _{i=1}^n(\hat{\nabla }A)(Y,Z,e_i,e_i)=\sum _{i=1}^n(\hat{\nabla }_YA)(Z,e_i,e_i)=0\)),

$$\begin{aligned} (\Delta A)(X,Y,Z)= \mathrm{tr}\, _g ({\hat{R}}(\cdot , X)A)(\cdot , Y,Z). \end{aligned}$$
(39)

Since \({\hat{R}} =R -[K,K]\), we have

$$\begin{aligned} (\Delta A)(X,Y,Z)= \mathrm{tr}\, _g (R(\cdot ,X)A)(\cdot , Y,Z)-\mathrm{tr}\, _g([K_{\cdot },K_X]A)(\cdot , Y,Z). \end{aligned}$$
(40)

For estimating the second term on the right-hand side, we shall use the following inequality proved, in fact, on p. 84 in [5].

Proposition 3.3

For a trace-free statistical structure, we have

$$\begin{aligned} g(F,A)\ge \frac{n+1}{n(n-1)}(g(A,A))^2, \end{aligned}$$
(41)

where

$$\begin{aligned} F(X,Y,Z)= -\mathrm{tr}\, _ g([K_{\cdot },K_X]A)(\cdot , Y,Z). \end{aligned}$$
(42)

We shall now estimate the first term on the right-hand side of (40). Set

$$\begin{aligned} A'(X,Y,Z)=\mathrm{tr}\, _g(R(\cdot , X)A)(\cdot ,Y,Z). \end{aligned}$$
(43)

We have

$$\begin{aligned} g(A',A)&=\sum _{i,j,k,l}(R(e_i,e_k)A)(e_i, e_j,e_l)A(e_k,e_j,e_l)\nonumber \\ &=-\sum _{i,j,k,l}\left[ A(R(e_i,e_k)e_i, e_j,e_l)A(e_k,e_j,e_l)+A(e_i, R(e_i,e_k)e_j, e_l )A(e_k,e_j,e_l)\right] \nonumber \\ &\quad -\sum _{i,j,k,l}A(e_i, e_j, R(e_i,e_k) e_l)A(e_k,e_j,e_l). \end{aligned}$$
(44)

In the last term we interchange the indices j and l. Since A is symmetric, we get

$$\begin{aligned} g(A',A)=-\sum _{i,j,k,l}\left[ A(R(e_i,e_k)e_i, e_j,e_l)A(e_k,e_j,e_l)+2A(e_i, R(e_i,e_k)e_j, e_l )A(e_k,e_j,e_l)\right] . \end{aligned}$$
(45)

For a fixed index l we have

$$\begin{aligned}&-\sum _{i,j,k}\left[ A(R(e_i,e_k)e_i,e_j,e_l)A(e_k,e_j,e_l)+2A(e_i,R(e_i,e_k)e_j, e_l )A(e_k,e_j,e_l)\right] \\&\quad =-\sum _{i,j,k}\left[ A_{e_l}(R(e_i,e_k)e_i,e_j)A_{e_l}(e_k,e_j)+2A_{e_l}(e_i, R(e_i,e_k)e_j )A_{e_l}(e_k,e_j)\right] . \end{aligned}$$
(46)

Let \(P_l\) stand for the right-hand side of (46). We have \(g(A',A)=\sum _{l=1}^n P_l\) and \(\psi =g(A,A)=\sum _{l=1}^n\psi _{e_l}\), where \(\psi _{e_l}=g(A_{e_l}, A_{e_l})\) is as in Lemma 3.2. We now regard \(e_l\) as V in Lemma 3.2 and we get

$$\begin{aligned} P_l\ge (n+1)\psi _{e_l}H_3. \end{aligned}$$
(47)

Hence,

$$\begin{aligned} g(A',A)\ge (n+1)\psi H_3. \end{aligned}$$
(48)

By (36), (40), (48) and Proposition 3.3, and since the term \(2g({\hat{\nabla }} A,{\hat{\nabla }} A)\) in (36) is non-negative, we get

$$\begin{aligned} \Delta \psi \ge 2(n+1)\psi H_3+\frac{2(n+1)}{n(n-1)}\psi ^2. \end{aligned}$$
(49)

We shall now cite a theorem on weak solutions of differential inequalities for the Laplacian of non-negative functions. The following version of this theorem, proved in [3], is sufficient for our purposes.

Theorem 3.4

Let (M, g) be a complete Riemannian manifold with Ricci tensor bounded from below. Suppose that \(\psi \) is a non-negative continuous function and a weak solution of the differential inequality

$$\begin{aligned} \Delta \psi \ge b_0\psi ^k-b_1\psi ^{k-1}-\cdots -b_{k-1}\psi -b_k, \end{aligned}$$
(50)

where \(k>1\) is an integer and \(b_0>0\), \(b_1\ge 0\),..., \(b_k\ge 0\). Let N be the largest root of the polynomial equation

$$\begin{aligned} b_0\psi ^k-b_1\psi ^{k-1}-\cdots -b_{k-1}\psi -b_k=0. \end{aligned}$$
(51)

Then

$$\begin{aligned} \psi (p)\le N \end{aligned}$$
(52)

for all \(p\in M\).

Since, for a unit vector X and an orthonormal basis \(e_1=X,e_2,\ldots ,e_n\), we have \(\mathrm{{Ric}}(X,X)=\sum _{i=2}^nk(e_i\wedge X)\), by (14) and (24) we get

$$\begin{aligned} {\widehat{\mathrm{{Ric}}}}\ge \mathrm{{Ric}}\ge (n-1)H_2, \end{aligned}$$
(53)

that is, \({\widehat{\mathrm{{Ric}}}}\) is bounded from below. Since \(H_3\le 0\), by Theorem 3.4 and (49) we have

$$\begin{aligned} \psi \le -n(n-1)H_3. \end{aligned}$$
(54)
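Indeed, the right-hand side of (49) factors as

$$\begin{aligned} 2(n+1)H_3\psi +\frac{2(n+1)}{n(n-1)}\psi ^2=\frac{2(n+1)}{n(n-1)}\,\psi \left( \psi +n(n-1)H_3\right) , \end{aligned}$$

and, since \(H_3\le 0\), the largest root of the corresponding polynomial equation is \(\psi =-n(n-1)H_3\ge 0\).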

Let X be a unit vector. Using (12) we now obtain

$$\begin{aligned} \widehat{\mathrm{{Ric}}} (X,X)&=\mathrm{{Ric}}(X,X)+g(K_X,K_X) \le \mathrm{{Ric}}(X,X) + g(K,K)=\mathrm{{Ric}}(X,X)+\psi \\ &\le (n-1)H_1-n(n-1)H_3=-(n-1)^2H_3 +\frac{(n-1)n}{2}\varepsilon . \end{aligned}$$

Combining this with (53), one gets the following estimate of the Ricci tensor \({\widehat{\mathrm{{Ric}}}}\)

$$\begin{aligned} (n-1)H_3+\frac{(n-1)(n-2)}{2}\varepsilon \le {\widehat{\mathrm{{Ric}}}}\le -(n-1)^2H_3 +\frac{(n-1)n}{2}\varepsilon . \end{aligned}$$
(55)

In order to estimate the scalar curvature \({\hat{\rho }}\), we use (15) and (54). We get

$$\begin{aligned} n(n-1)H_3+\frac{(n-1)(n-2)n}{2}\varepsilon \le {\hat{\rho }} \le n(n-1)(H_1-H_3)=\frac{n^2(n-1)}{2}\varepsilon . \end{aligned}$$
(56)

The proof of Theorem 3.1 is completed. \(\square \)

Theorem 3.1 can obviously be reformulated as follows.

Theorem 3.5

Let \((g,\nabla )\) be a trace-free conjugate symmetric statistical structure on an n-dimensional manifold M. Assume that (M, g) is complete and the sectional \(\nabla \)-curvature k satisfies the inequalities (22) on M, where \(H_1= H_3+\frac{n}{2}\varepsilon \), \(H_2=H_1-\varepsilon \), \(H_3\) is a non-positive number and \(\varepsilon \) is a non-negative function on M. Then the Ricci tensor \({\widehat{\mathrm{{Ric}}}}\) of g satisfies the inequalities

$$\begin{aligned} (n-1)H_2\le {\widehat{\mathrm{{Ric}}}}\le (n-1)\left[ (1-n)H_1+\frac{n^2}{2}\varepsilon \right] . \end{aligned}$$
(57)

The scalar curvature \({\hat{\rho }}\) of g satisfies the inequalities

$$\begin{aligned} n(n-1)H_2\le {\hat{\rho }}\le \frac{n^2(n-1)}{2}\varepsilon . \end{aligned}$$
(58)

Remark 3.6

The estimate of the Ricci tensor \({\widehat{\mathrm{{Ric}}}}\) from below in the above theorems is easy, and it follows from (13). The estimate of the Ricci tensor \({\widehat{\mathrm{{Ric}}}}\) from above is not optimal in Theorems 3.1 and 3.5. Namely, in the case of a hyperbolic sphere, that is, in the case where \(H_1=H_2=H_3<0\), Theorem 3.1 gives the estimate \({\widehat{\mathrm{{Ric}}}}\le -(n-1)^2H_3\). (It should be \({\widehat{\mathrm{{Ric}}}}\le 0\).) The estimate of the scalar curvature in Theorems 3.1 and 3.5 is optimal and, in the above proof, it is not deduced from the estimate of the Ricci tensor.

4 Conjugate symmetric trace-free statistical structures with non-negative sectional \(\nabla \)-curvature

We shall prove

Theorem 4.1

Let (M, g) be a complete Riemannian manifold with a conjugate symmetric trace-free statistical structure \((g,\nabla )\). If the sectional \(\nabla \)-curvature is non-negative on M, then the statistical structure is trivial, i.e. \(\nabla ={\hat{\nabla }}\).

This theorem can be deduced from the considerations of the previous section, but it can be proved in an easier way, as shown below. Namely, consider the non-negative function \(\varphi \) on M given by

$$\begin{aligned} \varphi _x={\mathop {\mathrm{max}}\limits _{U\in {\mathcal {U}}_x}}A(U,U,U), \end{aligned}$$
(59)

where \({\mathcal {U}}_x\) is the unit hypersphere in \(T_xM\), \(x\in M\). The function \(\varphi \) is continuous and non-negative on M. Let \(p\in M\) be a fixed point and let \(V\in {\mathcal {U}}_p\) be a vector at which A(U,U,U) attains its maximum on \({\mathcal {U}}_p\). One observes (see, e.g. the proof of Theorem 5.6 in [6]) that V is an eigenvector of \(K_V\) and that if \(e_1=V, e_2,\ldots , e_n\) is an orthonormal eigenbasis of \(K_V\) with corresponding eigenvalues \(\lambda _1,\ldots ,\lambda _n\), then

$$\begin{aligned} \lambda _1-2\lambda _i\ge 0 \end{aligned}$$
(60)

for \(i=2,\ldots ,n\). Extend \(V=e_1\) and \(e_2,\ldots , e_n\) by \({\hat{\nabla }}\)-parallel transport along \({\hat{\nabla }}\)-geodesics starting at p. We obtain a smooth orthonormal frame field. Denote the vector fields again by \(V=e_1\), \(e_2,\ldots ,e_n\). Then we have at p

$$\begin{aligned} {\hat{\nabla }} {e_i}=0,\ \ \ \ \ {\hat{\nabla }} _{e_i}{\hat{\nabla }} _{e_i}V=0 \end{aligned}$$
(61)

for \(i=1,\ldots ,n\). Denote by \(\varPhi \) the function A(V,V,V). Of course, \(\varPhi _p=\varphi _p\) and \(\varPhi \le \varphi \) everywhere. We have at p

$$\begin{aligned} \Delta \varPhi =\sum _{i=1}^n({\hat{\nabla }}_{e_i}({\hat{\nabla }}_{e_i}A))(V,V,V). \end{aligned}$$
(62)

Indeed, we have

$$\begin{aligned} ({\hat{\nabla }} \mathrm{d}\varPhi )(X,Y)&=X(\mathrm{d}\varPhi (Y))-\mathrm{d}\varPhi ({\hat{\nabla }}_XY)\\ &=X[({\hat{\nabla }} _YA)(V,V,V)+3A(\hat{\nabla }_YV,V,V)]-\mathrm{d}\varPhi (\hat{\nabla }_XY)\\ &=(\hat{\nabla }_X(\hat{\nabla }_YA))(V,V,V)+3(\hat{\nabla }_YA)(\hat{\nabla }_XV,V,V)+3(\hat{\nabla }_XA)(\hat{\nabla }_YV,V,V)\\ &\quad +3A(\hat{\nabla }_X\hat{\nabla }_YV,V,V) +6A(\hat{\nabla }_YV,\hat{\nabla }_XV,V)-\mathrm{d}\varPhi (\hat{\nabla }_XY). \end{aligned}$$

Thus, by (61), we get (62) at p. We now have at p

$$\begin{aligned} \Delta \varPhi &=\sum _{i=1}^n\hat{\nabla }_{e_i}((\hat{\nabla }_{e_i}A)(V,V,V)) =\sum _{i=1}^n\hat{\nabla }_{e_i}((\hat{\nabla }_VA)(e_i,V,V))\\ &=\sum _{i=1}^n(\hat{\nabla }_{e_i}(\hat{\nabla }_VA))(e_i,V,V)\\ &=\sum _{i=1}^n({\hat{R}}(e_i,V)A)(e_i,V,V)+\sum _{i=1}^n(\hat{\nabla }_V(\hat{\nabla }_{e_i}A))(e_i,V,V)\\ &=\sum _{i=1}^n(\hat{R}(e_i,V)A)(e_i,V,V)+\sum _{i=1}^n\hat{\nabla }_V((\hat{\nabla }_{e_i}A)(e_i,V,V))\\ &=\sum _{i=1}^n({\hat{R}}(e_i,V)A)(e_i,V,V). \end{aligned}$$

In the last computations we used both assumptions: the conjugate symmetry and the trace-freeness of the statistical structure. By a straightforward computation one also gets at p

$$\begin{aligned} -\sum _{i=1}^n\left( [K_{e_i}, K_V]A\right) (e_i,V,V)=\sum _{i=1}^n\lambda _i^2(3\lambda _1-2\lambda _i) \end{aligned}$$
(63)

and

$$\begin{aligned} \sum _{i=1}^n\left( R(e_i,V)A\right) (e_i,V,V)=\sum _{i=1}^n(\lambda _1-2\lambda _i)k_{i1}. \end{aligned}$$
(64)

Assume now that the sectional \(\nabla \)-curvature is bounded from below by a number N. Using the equality \({\hat{R}}=R-[K,K]\) and the relations \(\lambda _1-2\lambda _i\ge 0\), \(\varPhi _p=\lambda _1\ge 0\), \(\varPhi =\varphi \) at p, we get at p

$$\begin{aligned} \Delta \varPhi &=\sum _{i=1}^n (\lambda _1-2\lambda _i)k_{1i}+\lambda _1^3+ \sum _{i=2}^n\lambda _i^2(3\lambda _1-2\lambda _i)\nonumber \\ &\ge \sum _{i=2}^n(\lambda _1-2\lambda _i)N+\varPhi ^3= (n+1)N\varPhi +\varPhi ^3. \end{aligned}$$
(65)

It follows that the function \(\varphi \) is a weak solution of the differential inequality

$$\begin{aligned} \Delta \varphi \ge (n+1)N\varphi +\varphi ^3. \end{aligned}$$
(66)
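Note that the right-hand side of (66) factors as

$$\begin{aligned} (n+1)N\varphi +\varphi ^3=\varphi \left( \varphi ^2+(n+1)N\right) , \end{aligned}$$

whose largest root, for \(N\le 0\), is \(\varphi =\sqrt{-(n+1)N}\).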

Since \(\widehat{\mathrm{{Ric}}} \) is clearly bounded from below, by Theorem 3.4 we obtain that if \(N\le 0\) then

$$\begin{aligned} \varphi (x)\le \sqrt{-(n+1)N} \end{aligned}$$
(67)

for all \(x\in M\). If \(N=0\) we get \(\varphi \equiv 0\) which means that \(K\equiv 0\). Theorem 4.1 is proved. \(\square \)

We have also proved

Proposition 4.2

Let (M, g) be a complete Riemannian manifold and \((g,\nabla )\) a trace-free conjugate symmetric statistical structure on M. If the sectional \(\nabla \)-curvature is bounded from below by a non-positive number N, then for any unit tangent vector \(U\in TM\) we have

$$\begin{aligned} A(U,U,U)\le \sqrt{-(n+1)N}. \end{aligned}$$
(68)

5 Proof of Theorem 1.3

We shall now prove Theorem 1.3. Assume that the sectional \(\nabla \)-curvature is bounded from below and above, that is, the inequalities

$$\begin{aligned} H_2\le k(\pi )\le H_1 \end{aligned}$$
(69)

are satisfied, where \(H_1, H_2\) are real numbers. If \(H_2< 0\) then \(H_3=H_2-\frac{n-2}{2}\varepsilon < 0\) and we can use Theorem 3.1 to get the first assertion of Theorem 1.3. If \(H_2\ge 0\) then we can use Theorem 4.1; in this case the statistical structure is trivial, the ordinary sectional curvature of g is equal to the sectional \(\nabla \)-curvature, and hence the Ricci tensor of g is bounded. If \(H_2>0\) then \({\widehat{\mathrm{{Ric}}}}\ge (n-1)H_2>0\). By Myers’ theorem, M is compact and its fundamental group is finite. This completes the proof of Theorem 1.3.