1 Introduction

In this paper we study the well-posedness of the weakly hyperbolic Cauchy problem

$$\begin{aligned} \left\{ \begin{array}{ll} D^m_t u+\sum _{j=0}^{m-1} A_{m-j}(t,D_x)D_t^j u=0,&\quad (t,x)\in [0,T]\times \mathbb R ^n,\\ D^{h-1}_t u(0,x)=g_{h}(x),&\quad h=1,\ldots ,m, \end{array}\right. \end{aligned}$$
(1)

where each \(A_{m-j}(t,D_x)\) is a differential operator of order \(m-j\) with continuous coefficients depending only on \(t\). Later we will also relax the continuity assumption, replacing it by boundedness. As usual, \(D_t=\frac{1}{\mathrm{i}}\partial _t\) and \(D_x=\frac{1}{\mathrm{i}}\partial _x\). Let \(A_{(m-j)}\) denote the principal part of the operator \(A_{m-j}\) and let \(\lambda _l(t,\xi )\), \(l=1,\ldots ,m\), be the real-valued roots of the characteristic polynomial, which we write as

$$\begin{aligned} \tau ^m+\sum _{j=0}^{m-1}A_{(m-j)}(t,\xi )\tau ^j =\tau ^m+\sum _{j=0}^{m-1}\sum _{|\gamma |=m-j}a_{m-j,\gamma }(t)\xi ^\gamma \tau ^j. \end{aligned}$$
(2)

This means that

$$\begin{aligned} \tau ^m+\sum _{j=0}^{m-1}\sum _{|\gamma |=m-j}a_{m-j,\gamma }(t)\xi ^\gamma \tau ^j=\prod _{l=1}^m (\tau -\lambda _l(t,\xi )). \end{aligned}$$
(3)
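The identity (3) can be checked numerically: expanding the product over the roots must reproduce the coefficients of the characteristic polynomial, which are signed elementary symmetric functions of the roots. A minimal sketch in plain Python (illustrative only; all names are ours):

```python
from itertools import combinations
from math import prod

def poly_from_roots(roots):
    """Coefficients of prod_l (tau - lambda_l), lowest power first."""
    p = [1.0]
    for lam in roots:
        shifted = [0.0] + p                     # tau * p(tau)
        scaled = [-lam * c for c in p] + [0.0]  # -lambda * p(tau)
        p = [a + b for a, b in zip(shifted, scaled)]
    return p

def elem_sym(roots, k):
    """Elementary symmetric polynomial e_k(lambda_1, ..., lambda_m)."""
    return sum(prod(c) for c in combinations(roots, k))

roots = [2.0, 3.0, 5.0]
m = len(roots)
p = poly_from_roots(roots)
# the coefficient of tau^j equals (-1)^(m-j) e_{m-j}(lambda)
check = all(p[j] == (-1) ** (m - j) * elem_sym(roots, m - j) for j in range(m + 1))
```

Here the coefficient of \(\tau ^j\) is \((-1)^{m-j}e_{m-j}(\lambda )\), which is exactly the sum in (2) evaluated on the symbol level.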

If (1) is strictly hyperbolic and its coefficients are Hölder continuous in \(t\), i.e. \(A_{m-j}(\cdot ,\xi )\in C^{\alpha }([0,T])\), \(0<\alpha <1\), for all \(j\) and \(\xi \), it was shown by the authors in [11, Case 3; Remark 8] that the Cauchy problem (1) is well-posed in Gevrey classes \(G^s(\mathbb{R }^n)\) provided that \(1\le s< 1+\frac{\alpha }{1-\alpha }\). If \(\alpha =1\), it is sufficient to assume the Lipschitz continuity of the coefficients to get the well-posedness in \(G^s\) for all \(s\ge 1\). Earlier, for certain second order equations the same Gevrey index was obtained by Colombini et al. [3], who also showed that it is sharp. We also refer to [11, Remark 16] for the Gevrey-Beurling ultradistributional well-posedness of (1) for \(1\le s\le 1+\frac{\alpha }{1-\alpha }.\) In this paper we will deal with more regular coefficients in the weakly hyperbolic case.

The well-posedness of weakly hyperbolic equations has been a challenging problem for a long time. For example, even for the second order Cauchy problem in one space dimension,

$$\begin{aligned} \partial _t^2 u - a(t,x)\partial _x^2 u=0, \quad u(0,x)=g_1(x), \quad \partial _t u(0,x)=g_2(x), \end{aligned}$$
(4)

to date there is no characterisation of smooth functions \(a(t,x)\ge 0\) for which (4) would be \(C^\infty \) well-posed. On the one hand, there are sufficient conditions. For example, Oleinik showed in [16] that (4) is \(C^\infty \) well-posed provided there is a constant \(C>0\) such that \(C a(t,x)+\partial _t a(t,x)\ge 0\). In the case of \(a(t,x)=a(t)\) depending only on \(t\), when the problem becomes

$$\begin{aligned} \partial _t^2 u - a(t)\partial _x^2 u=0, \quad u(0,x)=g_1(x),\quad \partial _t u(0,x)=g_2(x), \end{aligned}$$
(5)

Oleinik’s condition is satisfied for \(a(t)\ge 0\) with \(a^{\prime }(t)\ge 0\). On the other hand, in the celebrated paper [8], Colombini and Spagnolo constructed a \(C^\infty \) function \(a(t)\ge 0\) such that (5) is not \(C^\infty \) well-posed. The situation becomes even more complicated if one adds mixed terms to (5), even ones depending only on \(t\) and analytic. For example, the Cauchy problem for the equation

$$\begin{aligned} \partial _t^2 u - 2t \partial _t\partial _x u+ t^2\partial _x^2 u=0 \end{aligned}$$

is Gevrey \(G^s\) well-posed for \(s<2\) while it is ill-posed for any \(s>2\). For other positive and negative results for second order equations with time-dependent coefficients we refer to seminal papers of Colombini et al. [3, 5], and to Nishitani [15] for the necessary and sufficient conditions for the \(C^{\infty }\) well-posedness of (4) with analytic \(a(t,x)\ge 0\) in one dimension.

A reasonable substitute for the \(C^\infty \) well-posedness in the weakly hyperbolic setting is the well-posedness in the space \(G^\infty =\bigcup _{s>1} G^s\). Thus, Colombini et al. proved in [4] that for every \(C^\infty \) function \(a(t)\ge 0\), the Cauchy problem (5) is \(G^\infty \) well-posed. More precisely, they showed that if \(a(t)\) is in \(C^k\), it is well-posed in \(G^s\) with \(s\le 1+k/2\), and if \(a(t)\) is analytic, it is \(C^\infty \) well-posed.

From another direction, there are also general results for (1). For example, it was shown by Bronshtein in [2] that, in particular, the Cauchy problem (1) with \(C^\infty \) coefficients is \(G^s\) well-posed provided that \(1\le s<1+\frac{1}{m-1}.\) In some cases, this can be improved. For example, for constant multiplicities, see paper [6] by Colombini and Kinoshita in one dimension (see also D’Ancona and Kinoshita [9]), and the authors’ paper [11] for further improvements of Gevrey indices in all dimensions, with a survey of literature therein.

In this paper our interest in analysing the Cauchy problem (1) is motivated by

  1. (A)

    allowing any space dimension \(n\ge 1\);

  2. (B)

    considering the effect of lower order terms or, rather, the properties of the lower order terms which do not influence the results on the Gevrey well-posedness (we will look at new effects for both continuous and discontinuous lower order terms); the inclusion of lower order terms in this setting has been untreatable by previous methods;

  3. (C)

    providing well-posedness results in spaces of distributions and ultradistributions.

Our main reference here is the paper [13] of Kinoshita and Spagnolo, who studied the Cauchy problem (1) for operators with homogeneous symbols in one space dimension, \(x\in \mathbb R \). Under the condition

$$\begin{aligned} \exists M>0:\quad \lambda _i(t,\xi )^2+\lambda _j(t,\xi )^2\le M(\lambda _i(t,\xi )-\lambda _j(t,\xi ))^2, \nonumber \\ \text{ for}\ 1\le i,j\le m,\ t\in [0,T],\ \text{ and all}\ \xi , \end{aligned}$$
(6)

on the roots \(\lambda _j(t,\xi )\), they obtained the following well-posedness result:

Theorem 1

([13]) Assume that \(n=1\) and that the differential operator is homogeneous, i.e. \(A_{m-j}(t,\xi )=A_{(m-j)}(t,\xi )=a_{m-j}(t)\xi ^{m-j}\) for all \(j=0,\ldots ,m-1\). If \(a_{m-j}\in C^\infty ([0,T])\) and the characteristic roots are real and satisfy (6), then the Cauchy problem (1) is well-posed in any Gevrey space. More precisely, if \(a_j\in {C}^k([0,T])\) for some \(k\ge 2\) then we have \(G^s\)-well-posedness for

$$\begin{aligned} 1\le s<1+\frac{k}{2(m-1)}. \end{aligned}$$

The proof is based on the construction of a quasi-symmetriser \(Q_\varepsilon ^{(m)}\) which, thanks to condition (6), is nearly diagonal. Previously, equations of second and third order with analytic coefficients, still with \(n=1\) and without lower order terms, were analysed by Colombini and Orrú [7]. They showed the \(C^\infty \) well-posedness of (1) under assumption (6). Moreover, if all the coefficients \(a_{m-j}(t)\) vanish at \(t=0\), they showed that condition (6) is also necessary. It is therefore natural for us to adopt (6) for our analysis.
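Condition (6) prevents two roots from coming close to each other unless both of them are close to zero. A quick numerical illustration (plain Python; the sampling and the function name are ours): with \(\xi \) normalised to \(1\), the roots \(\lambda _1=t, \lambda _2=2t\) satisfy (6) with \(M=5\), whereas \(\lambda _1=1, \lambda _2=1+t\) violate (6) near \(t=0\), since they collide without vanishing.

```python
def satisfies_cond6(root_samples, M):
    """Check lam_i^2 + lam_j^2 <= M * (lam_i - lam_j)^2 for all i != j on samples."""
    for lams in root_samples:
        m = len(lams)
        for i in range(m):
            for j in range(i + 1, m):
                if lams[i] ** 2 + lams[j] ** 2 > M * (lams[i] - lams[j]) ** 2:
                    return False
    return True

ts = [k / 64 for k in range(1, 65)]                      # sample t in (0, 1]
ok = satisfies_cond6([(t, 2 * t) for t in ts], M=5)      # holds: 5 t^2 <= 5 t^2
bad = satisfies_cond6([(1.0, 1.0 + t) for t in ts], M=5) # fails near t = 0
```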

Let us briefly discuss the difficulties of aims (A)–(C) above. Concerning the dimensional extension (A), even under condition (6) on the characteristic roots, such an extension is impossible for space-dependent coefficients: see, e.g., Bernardi and Bove [1] for examples of second order operators with polynomial coefficients for which the \(C^{\infty }\) well-posedness fails for any \(n\ge 2\). It is interesting to note that for these examples the usual Ivrii–Petkov conditions on the lower order terms are also satisfied. As we will show, the \(C^{\infty }\) (and other) well-posedness holds in our case in any dimension \(n\ge 1\) since the coefficients depend only on time. In part (B), the proof of the well-posedness for equations with lower order terms highlights several interesting and somewhat surprising phenomena. For example, if the coefficients of the principal part are analytic and the lower order terms are only bounded (in particular, they may be discontinuous, or may exhibit more irregular oscillating behaviour), but the Cauchy data are Gevrey, we still obtain the solution in Gevrey spaces. Indeed, the Levi conditions in this paper control the zeros of the lower order terms but not their regularity. Finally, aim (C) is motivated by an interesting and challenging problem for weakly hyperbolic equations: analysing the propagation of singularities. For this, in order to be able to use also non-Gevrey techniques, we first need well-posedness in some bigger space. This will be achieved for the Cauchy problem (1) in the spaces of Beurling Gevrey ultradistributions. A subtle point of this construction is that we have to use the Beurling Gevrey ultradistributions and not the usual Roumieu Gevrey ultradistributional class. In the case of an analytic principal part we will obtain the well-posedness in the usual space of distributions.

In particular, in this paper we extend Theorem 1 to weakly hyperbolic equations with non-homogeneous symbols in any space dimension \(n\ge 1\), and find suitable assumptions on the lower order terms for the Gevrey well-posedness. From the very beginning we deviate from [13] by using pseudo-differential techniques to reduce the equation to a system. This allows us to treat all dimensions \(n\ge 1\). However, the main challenge in the present paper is the analysis of the lower order terms. In fact, in most (if not all) of the literature on the application of the quasi-symmetriser to weakly hyperbolic equations, the considered equations are assumed to have homogeneous symbols. It is our intention to show that the quasi-symmetriser can be effectively used to control the parts of the energy corresponding to the lower order terms. It is interesting to see the resulting Levi conditions expressing the dependence of the lower order terms on the principal part of the operator. Such control becomes possible by exploiting the Sylvester form of the system corresponding to Eq. (1), and the structure of the quasi-symmetriser.

An interesting effect that we observe is that the results remain true assuming just the continuity of the lower order terms in time. For example, we will have the \(C^{\infty }\) well-posedness for equations with analytic coefficients in the principal part and only continuous lower order terms. Moreover, we give a variant of our results assuming only the boundedness of the lower order terms in time (instead of continuity).

In this paper we formulate the conditions on the lower order terms in terms of the symbols \(A_{m-j+1}\). Note that in (1), the operator \(A_{m-j+1}(t,D_x)\) is the coefficient in front of the derivative \(D_t^{j-1}\). We assume that there is some constant \(C>0\) such that we have

$$\begin{aligned}&|(A_{m-j+1}-A_{(m-j+1)})(t,\xi )| \nonumber \\&\quad \le C\sum _{i=1}^m \left| \sum _{\begin{array}{c} 1\le l_1<\cdots <l_{m-j}\le m \\ l_h\not =i\; \forall h \end{array}} \lambda _{l_1}(t,\xi )\cdots \lambda _{l_{m-j}}(t,\xi ) \right|, \end{aligned}$$
(7)

for all \(t\in [0,T], j=1,\ldots , m\), and for \(\xi \) away from \(0\) (i.e., for \(|\xi |\ge R\) for some \(R>0\)). Note that in terms of the coefficients of the original equation, using (2) we have

$$\begin{aligned} (A_{m-j+1}-A_{(m-j+1)})(t,\xi )=\sum _{|\gamma |\le m-j} a_{m-j+1,\gamma }(t)\xi ^\gamma . \end{aligned}$$
(8)

For \(j=m\), the condition (7) is the condition on the lower order terms coming from the coefficient in front of \(D_t^{m-1}\), in which case \(A_1-A_{(1)}\) is independent of \(\xi \), and assumption (7) reads as

$$\begin{aligned} |(A_1-A_{(1)})(t,\xi )|\le C, \quad t\in [0,T], \end{aligned}$$

which is automatically satisfied due to the boundedness of \(A_{1}\) in \(t\). In Sect. 2 we will give examples of the condition (7). In treating the case \(m=3\) we will also show that, from the point of view of the desired energy inequality for (1), the assumption (7) is rather natural.

For a better understanding of the right hand side of the condition (7), let us simplify it in the case when all the characteristic roots are nonnegative, i.e. when \(\lambda _l\ge 0\) for all \(l=1,\ldots ,m.\) In this situation, using (3) we see that the right hand side of (7) can be replaced, up to a constant, by the modulus of the coefficient of \(\tau ^j\) in (3) and, therefore, condition (7) becomes

$$\begin{aligned} |(A_{m-j+1}-A_{(m-j+1)})(t,\xi )| \le C |A_{(m-j)}(t,\xi )| \end{aligned}$$

for all \(t\in [0,T], j=1,\ldots , m\), and for \(\xi \) away from \(0\).
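This simplification rests on the combinatorial identity \(\sum _{i=1}^m e_{m-j}(\pi _i\lambda )=j\, e_{m-j}(\lambda )\), where \(e_k\) denotes the \(k\)-th elementary symmetric function and \(\pi _i\lambda \) omits \(\lambda _i\): each \((m-j)\)-fold product of roots is counted once for every index it omits. A quick numerical confirmation (plain Python; the names are ours):

```python
from itertools import combinations
from math import prod

def elem_sym(lams, k):
    """Elementary symmetric polynomial e_k of the roots."""
    return sum(prod(c) for c in combinations(lams, k))

lams = [0.5, 1.0, 2.0, 3.0]   # nonnegative sample roots, m = 4
m, k = len(lams), 2
# sum over i of e_k with the i-th root removed ...
lhs = sum(elem_sym(lams[:i] + lams[i + 1:], k) for i in range(m))
# ... equals (m - k) * e_k of all roots
rhs = (m - k) * elem_sym(lams, k)
```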

In the Appendix in Sect. 6 we will show that also in general the Levi conditions (7) can be expressed entirely in terms of the coefficients of Eq. (1).

We are now ready to formulate the well-posedness results. Part (i) of Theorem 2 is the extension of Theorem 1 in [13] to any space dimension and to equations with lower order terms. In the sequel \(\mathcal D ^{\prime }_{(s)}(\mathbb R ^n) (\mathcal E ^{\prime }_{(s)}(\mathbb R ^n))\) denotes the space of Gevrey Beurling (compactly supported) ultradistributions. For the relevant details on these spaces of ultradistributions and their characterisations, with their appearance in the analysis of weakly hyperbolic equations, we refer to our paper [11], where these have been applied to the low (Hölder) regularity constant multiplicities case.

Theorem 2

Let \(n\ge 1\). If the coefficients satisfy \(A_j(\cdot ,\xi )\in {C}([0,T])\) and \(A_{(j)}(\cdot ,\xi )\in C^\infty ([0,T])\) for all \(\xi \) and \(j=1,\ldots ,m\), the characteristic roots are real and satisfy (6), and the lower order terms satisfy (7), then the Cauchy problem (1) is well-posed in any Gevrey space. More precisely, for \(A_j(\cdot ,\xi )\in {C}([0,T])\), we have:

  1. (i)

    if \(A_{(j)}(\cdot ,\xi )\in {C}^k([0,T])\) for some \(k\ge 2\) and \(g_j\in G^s(\mathbb R ^n)\) for \(j=1,\ldots ,m,\) then there exists a unique solution \(u\in C^m([0,T];G^s(\mathbb R ^n))\) provided that

    $$\begin{aligned} 1\le s<1+\frac{k}{2(m-1)}; \end{aligned}$$
  2. (ii)

    if \(A_{(j)}(\cdot ,\xi )\in {C}^k([0,T])\) for some \(k\ge 2\) and \(g_j\in \mathcal E ^{\prime }_{(s)}(\mathbb R ^n)\) for \(j=1,\ldots ,m,\) then there exists a unique solution \(u\in C^m([0,T];\mathcal D ^{\prime }_{(s)}(\mathbb R ^n))\) provided that

    $$\begin{aligned} 1\le s\le 1+\frac{k}{2(m-1)}. \end{aligned}$$

In the case of analytic coefficients, we have \(C^\infty \) and distributional well-posedness.

Theorem 3

If \(A_j(\cdot ,\xi )\in {C}([0,T])\) and the coefficients \(A_{(j)}(\cdot ,\xi )\) are analytic on \([0,T]\) for all \(\xi \) and \(j=1,\ldots ,m\), the characteristic roots are real and satisfy (6), and the lower order terms fulfil the conditions (7), then the Cauchy problem (1) is \(C^\infty \) and distributionally well-posed.

By \(W^{\infty ,m}\) we denote the Sobolev space of functions having \(m\) derivatives in \(L^{\infty }\). In the case of discontinuous but bounded lower order terms we have the following:

Theorem 4

  1. (i)

    Assume the conditions of Theorem 2, with \(A_j(\cdot ,\xi )\in {C}([0,T])\) replaced by \(A_j(\cdot ,\xi )\in L^{\infty }([0,T]), j=1,\ldots ,m\). Then the statement remains true provided that we replace the conclusion

    $$\begin{aligned} u\in C^m([0,T];G^s(\mathbb R ^n)) \end{aligned}$$

    by

    $$\begin{aligned} u\in C^{m-1}([0,T];G^s(\mathbb R ^n))\cap W^{\infty ,m}([0,T];G^s(\mathbb R ^n)). \end{aligned}$$
  2. (ii)

    Assume the conditions of Theorem 3 with \(A_j(\cdot ,\xi )\in {C}([0,T])\) replaced by \(A_j(\cdot ,\xi )\in L^{\infty }([0,T]), j=1,\ldots ,m\). Then the \(C^{\infty }\) well-posedness remains true provided that we replace the conclusion

    $$\begin{aligned} u\in C^m([0,T];C^\infty (\mathbb R ^n)) \end{aligned}$$

    by

    $$\begin{aligned} u\in C^{m-1}([0,T];C^\infty (\mathbb R ^n))\cap W^{\infty ,m}([0,T];C^\infty (\mathbb R ^n)). \end{aligned}$$

Similar conclusions remain true in the ultradistributional/distributional settings as well.

In Remark 2 we show that in certain cases the results can be improved, but the Gevrey index may change.

We refer to Remark 1 for a brief discussion of the strictly hyperbolic case. In this case, even in the situation of the lower regularity of coefficients (\(C^{1}\)), one can analyse the global behaviour of solutions with respect to time (see [14]). The cases of constant coefficients and systems with controlled oscillations have been treated in [17, 18], respectively.

Finally, we describe the contents of the sections in more detail.

Section 2 collects some motivating examples of applications of our results. In Sect. 3 we recall the required facts about the quasi-symmetriser and in Sect. 4 we use it to derive the energy estimate for the solutions of the hyperbolic system in Sylvester form corresponding to the Cauchy problem (1). The estimate on the part of the energy corresponding to lower order terms is given in Sect. 5. In Sect. 6 we prove Theorems 2, 3 and 4 and we end the paper with a final remark on the Levi conditions (7).

We thank the referee for valuable remarks leading to the improvement of the paper.

2 Examples

Let us first give an example of the Levi conditions (7) for equations of third order, \(m=3\). In this case the conditions (7) become

$$\begin{aligned}&|A_3-A_{(3)}|^2\le C(\lambda _1^2\lambda _2^2+\lambda _2^2\lambda ^2_3+\lambda _3^2\lambda _1^2), \nonumber \\&|A_2-A_{(2)}|^2\le C((\lambda _1+\lambda _2)^2+(\lambda _2+\lambda _3)^2+(\lambda _3+\lambda _1)^2), \\&|A_1-A_{(1)}|^2\le C,\nonumber \end{aligned}$$
(9)

for some \(C>0\). It is convenient in certain applications, whenever possible, to write conditions (6) and (7) in terms of the coefficients of the equation. Such an analysis for (1) has recently been carried out by Jannelli and Taglialatela [12]. In Example 3 below we will illustrate the meaning of conditions (9).

For later technical convenience, similarly to (9), we may also use an equivalent formulation of (7) as

$$\begin{aligned}&|(A_{m-j+1}-A_{(m-j+1)})(t,\xi )|^2 \nonumber \\&\quad \le C\sum _{i=1}^m \left| \sum _{\begin{array}{c} 1\le l_1<\cdots <l_{m-j}\le m \\ l_h\not =i\; \forall h \end{array}} \lambda _{l_1}(t,\xi )\cdots \lambda _{l_{m-j}}(t,\xi )\right|^2. \end{aligned}$$
(10)

Condition (6) can often be reformulated in terms of the discriminant of (1) defined by \(\Delta (t,\xi )=\prod _{i<j} (\lambda _i(t,\xi )-\lambda _j(t,\xi ))^2.\) Thus, for \(m=2, n=1\), and the equation

$$\begin{aligned} \partial _t^2 u+a_1(t)\partial _t\partial _x u+a_2(t)\partial _x^2 u=0, \end{aligned}$$

condition (6) is equivalent to the existence of \(c>0\) such that \(\Delta (t)\ge c a_1(t)^2\), where \(\Delta (t,\xi )=\Delta (t)\xi ^2\) and \(\Delta (t)=a_1^2(t)-4a_2(t)\ge 0\) is the hyperbolicity condition.

For \(m=3, n=1\), and the equation

$$\begin{aligned} \partial _t^3 u+a_1(t)\partial _x\partial _t^2 u+a_2(t)\partial _x^2\partial _t u +a_3(t)\partial _x^3 u=0, \end{aligned}$$

following [13], we have \(\Delta (t,\xi )=\Delta (t)\xi ^6\), with \(\Delta (t)=-4a_2^3-27 a_3^2+a_1^2 a_2^2-4a_1^3 a_3+18a_1 a_2 a_3\ge 0\), and (6) is equivalent to \(\Delta (t)\ge c(a_1(t) a_2(t)-9 a_3(t))^2.\)
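The displayed expression for \(\Delta (t)\) is the standard discriminant of the cubic \(\tau ^3+a_1\tau ^2+a_2\tau +a_3\), and one can confirm numerically that it coincides with \(\prod _{i<j}(\lambda _i-\lambda _j)^2\) when the coefficients are produced from the roots by Vieta's formulas (a plain Python sketch; the names are ours):

```python
def cubic_discriminant(a1, a2, a3):
    """Discriminant of tau^3 + a1*tau^2 + a2*tau + a3."""
    return -4*a2**3 - 27*a3**2 + a1**2*a2**2 - 4*a1**3*a3 + 18*a1*a2*a3

r1, r2, r3 = 1.0, 2.0, 4.0        # sample real roots
a1 = -(r1 + r2 + r3)              # Vieta's formulas
a2 = r1*r2 + r1*r3 + r2*r3
a3 = -(r1 * r2 * r3)
delta = cubic_discriminant(a1, a2, a3)
product = (r1 - r2)**2 * (r1 - r3)**2 * (r2 - r3)**2
```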

Since the hyperbolic equations above have homogeneous symbols, the coefficients are real. We refer to Colombini–Orrú [7] and Kinoshita–Spagnolo [13] for more examples of equations without lower order terms in one dimension \(n=1\).

We now give further examples, corresponding to the new possibilities, ensured by Theorems 2 and 3, of considering equations with lower order terms and equations in higher dimensions \(n\ge 1\). However, we note that in the case of second order equations much stronger results can be obtained, see Remark 3, so Examples 1 and 2 serve primarily to demonstrate the meaning of conditions (7).

2.1 Example 1

As a first example we consider the second order equation

$$\begin{aligned} D_t^2 u+a_{2,2}(t)D_x^2u+a_{2,1}(t)D_xu+a_{1,0}(t)D_tu+a_{2,0}(t)u=0, \end{aligned}$$

for \(t\in [0,T]\) and \(x\in \mathbb R \). Assume \(a_{2,2}(t)\) is real and \(a_{2,2}(t)\le 0\). The condition (6) is trivially satisfied by the roots

$$\begin{aligned} \lambda _1(t,\xi )&= -\sqrt{-a_{2,2}(t)}|\xi |,\nonumber \\ \lambda _2(t,\xi )&= +\sqrt{-a_{2,2}(t)}|\xi |. \end{aligned}$$

The well-posedness results of Sect. 1 are obtained under the conditions (7) on the lower order terms. In this case (7) means that the coefficient \(a_{1,0}(t)\) is bounded on \([0,T]\) and that there exists a constant \(c>0\) such that \(|a_{2,1}(t)\xi +a_{2,0}(t)|^2\le -c\, a_{2,2}(t)\xi ^2\) for all \(t\in [0,T]\) and for \(\xi \) away from \(0\). Note that this last condition holds if \(|a_{2,1}(t)|^2+|a_{2,0}(t)|^2\le -c^{\prime }a_{2,2}(t)\) for some \(c^{\prime }>0\) on the \(t\)-interval \([0,T]\).
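For instance, with the (hypothetical) degenerate choice \(a_{2,2}(t)=-t^4\), \(a_{2,1}(t)=t^2\), \(a_{2,0}(t)=\frac{1}{2}t^3\) on \([0,1]\), the sufficient condition above holds with \(c^{\prime }=2\), as a quick numerical check confirms (plain Python, illustrative only):

```python
def a22(t): return -t**4       # principal coefficient, vanishing at t = 0
def a21(t): return t**2        # lower order coefficients vanishing fast enough
def a20(t): return 0.5 * t**3

ts = [k / 200 for k in range(201)]   # sample the t-interval [0, 1]
# |a21|^2 + |a20|^2 <= -c' * a22 with c' = 2:  t^4 + t^6/4 <= 2 t^4 on [0, 1]
ok = all(a21(t)**2 + a20(t)**2 <= 2 * (-a22(t)) for t in ts)
```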

Take now the general second order equation

$$\begin{aligned} D_t^2 u+a_{1,1}(t)D_xD_tu+a_{2,2}(t)D_x^2u+a_{2,1}(t)D_xu+a_{1,0}(t)D_tu+a_{2,0}(t)u=0. \end{aligned}$$

As observed above, condition (6) coincides with the lower bound

$$\begin{aligned} \Delta (t)=a_{1,1}^2(t)-4a_{2,2}(t)\ge c_0 a_{1,1}^2(t), \end{aligned}$$

valid for some \(c_0>0\) on \([0,T]\) ([13, (15)]). Here \(a_{1,1}, a_{2,2}\) are assumed real. The conditions (7) on the lower order terms are of the type \(|a_{1,0}(t)|\le c_1\) and \(|a_{2,1}(t)\xi +a_{2,0}(t)|^2\le c_2(a_{1,1}^2(t)-2a_{2,2}(t))\xi ^2\) for all \(t\in [0,T]\) and for \(\xi \) away from \(0\).

2.2 Example 2

The equation

$$\begin{aligned} D_t^2 u&+ \sum _{j=1}^n a_{1,j}(t)D_{x_j}D_tu+a_{2}(t)\sum _{j=1}^nD_{x_j}^2u\\&+ \sum _{j=1}^n b_j(t)D_{x_j}u +b(t)D_tu+d(t)u=0 \end{aligned}$$

is an \(n\)-dimensional version of the previous example, with real \(a_{1,j}\) and \(a_{2}\). The condition (6) is trivially satisfied when \(a_2(t)\le 0\). The conditions (7) on the lower order terms are as follows:

$$\begin{aligned} |b(t)|&\le c, \\ \left|\sum _{j=1}^n b_j(t)\xi _j+d(t)\right|^2&\le c\left[\left(\sum _{j=1}^n a_{1,j}(t)\xi _j\right)^2-2a_2(t)|\xi |^2\right], \end{aligned}$$

for \(t\in [0,T]\) and \(\xi \) away from \(0\).

2.3 Example 3

We finally give an example of a higher order equation. Let

$$\begin{aligned} D_t^3u&-(a+b+c)D_xD_t^2u+(ab+ac+bc)D_x^2D_tu-abcD_x^3u\\&+ \sum _{l<3}a_{3,l}(t)D^l_xu+\sum _{l<2}a_{2,l}(t)D_x^l D_tu+a_{1,0}(t)D_t^2u=0, \end{aligned}$$

where \(a(t), b(t)\) and \(c(t)\) are real-valued functions with \(b\) and \(c\) bounded from above and below by positive multiples of \(a\) (for instance, \(a(t)/4\le b(t)\le a(t)/2\) and \(a(t)/16\le c(t)\le a(t)/8\) for all \(t\in [0,T]\)). It follows that condition (6) on the roots \(\lambda _1(t,\xi )=a(t)\xi , \lambda _2(t,\xi )=b(t)\xi \) and \(\lambda _3(t,\xi )=c(t)\xi \) is fulfilled on \([0,T]\) for all \(\xi \in \mathbb R \). The Levi conditions (7) on the lower order terms are of the following type:

$$\begin{aligned} |a_{3,2}(t)\xi ^2+a_{3,1}(t)\xi +a_{3,0}(t)|^2&\le c\, a^4(t)\xi ^4,\\ |a_{2,1}(t)\xi +a_{2,0}(t)|^2&\le c\, a^2(t)\xi ^2,\\ |a_{1,0}(t)|^2&\le c, \end{aligned}$$

for \(t\in [0,T]\) and \(\xi \) away from \(0\).

3 The quasi-symmetriser

We begin by recalling a few facts concerning the quasi-symmetriser. For more details see [10, 13]. Note that for \(m\times m\) matrices \(A_1\) and \(A_2\) the notation \(A_1\le A_2\) means \((A_1v,v)\le (A_2v,v)\) for all \(v\in \mathbb C ^m\) with \((\cdot ,\cdot )\) the scalar product in \(\mathbb C ^m\). Let \(A(\lambda )\) be the \(m\times m\) Sylvester matrix with real eigenvalues \(\lambda _l\), i.e.,

$$\begin{aligned} A(\lambda )=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 0&1&0&\ldots&0\\ 0&0&1&\ldots&0 \\ \ldots&\ldots&\ldots&\ldots&1 \\ -\sigma _m^{(m)}(\lambda )&-\sigma _{m-1}^{(m)}(\lambda )&\ldots&\ldots&-\sigma _1^{(m)}(\lambda ) \\ \end{array} \right), \end{aligned}$$

where

$$\begin{aligned} \sigma _h^{(m)}(\lambda )=(-1)^h\sum _{1\le i_1<\cdots <i_h\le m}\lambda _{i_1}\ldots \lambda _{i_h} \end{aligned}$$

for all \(1\le h\le m\). In the sequel we make use of the following notations: \(\mathcal P _m\) for the class of permutations of \(\{1,\ldots ,m\}, \lambda _\rho =(\lambda _{\rho _1},\ldots ,\lambda _{\rho _m})\) with \(\lambda \in \mathbb R ^m\) and \(\rho \in \mathcal P _m, \pi _i\lambda =(\lambda _1,\ldots ,\lambda _{i-1},\lambda _{i+1},\ldots ,\lambda _m)\) and \(\lambda ^{\prime }=\pi _m\lambda =(\lambda _1,\ldots ,\lambda _{m-1})\). Following Sect. 4 in [13] we have that the quasi-symmetriser is the Hermitian matrix

$$\begin{aligned} Q^{(m)}_\varepsilon (\lambda )=\sum _{\rho \in \mathcal P _m} P_\varepsilon ^{(m)}(\lambda _\rho )^*P_\varepsilon ^{(m)}(\lambda _\rho ), \end{aligned}$$

where \(\varepsilon \in (0,1], P_\varepsilon ^{(m)}(\lambda )=H^{(m)}_\varepsilon P^{(m)}(\lambda ), H_\varepsilon ^{(m)}=\mathrm{diag}\{\varepsilon ^{m-1},\ldots ,\varepsilon ,1\}\) and the matrix \(P^{(m)}(\lambda )\) is defined inductively by \(P^{(1)}(\lambda )=1\) and

$$\begin{aligned} P^{(m)}(\lambda )=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} &&&0\\ &P^{(m-1)}(\lambda ^{\prime })&&\vdots \\ &&&0 \\ \sigma _{m-1}^{(m-1)}(\lambda ^{\prime })&\ldots&\sigma _1^{(m-1)}(\lambda ^{\prime })&1 \\ \end{array} \right). \end{aligned}$$

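As a sanity check on the sign conventions in \(A(\lambda )\): its characteristic polynomial is \(\tau ^m+\sum _{h=1}^m\sigma _h^{(m)}(\lambda )\tau ^{m-h}=\prod _{l=1}^m(\tau -\lambda _l)\), so the \(\lambda _l\) are precisely the eigenvalues. This is easy to confirm numerically (a plain Python sketch; the function names are ours):

```python
from itertools import combinations
from math import prod

def sigma(lams, h):
    """sigma_h^{(m)}: signed elementary symmetric function of the roots."""
    return (-1) ** h * sum(prod(c) for c in combinations(lams, h))

def char_poly(lams, tau):
    """Characteristic polynomial of the Sylvester matrix A(lambda), evaluated at tau."""
    m = len(lams)
    return tau ** m + sum(sigma(lams, h) * tau ** (m - h) for h in range(1, m + 1))

lams = [1.0, 3.0, -2.0]
values = [char_poly(lams, lam) for lam in lams]   # each root annihilates the polynomial
```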
Note that \(P^{(m)}(\lambda )\) depends only on \(\lambda ^{\prime }\). Finally, let \(W^{(m)}_i(\lambda )\) denote the row vector

$$\begin{aligned} (\sigma _{m-1}^{(m-1)}(\pi _i\lambda ),\ldots ,\sigma _1^{(m-1)}(\pi _i\lambda ),1),\quad 1\le i\le m, \end{aligned}$$

and let \(\mathcal W ^{(m)}(\lambda )\) be the matrix with row vectors \(W^{(m)}_i\). The following proposition collects the main properties of the quasi-symmetriser \(Q^{(m)}_\varepsilon (\lambda )\). For a detailed proof we refer the reader to Propositions 1 and 2 in [13] and to Proposition 1 in [10].

Proposition 1

  1. (i)

    The quasi-symmetriser \(Q_\varepsilon ^{(m)}(\lambda )\) can be written as

    $$\begin{aligned} Q_0^{(m)}(\lambda )+\varepsilon ^2 Q_1^{(m)}(\lambda )+\cdots +\varepsilon ^{2(m-1)}Q_{m-1}^{(m)}(\lambda ), \end{aligned}$$

where the matrices \(Q^{(m)}_i(\lambda ), i=0,\ldots ,m-1,\) are nonnegative and Hermitian with entries being symmetric polynomials in \(\lambda _1,\ldots ,\lambda _m\).

  2. (ii)

    There exists a function \(C_m(\lambda )\) bounded for bounded \(|\lambda |\) such that

    $$\begin{aligned} C_m(\lambda )^{-1}\varepsilon ^{2(m-1)}I\le Q^{(m)}_\varepsilon (\lambda )\le C_m(\lambda )I. \end{aligned}$$
  3. (iii)

    We have

    $$\begin{aligned} -C_m(\lambda )\varepsilon Q_\varepsilon ^{(m)}(\lambda )\le Q_\varepsilon ^{(m)}(\lambda )A(\lambda )-A(\lambda )^*Q_\varepsilon ^{(m)}(\lambda )\le C_m(\lambda )\varepsilon Q_\varepsilon ^{(m)}(\lambda ). \end{aligned}$$
  4. (iv)

    For any \((m-1)\times (m-1)\) matrix \(T\) let \(T^\sharp \) denote the \(m\times m\) matrix

    $$\begin{aligned} \left( \begin{array}{ll} T&0\\ 0&0 \end{array} \right)\!. \end{aligned}$$

    Then, \(Q_\varepsilon ^{(m)}(\lambda )=Q_0^{(m)}(\lambda )+\varepsilon ^2\sum _{i=1}^m Q_\varepsilon ^{(m-1)}(\pi _i\lambda )^\sharp \).

  5. (v)

    We have

    $$\begin{aligned} Q_0^{(m)}(\lambda )=(m-1)!\mathcal W ^{(m)}(\lambda )^*\mathcal W ^{(m)}(\lambda ). \end{aligned}$$
  6. (vi)

    We have

    $$\begin{aligned} \det Q_0^{(m)}(\lambda )=(m-1)!\prod _{1\le i<j\le m}(\lambda _i-\lambda _j)^2. \end{aligned}$$
  7. (vii)

    There exists a constant \(C_m\) such that

    $$\begin{aligned} q_{0,11}^{(m)}(\lambda )\cdots q_{0,mm}^{(m)}(\lambda )\le C_m\prod _{1\le i<j\le m}(\lambda ^2_i+\lambda ^2_j). \end{aligned}$$

We finally recall that a family \(\{Q_\alpha \}\) of nonnegative Hermitian matrices is called nearly diagonal if there exists a positive constant \(c_0\) such that

$$\begin{aligned} Q_\alpha \ge c_0\,\mathrm{diag}\,Q_\alpha \end{aligned}$$

for all \(\alpha \), with \(\mathrm{diag}\,Q_\alpha =\mathrm{diag}\{q_{\alpha ,11},\ldots ,q_{\alpha , mm}\}\). The following linear algebra result is proven in [13, Lemma 1].

Lemma 1

Let \(\{Q_\alpha \}\) be a family of nonnegative Hermitian \(m\times m\) matrices such that \(\det Q_\alpha >0\) and

$$\begin{aligned} \det Q_\alpha \ge c\, q_{\alpha ,11}q_{\alpha ,22}\cdots q_{\alpha ,mm} \end{aligned}$$

for a certain constant \(c>0\) independent of \(\alpha \). Then,

$$\begin{aligned} Q_\alpha \ge c\, m^{1-m}\,\mathrm{diag}\,Q_\alpha \end{aligned}$$

for all \(\alpha \), i.e., the family \(\{Q_\alpha \}\) is nearly diagonal.

Lemma 1 is employed to prove that the family \(Q_\varepsilon ^{(m)}(\lambda )\) of quasi-symmetrisers defined above is nearly diagonal when \(\lambda \) belongs to a suitable set. The following statement is proven in [13, Proposition 3].

Proposition 2

For any \(M>0\) define the set

$$\begin{aligned} \mathcal S _M=\{\lambda \in \mathbb R ^m:\, \lambda _i^2+\lambda _j^2\le M (\lambda _i-\lambda _j)^2,\quad 1\le i<j\le m\}. \end{aligned}$$

Then the family of matrices \(\{Q_\varepsilon ^{(m)}(\lambda ):\, 0<\varepsilon \le 1, \lambda \in \mathcal S _M\}\) is nearly diagonal.
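For \(m=2\) a direct computation from the definitions gives

$$\begin{aligned} Q^{(2)}_\varepsilon (\lambda )=\left( \begin{array}{c@{\quad }c} 2\varepsilon ^2+\lambda _1^2+\lambda _2^2&-(\lambda _1+\lambda _2)\\ -(\lambda _1+\lambda _2)&2 \end{array} \right), \end{aligned}$$

and near diagonality can be observed numerically: for a \(2\times 2\) Hermitian matrix the best constant is \(c_0=1-|q_{12}|/\sqrt{q_{11}q_{22}}\). With the roots \(\lambda _1=t, \lambda _2=2t\) (which lie in \(\mathcal S _M\) for \(M=5\)), this constant stays bounded below by \(1-3/\sqrt{10}\approx 0.0513\), uniformly in \(\varepsilon \) and \(t\) (a plain Python sketch, illustrative only and not part of the proof):

```python
from math import sqrt

def c0(lam1, lam2, eps):
    """Best near-diagonality constant for the explicit 2x2 quasi-symmetriser."""
    q11 = 2 * eps**2 + lam1**2 + lam2**2
    q12 = -(lam1 + lam2)
    q22 = 2.0
    return 1.0 - abs(q12) / sqrt(q11 * q22)

vals = [c0(t, 2 * t, eps)
        for t in [k / 50 for k in range(1, 51)]
        for eps in [1.0, 0.1, 0.01]]
uniform_lower = min(vals)    # stays above 1 - 3/sqrt(10), uniformly in eps and t
```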

We conclude this section with a result on nearly diagonal matrices depending on 3 parameters (i.e. \(\varepsilon , t, \xi \)), which will be crucial in the next section. Note that this is a straightforward extension of Lemma 2 in [13], valid for matrices depending on 2 parameters (i.e. \(\varepsilon , t\)).

Lemma 2

Let \(\{ Q^{(m)}_\varepsilon (t,\xi ): 0<\varepsilon \le 1, 0\le t\le T, \xi \in \mathbb R ^n\}\) be a nearly diagonal family of coercive Hermitian matrices of class \({C}^k\) in \(t\), \(k\ge 1\). Then, there exists a constant \(C_T>0\) such that for any continuous function \(V:[0,T]\times \mathbb R ^n\rightarrow \mathbb C ^m\) we have

$$\begin{aligned} \int \limits _{0}^T \frac{|(\partial _t Q^{(m)}_\varepsilon (t,\xi ) V(t,\xi ),V(t,\xi ))|}{(Q^{(m)}_\varepsilon (t,\xi )V(t,\xi ),V(t,\xi ))^{1-1/k} |V(t,\xi )|^{2/k}}\, dt\le C_T \Vert Q^{(m)}_\varepsilon (\cdot ,\xi )\Vert ^{1/k}_{{C}^k([0,T])} \end{aligned}$$

for all \(\xi \in \mathbb{R }^n.\)

4 Reduction to a first order system and energy estimate

We now go back to the Cauchy problem (1) and perform a reduction of the \(m\)th-order equation to a first order system as in [19]. Let \(\langle D_x \rangle \) be the pseudo-differential operator with symbol \(\langle \xi \rangle =(1+|\xi |^2)^{\frac{1}{2}}\). The transformation

$$\begin{aligned} u_j=D_t^{j-1}\langle D_x \rangle ^{m-j}u, \end{aligned}$$

with \(j=1,\ldots ,m\), makes the Cauchy problem (1) equivalent to the following system

$$\begin{aligned} D_t\left( \begin{array}{c} u_1 \\ \cdot \\ \cdot \\ u_m \\ \end{array} \right) = \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 0&\langle D_x \rangle&0&\ldots&0\\ 0&0&\langle D_x \rangle&\ldots&0 \\ \ldots&\ldots&\ldots&\ldots&\langle D_x \rangle \\ b_1&b_2&\ldots&\ldots&b_m \\ \end{array} \right) \left(\begin{array}{c} u_1 \\ \cdot \\ \cdot \\ u_m \end{array} \right), \end{aligned}$$
(11)

where

$$\begin{aligned} b_j=-A_{m-j+1}(t,D_x)\langle D_x \rangle ^{j-m}, \end{aligned}$$

with initial condition

$$\begin{aligned} u_j|_{t=0}=\langle D_x \rangle ^{m-j}g_j,\quad j=1,\ldots ,m. \end{aligned}$$
(12)

The matrix in (11) can be written as \(\mathbb A _1+B\) with

$$\begin{aligned} \mathbb A _1=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 0&\langle D_x \rangle&0&\ldots&0\\ 0&0&\langle D_x \rangle&\ldots&0 \\ \ldots&\ldots&\ldots&\ldots&\langle D_x \rangle \\ b_{(1)}&b_{(2)}&\ldots&\ldots&b_{(m)} \\ \end{array} \right), \end{aligned}$$

where \(b_{(j)}=-A_{(m-j+1)}(t,D_x)\langle D_x \rangle ^{j-m}\) is the principal part of the operator \(b_{j}=-A_{m-j+1}(t,D_x)\langle D_x \rangle ^{j-m}\) and

$$\begin{aligned} B=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 0&0&0&\ldots&0\\ 0&0&0&\ldots&0 \\ \ldots&\ldots&\ldots&\ldots&0 \\ b_1-b_{(1)}&b_2-b_{(2)}&\ldots&\ldots&b_m-b_{(m)} \\ \end{array} \right). \end{aligned}$$

By Fourier transforming both sides of (11) in \(x\) we obtain the system

$$\begin{aligned} D_t V&= \mathbb A _1(t,\xi )V+B(t,\xi )V, \nonumber \\ V|_{t=0}(\xi )&= V_0(\xi ), \end{aligned}$$
(13)

where \(V\) is the \(m\)-column with entries \(v_j=\widehat{u}_j\), \(V_0\) is the \(m\)-column with entries \(v_{0,j}=\langle \xi \rangle ^{m-j}\widehat{g}_j\), and

$$\begin{aligned} \mathbb A _1(t,\xi )=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 0&\langle \xi \rangle&0&\ldots&0\\ 0&0&\langle \xi \rangle&\ldots&0 \\ \ldots&\ldots&\ldots&\ldots&\langle \xi \rangle \\ b_{(1)}(t,\xi )&b_{(2)}(t,\xi )&\ldots&\ldots&b_{(m)}(t,\xi ) \\ \end{array} \right),\ \end{aligned}$$
$$\begin{aligned} b_{(j)}(t,\xi )=-A_{(m-j+1)}(t,\xi )\langle \xi \rangle ^{j-m}, \end{aligned}$$
(14)
$$\begin{aligned} B(t,\xi )=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 0&0&0&\ldots&0\\ 0&0&0&\ldots&0 \\ \ldots&\ldots&\ldots&\ldots&0 \\ (b_1-b_{(1)})(t,\xi )&\ldots&\ldots&\ldots&(b_m-b_{(m)})(t,\xi ) \\ \end{array} \right),\nonumber \\ (b_j-b_{(j)})(t,\xi )=-(A_{m-j+1}-A_{(m-j+1)})(t,\xi )\langle \xi \rangle ^{j-m}. \end{aligned}$$
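By (2)–(3) the entries of the last row of \(\mathbb A _1\) are, up to sign, elementary symmetric functions of the roots, so the eigenvalues of \(\mathbb A _1(t,\xi )\) are exactly \(\lambda _l(t,\xi )\). A minimal numerical sanity check of this fact (illustrative only; \(\langle \xi \rangle \) is normalised to \(1\) and the sample roots are our choice, not taken from the text):

```python
import numpy as np

# Sample real characteristic roots lambda_1, ..., lambda_m (an assumption).
lams = np.array([1.0, 2.0, 4.0])
m = len(lams)
c = np.poly(lams)                 # coefficients of prod_l (tau - lambda_l)

# Companion-type matrix with <xi> = 1: ones on the superdiagonal and
# b_(j) = -A_(m-j+1) in the last row, where A_(m-j) is the coefficient
# of tau^j in the characteristic polynomial (2).
A = np.diag(np.ones(m - 1), k=1)
A[-1, :] = [-c[m - j + 1] for j in range(1, m + 1)]

eigs = np.sort(np.linalg.eigvals(A).real)
print(eigs)                        # the roots lambda_l, up to rounding
```

The same check works for any \(m\) and any choice of real roots; only the normalisation \(\langle \xi \rangle =1\) (i.e. the \(0\)-order matrix \(A(t,\xi )\)) is assumed.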

From now on we will concentrate on the system (13) and on the matrix

$$\begin{aligned} A(t,\xi ):=\langle \xi \rangle ^{-1}\mathbb A _1(t,\xi ) \end{aligned}$$

for which we will construct a quasi-symmetriser. Note that the eigenvalues of the matrix \(\mathbb A _1\) are exactly the roots \(\lambda _l(t,\xi ), l=1,\ldots ,m\). It is clear that the condition (6) holds for the eigenvalues \(\langle \xi \rangle ^{-1}\lambda _l(t,\xi )\) of the \(0\)-order matrix \(A(t,\xi )\) as well. Let us define the energy

$$\begin{aligned} E_\varepsilon (t,\xi )=(Q^{(m)}_\varepsilon (t,\xi ) V(t,\xi ), V(t,\xi )). \end{aligned}$$

We have

$$\begin{aligned} \partial _t E_\varepsilon (t,\xi )&= (\partial _tQ^{(m)}_\varepsilon V,V)+ i(Q^{(m)}_\varepsilon D_tV,V)-i(Q^{(m)}_\varepsilon V,D_tV)\\&= (\partial _tQ^{(m)}_\varepsilon V,V)+i(Q^{(m)}_\varepsilon (\mathbb A _1V+BV),V)-i(Q^{(m)}_\varepsilon V,\mathbb A _1V+BV)\\&= (\partial _tQ^{(m)}_\varepsilon V,V)+i\langle \xi \rangle ((Q^{(m)}_\varepsilon A-A^*Q^{(m)}_\varepsilon )V,V)\\&\quad +i((Q^{(m)}_\varepsilon B-B^*Q^{(m)}_\varepsilon )V,V).\end{aligned}$$

It follows that

$$\begin{aligned} \partial _t E_\varepsilon (t,\xi )&\le \frac{|(\partial _tQ^{(m)}_\varepsilon V,V)|E_\varepsilon }{(Q^{(m)}_\varepsilon (t,\xi ) V(t,\xi ), V(t,\xi ))}+|\langle \xi \rangle ((Q^{(m)}_\varepsilon A-A^*Q^{(m)}_\varepsilon )V,V)|\nonumber \\&\quad +|((Q^{(m)}_\varepsilon B-B^*Q^{(m)}_\varepsilon )V,V)|.\end{aligned}$$
(15)

We recall that from Proposition 1, \(Q_\varepsilon ^{(m)}(t,\xi )\) is a family of smooth nonnegative Hermitian matrices such that

$$\begin{aligned} Q_\varepsilon ^{(m)}(t,\xi )=Q_0^{(m)}(t,\xi )+\varepsilon ^2 Q_1^{(m)}(t,\xi )+\cdots +\varepsilon ^{2(m-1)}Q_{m-1}^{(m)}(t,\xi ). \end{aligned}$$
(16)

In addition there exists a constant \(C_m>0\) such that for all \(t\in [0,T], \xi \in \mathbb R ^n\) and \(\varepsilon \in (0,1]\) the following estimates hold uniformly in \(V\):

$$\begin{aligned}&C_m^{-1}\varepsilon ^{2(m-1)}|V|^2\le (Q^{(m)}_\varepsilon (t,\xi )V,V)\le C_m|V|^2, \end{aligned}$$
(17)
$$\begin{aligned}&|((Q_\varepsilon ^{(m)}A-A^*Q_\varepsilon ^{(m)})(t,\xi )V,V)|\le C_m\varepsilon (Q_\varepsilon ^{(m)}(t,\xi )V,V). \end{aligned}$$
(18)

Finally, condition (6) and Proposition 2 ensure that the family

$$\begin{aligned} \{ Q_\varepsilon ^{(m)}(t,\xi ):\, \varepsilon \in (0,1],\, t\in [0,T],\, \xi \in \mathbb R ^n\} \end{aligned}$$

is nearly diagonal.
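To fix ideas, here is a small numerical sketch for \(m=2\) of the properties just listed, with \(Q_0^{(2)}=\mathcal W ^{*}\mathcal W \) built from the rows \((\sigma _1^{(1)}(\pi _i\lambda ),1)\) and with the normalisation \(Q_\varepsilon ^{(2)}=Q_0^{(2)}+\varepsilon ^2\,\mathrm{diag}(2,0)\); the sample roots and this form of \(Q_1^{(2)}\) are our assumptions (following the usual construction), not taken from the text:

```python
import numpy as np

l1, l2, eps = 1.0, 3.0, 0.1                       # sample roots, sample epsilon
W = np.array([[-l2, 1.0], [-l1, 1.0]])            # rows (sigma_1(pi_i lambda), 1)
Q0 = W.T @ W                                      # Q_0^{(2)}, Hermitian and >= 0
Qeps = Q0 + eps**2 * np.diag([2.0, 0.0])          # assumed form of Q_eps^{(2)}
A = np.array([[0.0, 1.0], [-l1 * l2, l1 + l2]])   # 0-order companion matrix

S0 = Q0 @ A - A.T @ Q0        # Q_0 symmetrises A exactly: S0 = 0
Seps = Qeps @ A - A.T @ Qeps  # only the eps^2-perturbation survives
```

Here \(Q_0\) symmetrises \(A\) exactly, so the non-symmetric remainder in \(Q_\varepsilon A-A^*Q_\varepsilon \) comes entirely from the \(\varepsilon ^2\)-term, which is consistent with the bound (18).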

In the sequel we assume that the coefficients \(a_{m-j,\gamma }\) in Eq. (1) are of class \({C}^k\), or in other words that the matrix \(A(t,\xi )\) has entries of class \({C}^k\) in \(t\in [0,T]\). It follows by construction that the quasi-symmetriser has the same regularity property. We now estimate the three terms of the right-hand side of (15).

4.1 First term

We write \(\frac{|(\partial _tQ^{(m)}_\varepsilon V,V)|}{(Q^{(m)}_\varepsilon (t,\xi ) V(t,\xi ), V(t,\xi ))}\) as

$$\begin{aligned} \frac{|(\partial _tQ^{(m)}_\varepsilon V,V)|}{(Q^{(m)}_\varepsilon (t,\xi ) V(t,\xi ), V(t,\xi ))^{1-1/k}(Q^{(m)}_\varepsilon (t,\xi ) V(t,\xi ), V(t,\xi ))^{1/k}}. \end{aligned}$$

From (17) we have

$$\begin{aligned}&\frac{|(\partial _tQ^{(m)}_\varepsilon V,V)|}{(Q^{(m)}_\varepsilon (t,\xi ) V(t,\xi ), V(t,\xi ))}\\&\quad \le \frac{|(\partial _tQ^{(m)}_\varepsilon V,V)|}{(Q^{(m)}_\varepsilon (t,\xi ) V(t,\xi ), V(t,\xi ))^{1-1/k}(C_m^{-1}\varepsilon ^{2(m-1)}|V|^2)^{1/k}}\\&\quad \le C_m^{1/k}\varepsilon ^{-2(m-1)/k}\frac{|(\partial _tQ^{(m)}_\varepsilon V,V)|}{(Q^{(m)}_\varepsilon (t,\xi ) V(t,\xi ), V(t,\xi ))^{1-1/k}|V|^{2/k}}. \end{aligned}$$

An application of Lemma 2 yields the estimate

$$\begin{aligned}&\int _{0}^T\frac{|(\partial _tQ^{(m)}_\varepsilon V,V)|}{(Q^{(m)}_\varepsilon (t,\xi ) V(t,\xi ), V(t,\xi ))}\, dt\\&\quad \le C_m^{1/k}\varepsilon ^{-2(m-1)/k}C_T\sup _{\xi \in \mathbb R ^n}\Vert Q_\varepsilon (\cdot ,\xi )\Vert ^{1/k}_{{C}^k([0,T])}\\&\quad \le C_1\varepsilon ^{-2(m-1)/k}, \end{aligned}$$

for all \(\varepsilon \in (0,1]\). Setting \(\frac{|(\partial _tQ^{(m)}_\varepsilon V,V)|}{(Q^{(m)}_\varepsilon (t,\xi ) V(t,\xi ), V(t,\xi ))}=K_\varepsilon (t,\xi )\) we conclude that

$$\begin{aligned} \frac{|(\partial _tQ^{(m)}_\varepsilon V,V)|E_\varepsilon }{(Q^{(m)}_\varepsilon (t,\xi ) V(t,\xi ), V(t,\xi ))}=K_\varepsilon (t,\xi )E_\varepsilon , \end{aligned}$$
(19)

with

$$\begin{aligned} \int _{0}^T K_\varepsilon (t,\xi )\, dt\le C_1\varepsilon ^{-2(m-1)/k}. \end{aligned}$$
(20)

4.2 Second term

From the property (18) we have that

$$\begin{aligned} |\langle \xi \rangle ((Q^{(m)}_\varepsilon A-A^*Q^{(m)}_\varepsilon )V,V)|\le C_m\varepsilon \langle \xi \rangle (Q_\varepsilon ^{(m)}(t,\xi )V,V)\le C_2\varepsilon \langle \xi \rangle E_\varepsilon . \end{aligned}$$

4.3 Third term

We now concentrate on

$$\begin{aligned} ((Q^{(m)}_\varepsilon B-B^*Q^{(m)}_\varepsilon )V,V), \end{aligned}$$

which is the main task in this paper. By Proposition 1(iv) and the definition of the matrix \(B(t,\xi )\) we have that

$$\begin{aligned}&((Q^{(m)}_\varepsilon B-B^*Q^{(m)}_\varepsilon )V,V)=((Q_0^{(m)}B-B^*Q_0^{(m)})V,V)\\&\quad +\,\varepsilon ^2\sum _{i=1}^m((Q^{(m-1)}_\varepsilon (\pi _i\lambda )^\sharp B-B^*Q^{(m-1)}_\varepsilon (\pi _i\lambda )^\sharp )V,V), \end{aligned}$$

where we notice that \((Q^{(m-1)}_\varepsilon (\pi _i\lambda )^\sharp B-B^*Q^{(m-1)}_\varepsilon (\pi _i\lambda )^\sharp )=0\) due to the structure of zeros in \(B\) and in \(Q^{(m-1)}_\varepsilon (\pi _i\lambda )^\sharp \). Hence

$$\begin{aligned} ((Q^{(m)}_\varepsilon B-B^*Q^{(m)}_\varepsilon )V,V)=((Q_0^{(m)} B-B^*Q_0^{(m)})V,V). \end{aligned}$$

Note that from Proposition 1(i) we have that \((Q^{(m)}_0 V,V)\le E_\varepsilon \). In the next section we will show that the conditions on \(B\) corresponding to (7) imply that

$$\begin{aligned} |((Q_0^{(m)} B-B^*Q_0^{(m)})V,V)|\le C_3 (Q^{(m)}_0 V,V)\le C_3 E_\varepsilon , \end{aligned}$$
(21)

for some constant \(C_3>0\) independent of \(t\in [0,T], \xi \in \mathbb R ^n\) and \(V\in \mathbb C ^m\).

Remark 1

Note that condition (6) is trivially satisfied when the roots are distinct, i.e. in the strictly hyperbolic case. It follows that the family \(\{Q_\varepsilon (\lambda )\}\) of quasi-symmetrisers is nearly diagonal and, therefore, there exists a constant \(c_0>0\) such that \(Q^{(m)}_0\ge c_0\text{ diag}\,Q_0^{(m)}\). This means that

$$\begin{aligned} (Q_0^{(m)}V,V)\ge c_0\sum _{i=1}^m q_{0,ii}|V_i|^2 \end{aligned}$$

holds for all \(V\in \mathbb C ^m\). From the hypothesis of strict hyperbolicity it easily follows that

$$\begin{aligned} \inf _{t\in [0,T], |\xi |\ge R, i=1,\ldots ,m} q_{0,ii}(t,\xi )>0, \end{aligned}$$

for any \(R>0\). This bound from below implies

$$\begin{aligned} (Q_0^{(m)}(t,\xi )V,V)\ge c^{\prime }_0 |V|^2 \end{aligned}$$
(22)

for \(t\in [0,T]\) and \(|\xi |\ge R\) and hence the estimate

$$\begin{aligned} |((Q_0^{(m)} B-B^*Q_0^{(m)})V,V)|\le C_3 (Q_0 V,V) \end{aligned}$$

holds trivially in the strictly hyperbolic case for any lower order term \(B\) (for our purposes it will not be restrictive to assume \(|\xi |\ge R\)). Concluding, when the roots \(\lambda _i\) are distinct the Gevrey and ultradistributional well-posedness results in Theorems 2 and 3 can be stated without additional conditions on the lower order terms. Strictly hyperbolic equations under low regularity (Hölder \({C}^{\alpha }\), \(0<\alpha <1\)) of the coefficients have been analysed by the authors in [11, Case 3], to which we refer for general statements on the Gevrey and ultradistributional well-posedness in this setting.

5 Estimates for the lower order terms

We begin by rewriting \(((Q_0^{(m)} B-B^*Q_0^{(m)})V,V)\) in terms of the matrix \(\mathcal W =\mathcal W ^{(m)}\). From Proposition 1(v) we have

$$\begin{aligned} ((Q_0^{(m)} B-B^*Q_0^{(m)})V,V)&= (m-1)!((\mathcal W BV,\mathcal W V)-(\mathcal W V,\mathcal W BV))\\&= 2i(m-1)!\mathfrak I (\mathcal W BV,\mathcal W V). \end{aligned}$$

It follows that

$$\begin{aligned} |((Q_0^{(m)} B-B^*Q_0^{(m)})V,V)|\le 2(m-1)!|\mathcal W BV||\mathcal W V|. \end{aligned}$$

Since

$$\begin{aligned} (Q_0 V,V)=(m-1)!|\mathcal W V|^2 \end{aligned}$$

we have that if

$$\begin{aligned} |\mathcal W BV|\le C|\mathcal W V| \end{aligned}$$
(23)

for some constant \(C>0\) independent of \(t, \xi \) and \(V\), then the condition (21) will hold. It is our task to show that the condition (7) on the matrix \(B\) of the lower order terms implies the estimate (23).

Before dealing with the general case of an \(m\times m\) matrix \(B\), let us consider the instructive case \(m=3\), which will illustrate the general argument in a simplified setting. In the sequel, for real-valued functions \(f\) and \(g\) (in the variable \(y\)) we write \(f(y)\prec g(y)\) if there exists a constant \(C>0\) such that \(f(y)\le C g(y)\) for all \(y\). More precisely, we will set \(y=(t,\xi )\) or \(y=(t,\xi ,V)\).

5.1 The case \(m=3\)

By definition of the row vectors \(W^{(3)}_i, i=1,2,3\), we have that

$$\begin{aligned} \mathcal W =\left( \begin{array}{lll} \lambda _2\lambda _3&-\lambda _2-\lambda _3&1\\ \lambda _3\lambda _1&-\lambda _3-\lambda _1&1\\ \lambda _1\lambda _2&-\lambda _1-\lambda _2&1 \end{array} \right), \end{aligned}$$

where \(\lambda _i, i=1,2,3\), are the 0-order normalised roots. Hence, by definition of the matrix \(B\) setting \(b_j-b_{(j)}=B_j, j=1,2,3\), we get

$$\begin{aligned} \mathcal W BV=\left( \begin{array}{l} B_1V_1+B_2V_2+B_3V_3\\ B_1V_1+B_2V_2+B_3V_3\\ B_1V_1+B_2V_2+B_3V_3 \end{array} \right) \end{aligned}$$

and

$$\begin{aligned} \mathcal W V=\left( \begin{array}{l} (\lambda _2\lambda _3)V_1-(\lambda _2+\lambda _3)V_2+V_3\\ (\lambda _3\lambda _1)V_1-(\lambda _3+\lambda _1)V_2+V_3\\ (\lambda _1\lambda _2)V_1-(\lambda _1+\lambda _2)V_2+V_3 \end{array} \right). \end{aligned}$$
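The rows of \(\mathcal W \) are precisely the coefficient vectors of the reduced characteristic polynomials: \(\sum _{j=1}^{3}\mathcal W _{ij}\tau ^{j-1}=\prod _{l\ne i}(\tau -\lambda _l)\). A quick symbolic check of this fact (sympy assumed; not part of the original argument):

```python
import sympy as sp

t, l1, l2, l3 = sp.symbols('tau lambda1 lambda2 lambda3')
lam = [l1, l2, l3]
Wm = sp.Matrix([[l2*l3, -l2 - l3, 1],
                [l3*l1, -l3 - l1, 1],
                [l1*l2, -l1 - l2, 1]])

# Row i of W encodes the reduced polynomial prod_{l != i} (tau - lambda_l).
ok = all(
    sp.expand(sum(Wm[i, j] * t**j for j in range(3))
              - sp.prod([(t - lam[l]) for l in range(3) if l != i])) == 0
    for i in range(3)
)
```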

Thus, instead of working on proving (23) we can work on the equivalent inequality

$$\begin{aligned} |B_1V_1+B_2V_2+B_3V_3|^2&\prec |(\lambda _2\lambda _3)V_1-(\lambda _2+\lambda _3)V_2+V_3|^2\nonumber \\&+ |(\lambda _3\lambda _1)V_1-(\lambda _3+\lambda _1)V_2+V_3|^2 \nonumber \\&+ |(\lambda _1\lambda _2)V_1-(\lambda _1+\lambda _2)V_2+V_3|^2. \end{aligned}$$
(24)

In terms of the coefficients of the matrix \(B\) the Levi conditions (9) on the lower order terms can be written as

$$\begin{aligned}&|B_1|^2\prec \lambda _1^2\lambda _2^2+\lambda _2^2\lambda ^2_3+\lambda _3^2\lambda _1^2,\nonumber \\&|B_2|^2\prec (\lambda _1+\lambda _2)^2+(\lambda _2+\lambda _3)^2+(\lambda _3+\lambda _1)^2,\\&|B_3|^2\prec c.\nonumber \end{aligned}$$
(25)

Under these conditions we now want to prove that (24) holds for all vectors \(V\). Note that, by the triangle inequality, the right-hand side of (24) admits the upper bound

$$\begin{aligned}&|(\lambda _2\lambda _3)V_1\!-\!(\lambda _2\!+\!\lambda _3)V_2\!+\!V_3|^2\!+\! |(\lambda _3\lambda _1)V_1\!-\!(\lambda _3\!+\!\lambda _1)V_2\!+\!V_3|^2 \\&\qquad \!+|(\lambda _1\lambda _2)V_1\!-\!(\lambda _1\!+\!\lambda _2)V_2\!+\!V_3|^2\\&\quad \prec (\lambda _1^2\lambda _2^2\!+\!\lambda _2^2\lambda ^2_3\!+\!\lambda _3^2\lambda _1^2)|V_1|^2 \!+\!((\lambda _1\!+\!\lambda _2)^2\!+\!(\lambda _2\!+\!\lambda _3)^2\!+\!(\lambda _3\!+\!\lambda _1)^2)|V_2|^2\!+\!|V_3|^2, \end{aligned}$$

in which the right hand side of (25) appears naturally.

Our strategy is to proceed in three steps, making use of the following partition of \(\mathbb C ^3\):

$$\begin{aligned} \mathbb C ^3=\Sigma _1^{\delta _1} \cup \big (\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \Sigma _2^{\delta _2}\big )\cup \big (\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}}\big ), \end{aligned}$$

where

$$\begin{aligned} \Sigma _1^{\delta _1}&:= \{ V\in \mathbb C ^3:\,|V_3|^2+((\lambda _1+\lambda _2)^2+(\lambda _2+\lambda _3)^2+(\lambda _3+\lambda _1)^2)|V_2|^2\\&\le \delta _1(\lambda _1^2\lambda _2^2+\lambda _2^2\lambda ^2_3+\lambda _3^2\lambda _1^2)|V_1|^2\}, \end{aligned}$$

and

$$\begin{aligned} \Sigma _2^{\delta _2}:=\{V\in \mathbb C ^3:\,|V_3|^2\le \delta _2((\lambda _1+\lambda _2)^2+ (\lambda _2+\lambda _3)^2+(\lambda _3+\lambda _1)^2)|V_2|^2\}. \end{aligned}$$

Estimate on \(\Sigma _1^{\delta _1}\).

Making use of the conditions (25) we have that

$$\begin{aligned}&|B_1V_1\!+\!B_2V_2\!+\!B_3V_3|^2\prec |B_1|^2|V_1|^2\!+\!|B_2|^2|V_2|^2\!+\!|B_3|^2|V_3|^2\nonumber \\&\quad \prec (\lambda _1^2\lambda _2^2\!+\!\lambda _2^2\lambda ^2_3\!+\!\lambda _3^2\lambda _1^2)|V_1|^2\!+\! ((\lambda _1\!+\!\lambda _2)^2\!+\!(\lambda _2\!+\!\lambda _3)^2\!+\!(\lambda _3\!+\!\lambda _1)^2)|V_2|^2\!+\!|V_3|^2\nonumber \\&\quad \prec (\lambda _1^2\lambda _2^2\!+\!\lambda _2^2\lambda ^2_3\!+\!\lambda _3^2\lambda _1^2)|V_1|^2 \end{aligned}$$
(26)

on \(\Sigma _1^{\delta _1}\). Note that we have

$$\begin{aligned}&|(\lambda _2\lambda _3)V_1-(\lambda _2+\lambda _3)V_2+V_3|^2+ |(\lambda _3\lambda _1)V_1-(\lambda _3+\lambda _1)V_2+V_3|^2\\&\quad \succ |(\lambda _2\lambda _3-\lambda _3\lambda _1)V_1-(\lambda _2-\lambda _1)V_2|^2\\&\quad \succ (\lambda _2-\lambda _1)^2|\lambda _3 V_1-V_2|^2\succ (\lambda _1^2+\lambda _2^2)|\lambda _3 V_1-V_2|^2, \end{aligned}$$

where also in the last line we make use of the condition (6) on the roots \(\lambda _i\). Hence, by applying this to different combinations of terms, we get

$$\begin{aligned}&|(\lambda _2\lambda _3)V_1-(\lambda _2+\lambda _3)V_2+V_3|^2\nonumber \\&\quad +|(\lambda _3\lambda _1)V_1-(\lambda _3+\lambda _1)V_2+V_3|^2 + |(\lambda _1\lambda _2)V_1-(\lambda _1+\lambda _2)V_2+V_3|^2 \nonumber \\&\succ (\lambda _1^2+\lambda _2^2)|\lambda _3 V_1-V_2|^2+(\lambda _2^2+\lambda _3^2)|\lambda _1 V_1-V_2|^2+(\lambda _3^2+\lambda _1^2)|\lambda _2 V_1-V_2|^2\nonumber \\&\succ \lambda _1^2(|\lambda _3 V_1-V_2|^2+|\lambda _2 V_1-V_2|^2)+\lambda _2^2(|\lambda _3 V_1-V_2|^2+|\lambda _1 V_1-V_2|^2) \nonumber \\&\quad +\lambda _3^2(|\lambda _2 V_1-V_2|^2+|\lambda _1 V_1-V_2|^2)\nonumber \\&\succ (\lambda _1^2(\lambda _3-\lambda _2)^2+\lambda _2^2(\lambda _3-\lambda _1)^2+\lambda _3^2(\lambda _2-\lambda _1)^2)|V_1|^2\nonumber \\&\succ (\lambda _1^2\lambda _2^2+\lambda _2^2\lambda ^2_3+\lambda _3^2\lambda _1^2)|V_1|^2. \end{aligned}$$
(27)

From the bound from below (27) and the estimate (26) one has that the inequality (24) holds true in the region \(\Sigma _1^{\delta _1}\) for all \(\delta _1>0\).
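The content of (27) can also be tested numerically: for fixed roots satisfying (6), minimising the left-hand side over \((V_2,V_3)\) with \(V_1=1\) is a linear least-squares problem whose residual must stay a positive multiple of \(\lambda _1^2\lambda _2^2+\lambda _2^2\lambda _3^2+\lambda _3^2\lambda _1^2\). A sketch with sample roots (our choice, not from the text):

```python
import numpy as np

# Sample roots; pairwise gaps are proportional to the sizes, so (6) holds.
l1, l2, l3 = 1.0, 2.0, 4.0
V1 = 1.0

# Each row of the LHS of (24) reads  y_i - ((l_j+l_k) V2 - V3),
# so minimising over (V2, V3) is ||M x - y||^2 with:
M = np.array([[l2 + l3, -1.0],
              [l3 + l1, -1.0],
              [l1 + l2, -1.0]])
y = np.array([l2 * l3, l3 * l1, l1 * l2]) * V1

_, res, _, _ = np.linalg.lstsq(M, y, rcond=None)   # res[0] = minimal LHS
rhs = (l1**2 * l2**2 + l2**2 * l3**2 + l3**2 * l1**2) * V1**2
ratio = res[0] / rhs                               # strictly positive
```

With these roots the minimal value is \(18/7\), so the lower bound (27) is non-vacuous; the constant, of course, depends on \(M\) in (6).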

Estimate on \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \Sigma _2^{\delta _2}.\)

We assume from now on that \(V\in \big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\) which means that

$$\begin{aligned} |V_3|^2\!+\!((\lambda _1\!+\!\lambda _2)^2\!+\!(\lambda _2\!+\!\lambda _3)^2\!+\!(\lambda _3\!+\!\lambda _1)^2)|V_2|^2> \delta _1(\lambda _1^2\lambda _2^2\!+\!\lambda _2^2\lambda ^2_3\!+\!\lambda _3^2\lambda _1^2)|V_1|^2. \end{aligned}$$

One immediately has

$$\begin{aligned}&|B_1V_1+B_2V_2+B_3V_3|^2\prec |B_1|^2|V_1|^2+|B_2|^2|V_2|^2+|B_3|^2|V_3|^2\\&\prec (\lambda _1^2\lambda _2^2\!+\!\lambda _2^2\lambda ^2_3\!+\!\lambda _3^2\lambda _1^2)|V_1|^2\!+\! ((\lambda _1\!+\!\lambda _2)^2\!+\!(\lambda _2\!+\!\lambda _3)^2\!+\!(\lambda _3\!+\!\lambda _1)^2)|V_2|^2\!+\!|V_3|^2\\&\prec ((\lambda _1+\lambda _2)^2+(\lambda _2+\lambda _3)^2+(\lambda _3+\lambda _1)^2)|V_2|^2+|V_3|^2. \end{aligned}$$

More precisely,

$$\begin{aligned} |B_1V_1+B_2V_2+B_3V_3|^2\prec ((\lambda _1+\lambda _2)^2+(\lambda _2+\lambda _3)^2+(\lambda _3+\lambda _1)^2)|V_2|^2 \end{aligned}$$
(28)

holds for all \(V\in \big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \Sigma _2^{\delta _2}\). We estimate the right-hand side of (24) as

$$\begin{aligned}&|(\lambda _2\lambda _3)V_1-(\lambda _2+\lambda _3)V_2+V_3|^2\\&\quad +|(\lambda _3\lambda _1)V_1-(\lambda _3+\lambda _1)V_2+V_3|^2 +|(\lambda _1\lambda _2)V_1-(\lambda _1+\lambda _2)V_2+V_3|^2\\&\succ \gamma _1(|(\lambda _2+\lambda _3)V_2-V_3|^2+|(\lambda _3+\lambda _1)V_2-V_3|^2+|(\lambda _1+\lambda _2)V_2-V_3|^2)\\&\quad -\gamma _2(\lambda _1^2\lambda _2^2+\lambda _2^2\lambda ^2_3+\lambda _3^2\lambda _1^2)|V_1|^2, \end{aligned}$$

for some constants \(\gamma _1,\gamma _2>0\). By using condition (6) we get the estimate

$$\begin{aligned}&(\lambda _2-\lambda _1)^2+(\lambda _3-\lambda _2)^2+(\lambda _3-\lambda _1)^2\ge \frac{2}{M}(\lambda _1^2+\lambda _2^2+\lambda _3^2)\\&\ge \frac{1}{2M}((\lambda _1+\lambda _2)^2+(\lambda _2+\lambda _3)^2+(\lambda _3+\lambda _1)^2) \end{aligned}$$

and then

$$\begin{aligned}&|(\lambda _2\lambda _3)V_1\!-\!(\lambda _2\!+\!\lambda _3)V_2\!+\!V_3|^2\\&\quad +|(\lambda _3\lambda _1)V_1-(\lambda _3+\lambda _1)V_2+V_3|^2 + |(\lambda _1\lambda _2)V_1-(\lambda _1+\lambda _2)V_2+V_3|^2\\&\succ \gamma _1((\lambda _2-\lambda _1)^2\!+\!(\lambda _3\!-\!\lambda _2)^2\!+\!(\lambda _3\!-\!\lambda _1)^2)|V_2|^2\!-\! \gamma _2(\lambda _1^2\lambda _2^2\!+\!\lambda _2^2\lambda ^2_3\!+\!\lambda _3^2\lambda _1^2)|V_1|^2\\&\succ \gamma ^{\prime }_1((\lambda _1\!+\!\lambda _2)^2\!+\!(\lambda _2\!+\!\lambda _3)^2\!+\!(\lambda _3\!+\!\lambda _1)^2)|V_2|^2\!-\! \gamma _2(\lambda _1^2\lambda _2^2\!+\!\lambda _2^2\lambda ^2_3\!+\!\lambda _3^2\lambda _1^2)|V_1|^2\\&\succ (\gamma ^{\prime }_1-\gamma _2\frac{1}{\delta _1}(\delta _2+1)) ((\lambda _1+\lambda _2)^2+(\lambda _2+\lambda _3)^2+(\lambda _3+\lambda _1)^2)|V_2|^2, \end{aligned}$$

for some constant \(\gamma _1^{\prime }>0\). Combining this with (28) we conclude that for any \(\delta _2\) and for \(\delta _1\) big enough the right-hand side of (24) can be estimated from below by \(|V_2|^2\) and, therefore, (24) holds true on \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \Sigma _2^{\delta _2}\).
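The purely algebraic ingredient of the estimate above is the identity \(\sum _{i<j}(\lambda _i-\lambda _j)^2+\sum _{i<j}(\lambda _i+\lambda _j)^2=4(\lambda _1^2+\lambda _2^2+\lambda _3^2)\), which combined with (6) produces the factor \(\frac{1}{2M}\). A symbolic verification (sympy assumed; illustrative only):

```python
import sympy as sp

l1, l2, l3 = sp.symbols('lambda1 lambda2 lambda3', real=True)
sq_diffs = (l2 - l1)**2 + (l3 - l2)**2 + (l3 - l1)**2
sq_sums = (l1 + l2)**2 + (l2 + l3)**2 + (l3 + l1)**2
sum_sq = l1**2 + l2**2 + l3**2

# sq_diffs + sq_sums = 4 * sum_sq, hence sq_sums <= 4 * sum_sq and,
# under (6), sq_diffs >= (2/M) sum_sq >= (1/(2M)) sq_sums.
identity_ok = sp.expand(sq_diffs + sq_sums - 4 * sum_sq) == 0
```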

Estimate on \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}}.\)

Since on \(\big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}}\) we have

$$\begin{aligned} |V_3|^2> \delta _2((\lambda _1+\lambda _2)^2+(\lambda _2+\lambda _3)^2+ (\lambda _3+\lambda _1)^2)|V_2|^2, \end{aligned}$$

it follows that

$$\begin{aligned} |B_1V_1+B_2V_2+B_3V_3|^2\prec |V_3|^2. \end{aligned}$$

Then, for \(V\in \big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}}\), for suitable constants \(\gamma _1, \gamma _2, \gamma _3\) (independent of \(V\)),

$$\begin{aligned}&|(\lambda _2\lambda _3)V_1-(\lambda _2+\lambda _3)V_2+V_3|^2\\&\quad +|(\lambda _3\lambda _1)V_1-(\lambda _3+\lambda _1)V_2+V_3|^2 + |(\lambda _1\lambda _2)V_1-(\lambda _1+\lambda _2)V_2+V_3|^2\\&\succ \gamma _3|V_3|^2-\gamma _2((\lambda _1+\lambda _2)^2+(\lambda _2+\lambda _3)^2+(\lambda _3+\lambda _1)^2)|V_2|^2\\&\quad -\gamma _1\frac{1}{\delta _1}(|V_3|^2+((\lambda _1+\lambda _2)^2+(\lambda _2+\lambda _3)^2+(\lambda _3+\lambda _1)^2)|V_2|^2)\\&\succ (\gamma _3-\gamma _1\frac{1}{\delta _1})|V_3|^2-(\gamma _2+\gamma _1\frac{1}{\delta _1})\frac{1}{\delta _2}|V_3|^2. \end{aligned}$$

We conclude that for \(\delta _1\) and \(\delta _2\) big enough,

$$\begin{aligned}&|(\lambda _2\lambda _3)V_1-(\lambda _2+\lambda _3)V_2+V_3|^2+ |(\lambda _3\lambda _1)V_1-(\lambda _3+\lambda _1)V_2+V_3|^2\\&\quad +|(\lambda _1\lambda _2)V_1-(\lambda _1+\lambda _2)V_2+V_3|^2\succ |V_3|^2, \end{aligned}$$

and, therefore, (24) holds in the region \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}}\).

The next table describes and summarises the proof above:

$$\begin{aligned} \begin{array}{l|l|l} \text{Region} &\text{Estimates in} &\delta _i\\ \hline \Sigma _1^{\delta _1} &|V_1|^2 &\text{any } \delta _1\\ \big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \Sigma _2^{\delta _2} &|V_2|^2 &\delta _1 \text{ big, any } \delta _2\\ \big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}} &|V_3|^2 &\delta _1 \text{ and } \delta _2 \text{ big} \end{array} \end{aligned}$$

5.2 The general case \(m\)

Inspired by the previous subsection we now deal with the inequality

$$\begin{aligned} |\mathcal W BV|\prec |\mathcal W V| \end{aligned}$$

for \(\mathcal W =\mathcal W ^{(m)}\) and arbitrary \(m\ge 2\). This is the topic of the following theorem where the coefficients \(\sigma ^{(m)}_h(\lambda )\) are defined as in Sect. 3 and \(\lambda =(\lambda _1,\lambda _2,\ldots ,\lambda _m)\in \mathbb R ^m\) is the vector of the eigenvalues of the matrix \(A(t,\xi )\) (or the 0-order normalised roots) satisfying the condition (6).

Theorem 5

Let the entries \(B_j\) of the matrix

$$\begin{aligned} B=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 0&0&0&\dots&0\\ 0&0&0&\dots&0 \\ \dots&\dots&\dots&\dots&0 \\ B_1&B_2&\dots&\dots&B_m \\ \end{array} \right) \end{aligned}$$

in (13) fulfil the condition

$$\begin{aligned} |B_j|^2\prec \sum _{i=1}^m |\sigma _{m-j}^{(m-1)}(\pi _i\lambda )|^2 \end{aligned}$$
(29)

for \(j=1,\ldots ,m\). Then we have

$$\begin{aligned} |\mathcal W BV|\prec |\mathcal W V| \end{aligned}$$

uniformly over all \(V\in \mathbb C ^m\). More precisely, define

$$\begin{aligned} \Sigma _k^{\delta _k}:&=\{V\in \mathbb C ^m:\, |V_m|^2+\sum \limits _{j=k+1}^{m-1}\sum _{i=1}^m|\sigma _{m-j}^{(m-1)}(\pi _i\lambda )|^2|V_j|^2\\&\le \delta _k\sum \limits _{i=1}^m|\sigma _{m-k}^{(m-1)}(\pi _i\lambda )|^2|V_k|^2\}, \end{aligned}$$

for \(k=1,\ldots ,m-2\), and for \(k=m-1\), define

$$\begin{aligned} \Sigma _{m-1}^{\delta _{m-1}}:=\{V\in \mathbb C ^m:\, |V_m|^2 \le \delta _{m-1}\sum \limits _{i=1}^m|\sigma _{1}^{(m-1)}(\pi _i\lambda )|^2|V_{m-1}|^2\}. \end{aligned}$$

Then, there exist suitable \(\delta _j>0, j=1,\ldots ,m-1,\) such that

$$\begin{aligned} |\mathcal W BV|^2&\prec \sum \limits _{i=1}^m |\sigma _{m-1}^{(m-1)}(\pi _i\lambda )|^2 |V_1|^2,\\ |\mathcal W V|^2&\succ \sum \limits _{i=1}^m |\sigma _{m-1}^{(m-1)}(\pi _i\lambda )|^2 |V_1|^2 \end{aligned}$$

on \(\Sigma _1^{\delta _1}\),

$$\begin{aligned} |\mathcal W BV|^2&\prec \sum \limits _{i=1}^m |\sigma _{m-k}^{(m-1)}(\pi _i\lambda )|^2 |V_k|^2,\\ |\mathcal W V|^2&\succ \sum \limits _{i=1}^m |\sigma _{m-k}^{(m-1)}(\pi _i\lambda )|^2 |V_k|^2 \end{aligned}$$

on

$$\begin{aligned} \big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}}\cap \cdots \cap \big (\Sigma _{k-1}^{\delta _{k-1}}\big )^\mathrm{{c}}\cap \Sigma _k^{\delta _k} \end{aligned}$$

for \(2\le k\le m-1\), and

$$\begin{aligned} |\mathcal W BV|^2&\prec \sum _{i=1}^m |\sigma _{0}^{(m-1)}(\pi _i\lambda )|^2 |V_m|^2,\\ |\mathcal W V|^2&\succ \sum _{i=1}^m |\sigma _{0}^{(m-1)}(\pi _i\lambda )|^2 |V_m|^2 \end{aligned}$$

on \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}}\cap \cdots \cap \big (\Sigma _{m-1}^{\delta _{m-1}}\big )^\mathrm{{c}}\).

Note that (29) is a reformulation of the condition (10) on the lower order terms. The proof of Theorem 5 makes use of the following two lemmas.

Lemma 3

For all \(i\) and \(j\) with \(1\le i,j\le m\) and \(k=1,\ldots ,m-1,\) one has

$$\begin{aligned}&\sigma _{m-k}^{(m-1)}(\pi _i\lambda )-\sigma _{m-k}^{(m-1)}(\pi _j\lambda )\nonumber \\&\quad =(-1)^{m-k}(\lambda _j-\lambda _i) \sum _{\begin{array}{c} i_h\ne i,\, i_h\ne j\\ 1\le i_1<i_2<\cdots <i_{m-k-1}\le m \end{array}} \lambda _{i_1}\lambda _{i_2}\cdots \lambda _{i_{m-k-1}}. \end{aligned}$$
(30)

Proof

By definition of \(\sigma _{m-k}^{(m-1)}(\pi _i\lambda )\) and \(\sigma _{m-k}^{(m-1)}(\pi _j\lambda )\) we have that

$$\begin{aligned} \sigma _{m-k}^{(m-1)}(\pi _i\lambda )&= (-1)^{m-k}\sum _{\begin{array}{c} 1\le l_1<l_2<\cdots <l_{m-k}\le m\\ l_h\ne i \end{array}}\lambda _{l_1}\lambda _{l_2}\cdots \lambda _{l_{m-k}}\\&= (-1)^{m-k}\sum _{\begin{array}{c} 1\le l_1<l_2<\cdots <l_{m-k}\le m\\ l_h\ne i,j \end{array}}\lambda _{l_1}\lambda _{l_2}\cdots \lambda _{l_{m-k}}\\&\quad +(-1)^{m-k}\lambda _j\sum _{\begin{array}{c} 1\le l_1<l_2<\cdots <l_{m-k-1}\le m\\ l_h\ne i,j \end{array}}\lambda _{l_1}\lambda _{l_2}\cdots \lambda _{l_{m-k-1}} \end{aligned}$$

and

$$\begin{aligned} \sigma _{m-k}^{(m-1)}(\pi _j\lambda )&= (-1)^{m-k}\sum _{\begin{array}{c} 1\le l_1<l_2<\cdots <l_{m-k}\le m\\ l_h\ne j \end{array}}\lambda _{l_1}\lambda _{l_2}\cdots \lambda _{l_{m-k}}\\&= (-1)^{m-k}\sum _{\begin{array}{c} 1\le l_1<l_2<\cdots <l_{m-k}\le m\\ l_h\ne i,j \end{array}}\lambda _{l_1}\lambda _{l_2}\cdots \lambda _{l_{m-k}}\\&+(-1)^{m-k}\lambda _i\sum _{\begin{array}{c} 1\le l_1<l_2<\cdots <l_{m-k-1}\le m\\ l_h\ne i,j \end{array}}\lambda _{l_1}\lambda _{l_2}\cdots \lambda _{l_{m-k-1}}. \end{aligned}$$

This leads immediately to the formula (30). \(\square \)
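Formula (30) can be verified symbolically; the sketch below checks it for \(m=4\), all pairs \(i\ne j\) and all \(k\) (sympy assumed; the helper names are ours):

```python
import sympy as sp
from itertools import combinations

lam = sp.symbols('lambda1:5')          # lambda_1, ..., lambda_4, i.e. m = 4
m = 4

def e(h, vals):
    # elementary symmetric polynomial e_h of the values in vals
    return sum(sp.prod(c) for c in combinations(vals, h)) if h else sp.Integer(1)

ok = True
for k in range(1, m):
    for i in range(m):
        for j in range(m):
            if i == j:
                continue
            wo_i = [lam[l] for l in range(m) if l != i]
            wo_j = [lam[l] for l in range(m) if l != j]
            wo_ij = [lam[l] for l in range(m) if l not in (i, j)]
            # sigma_{m-k}^{(m-1)}(pi_i lambda) = (-1)^{m-k} e_{m-k}(lambda w/o lambda_i)
            lhs = (-1)**(m - k) * (e(m - k, wo_i) - e(m - k, wo_j))
            rhs = (-1)**(m - k) * (lam[j] - lam[i]) * e(m - k - 1, wo_ij)
            ok = ok and sp.expand(lhs - rhs) == 0
```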

Lemma 4

For all \(k=1,\ldots ,m\), we have

$$\begin{aligned}&\sum _{i=1}^m\left|\sum _{j=k+1}^m \sigma _{m-j}^{(m-1)}(\pi _i\lambda )V_j+\sigma _{m-k}^{(m-1)}(\pi _i\lambda )V_k\right|^2\nonumber \\&\quad \succ \sum _{i=1}^m|\sigma _{m-k}^{(m-1)}(\pi _i\lambda )|^2|V_k|^2. \end{aligned}$$
(31)

Proof

We give a proof by induction on the order \(m\). For \(m=2\) the estimate above makes sense only for \(k=1\). Hence we have to prove that

$$\begin{aligned} \sum _{i=1}^2|\sigma _{0}^{(1)}(\pi _i\lambda )V_2+\sigma _{1}^{(1)}(\pi _i\lambda )V_1|^2&= \sum _{i=1}^2|V_2+\sigma _{1}^{(1)}(\pi _i\lambda )V_1|^2\\&\succ \sum _{i=1}^2|\sigma _{1}^{(1)}(\pi _i\lambda )|^2|V_1|^2. \end{aligned}$$

This is clear since by the condition (6) we have that

$$\begin{aligned} \sum _{i=1}^2|V_2+\sigma _{1}^{(1)}(\pi _i\lambda )V_1|^2&\succ |\sigma ^{(1)}_1(\pi _1\lambda )-\sigma ^{(1)}_1(\pi _2\lambda )|^2|V_1|^2=(\lambda _2-\lambda _1)^2|V_1|^2\\&\succ (\lambda _1^2+\lambda _2^2)|V_1|^2=\sum _{i=1}^2|\sigma _{1}^{(1)}(\pi _i\lambda )|^2|V_1|^2. \end{aligned}$$
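The base case admits an exact version: the minimum of \(\sum _{i=1}^2|V_2+\sigma _1^{(1)}(\pi _i\lambda )V_1|^2\) over \(V_2\) equals \(\tfrac{1}{2}(\lambda _2-\lambda _1)^2|V_1|^2\), which is precisely the quantity that condition (6) compares with \(\lambda _1^2+\lambda _2^2\). A symbolic check (sympy assumed, real variables):

```python
import sympy as sp

V1, V2, l1, l2 = sp.symbols('V1 V2 lambda1 lambda2', real=True)
# sigma_1^{(1)}(pi_1 lambda) = -lambda_2, sigma_1^{(1)}(pi_2 lambda) = -lambda_1
expr = (V2 - l2 * V1)**2 + (V2 - l1 * V1)**2

# Minimise over V2: the critical point is V2 = (lambda_1 + lambda_2) V1 / 2.
crit = sp.solve(sp.diff(expr, V2), V2)[0]
min_val = sp.simplify(expr.subs(V2, crit))
identity = sp.simplify(min_val - (l2 - l1)**2 * V1**2 / 2)   # expected: 0
```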

Assume now that (31) holds for \(m-1\). Estimating the left-hand side of (31) with the differences between two arbitrary summands we can write

$$\begin{aligned}&\sum _{i=1}^m\left|\sum _{j=k+1}^m \sigma _{m-j}^{(m-1)}(\pi _i\lambda )V_j+\sigma _{m-k}^{(m-1)}(\pi _i\lambda )V_k\right|^2\\&\quad \succ \sum _{1\le l_1\ne l_2\le m}\biggl |\sum _{j=k+1}^m (\sigma _{m-j}^{(m-1)}(\pi _{l_1}\lambda )-\sigma _{m-j}^{(m-1)}(\pi _{l_2}\lambda ))V_j\\&\qquad + (\sigma _{m-k}^{(m-1)}(\pi _{l_1}\lambda )-\sigma _{m-k}^{(m-1)}(\pi _{l_2}\lambda ))V_k\biggr |^2\\&\quad =\sum _{1\le l_1\ne l_2\le m}\biggl |\sum _{j=k+1}^{m-1} (\sigma _{m-j}^{(m-1)}(\pi _{l_1}\lambda )-\sigma _{m-j}^{(m-1)}(\pi _{l_2}\lambda ))V_j\\&\qquad + (\sigma _{m-k}^{(m-1)}(\pi _{l_1}\lambda )-\sigma _{m-k}^{(m-1)}(\pi _{l_2}\lambda ))V_k\biggr |^2, \end{aligned}$$

where the terms with \(j=m\) cancel. By applying Lemma 3 and the condition (6) we obtain the following bound from below:

$$\begin{aligned}&\sum _{i=1}^m\biggl |\sum _{j=k+1}^m \sigma _{m-j}^{(m-1)}(\pi _i\lambda )V_j+\sigma _{m-k}^{(m-1)}(\pi _i\lambda )V_k\biggr |^2\succ \sum _{1\le l_1\ne l_2\le m}\biggl |-(\lambda _{l_2}-\lambda _{l_1})V_{m-1}\\&\qquad +\sum _{j=k+1}^{m-2}(-1)^{m-j}(\lambda _{l_2}-\lambda _{l_1})\sum _{\begin{array}{c} i_h\ne l_1,\, i_h\ne l_2\\ 1\le i_1<i_2<\cdots <i_{m-j-1}\le m \end{array}} (\lambda _{i_1}\lambda _{i_2}\cdots \lambda _{i_{m-j-1}})V_j\\&\qquad +(-1)^{m-k}(\lambda _{l_2}-\lambda _{l_1}) \sum _{\begin{array}{c} i_h\ne l_1,\, i_h\ne l_2\\ 1\le i_1<i_2<\cdots <i_{m-k-1}\le m \end{array}} (\lambda _{i_1}\lambda _{i_2}\cdots \lambda _{i_{m-k-1}})V_k\biggr |^2\\&\quad \succ \sum _{1\le l_1\ne l_2\le m}(\lambda _{l_1}^2+\lambda _{l_2}^2)\biggl |-V_{m-1} +\sum _{j=k+1}^{m-2}(-1)^{m-j}\sum _{\begin{array}{c} i_h\ne l_1,\, i_h\ne l_2\\ 1\le i_1<i_2<\cdots <i_{m-j-1}\le m \end{array}} \\&\quad (\lambda _{i_1}\lambda _{i_2}\cdots \lambda _{i_{m-j-1}})V_j +(-1)^{m-k}\sum _{\begin{array}{c} i_h\ne l_1,\, i_h\ne l_2\\ 1\le i_1<i_2<\cdots <i_{m-k-1}\le m \end{array}} (\lambda _{i_1}\lambda _{i_2}\cdots \lambda _{i_{m-k-1}})V_k\biggr |^2. \end{aligned}$$

Noting that

$$\begin{aligned} (-1)^{m-j}\sum _{\begin{array}{c} i_h\ne l_1,\, i_h\ne l_2\\ 1\le i_1<i_2<\cdots <i_{m-j-1}\le m \end{array}} (\lambda _{i_1}\lambda _{i_2}\cdots \lambda _{i_{m-j-1}})= -\sigma ^{(m-2)}_{m-1-j}(\pi _{l_2}(\pi _{l_1}\lambda )) \end{aligned}$$

for \(j=k,\ldots ,m-2\), we write the estimate above as

$$\begin{aligned}&\sum _{i=1}^m\biggl |\sum _{j=k+1}^m \sigma _{m-j}^{(m-1)}(\pi _i\lambda )V_j+\sigma _{m-k}^{(m-1)}(\pi _i\lambda )V_k\biggr |^2 \succ \sum _{1\le l_1\ne l_2\le m}(\lambda _{l_1}^2+\lambda _{l_2}^2)\\&\quad \biggl |-V_{m-1} -\sum _{j=k+1}^{m-2}\sigma ^{(m-2)}_{m-1-j}(\pi _{l_2}(\pi _{l_1}\lambda )) V_j-\sigma ^{(m-2)}_{m-1-k}(\pi _{l_2}(\pi _{l_1}\lambda ))V_k\biggr |^2, \end{aligned}$$

where the right hand-side can be written as

$$\begin{aligned}&\sum _{1\le l_1\ne l_2\le m}(\lambda _{l_1}^2+\lambda _{l_2}^2)\nonumber \\&\quad \biggl |V_{m-1}+\sum _{j=k+1}^{m-2}\sigma ^{(m-2)}_{m-1-j}(\pi _{l_2}(\pi _{l_1}\lambda )) V_j+\sigma ^{(m-2)}_{m-1-k}(\pi _{l_2}(\pi _{l_1}\lambda ))V_k\biggr |^2\nonumber \\&\quad =\sum _{l_1}\lambda _{l_1}^2\sum _{1\le l_1\ne l_2\le m} \biggl |V_{m-1}+\sum _{j=k+1}^{m-2}\sigma ^{(m-2)}_{m-1-j}(\pi _{l_2}(\pi _{l_1}\lambda )) V_j\nonumber \\&\qquad +\,\sigma ^{(m-2)}_{m-1-k}(\pi _{l_2}(\pi _{l_1}\lambda ))V_k\biggr |^2\nonumber \\&\qquad +\sum _{l_2}\lambda _{l_2}^2\sum _{1\le l_1\ne l_2\le m}\biggl |V_{m-1}+\sum _{j=k+1}^{m-2}\sigma ^{(m-2)}_{m-1-j}(\pi _{l_2}(\pi _{l_1}\lambda )) V_j\nonumber \\&\qquad +\sigma ^{(m-2)}_{m-1-k}(\pi _{l_2}(\pi _{l_1}\lambda ))V_k\biggr |^2. \end{aligned}$$
(32)

By now applying the inductive hypothesis to the last two summands in (32) we obtain

$$\begin{aligned}&\sum _{i=1}^m\biggl |\sum _{j=k+1}^m \sigma _{m-j}^{(m-1)}(\pi _i\lambda )V_j+\sigma _{m-k}^{(m-1)}(\pi _i\lambda )V_k\biggr |^2\\&\quad \succ \sum _{l_1}\lambda _{l_1}^2\sum _{1\le l_1\ne l_2\le m}| \sigma ^{(m-2)}_{m-1-k}(\pi _{l_2}(\pi _{l_1}\lambda ))|^2|V_k|^2\\&\qquad +\sum _{l_2}\lambda _{l_2}^2\sum _{1\le l_1\ne l_2\le m}| \sigma ^{(m-2)}_{m-1-k}(\pi _{l_1}(\pi _{l_2}\lambda ))|^2|V_k|^2\\&\quad \succ \sum _{i=1}^m |\sigma _{m-k}^{(m-1)}(\pi _i\lambda )|^2 |V_k|^2, \end{aligned}$$

which completes the proof. \(\square \)

Proof of Theorem 5

By definition of the matrices \(\mathcal W \) and \(B\) we have that \(|\mathcal W BV|^2\prec |\mathcal W V|^2\) is equivalent to

$$\begin{aligned} \biggl |\sum _{j=1}^m B_jV_j\biggr |^2\prec \sum _{i=1}^m\biggl |\sum _{j=1}^m\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_j\biggr |^2. \end{aligned}$$
(33)

Making use of the conditions (29) we have that the following estimate is valid on the region \(\Sigma _1^{\delta _1}\):

$$\begin{aligned}&\biggl |\sum _{j=1}^m B_jV_j\biggr |^2\prec \sum _{j=1}^m |B_j|^2|V_j|^2\prec \sum _{j=1}^m\sum _{i=1}^m |\sigma _{m-j}^{(m-1)}(\pi _i\lambda )|^2|V_j|^2\\&\quad \prec |V_m|^2+\sum _{j=2}^{m-1}\sum _{i=1}^m|\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2|V_j|^2 +\sum _{i=1}^m|\sigma ^{(m-1)}_{m-1}(\pi _i\lambda )|^2|V_1|^2\\&\quad \prec (1+\delta _1)\sum _{i=1}^m|\sigma ^{(m-1)}_{m-1}(\pi _i\lambda )|^2|V_1|^2\prec \sum _{i=1}^m|\sigma ^{(m-1)}_{m-1}(\pi _i\lambda )|^2|V_1|^2. \end{aligned}$$

Setting \(k=1\) in Lemma 4 we obtain the bound from below

$$\begin{aligned}&\sum _{i=1}^m\biggl |\sum _{j=1}^m\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_j\biggr |^2= \sum _{i=1}^m\biggl |\sum _{j=2}^m\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_j+\sigma ^{(m-1)}_{m-1}(\pi _i\lambda )V_1\biggr |^2\\&\quad \succ \sum _{i=1}^m|\sigma ^{(m-1)}_{m-1}(\pi _i\lambda )|^2|V_1|^2. \end{aligned}$$

This proves the inequality (33) on \(\Sigma _{1}^{\delta _1}\) for any \(\delta _1>0\).

Let us now assume that \(V\in \big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}}\cap \cdots \cap \big (\Sigma _{k-1}^{\delta _{k-1}}\big )^\mathrm{{c}}\cap \Sigma _k^{\delta _k}\) for \(2\le k\le m-1\). By definition of the regions \(\Sigma _h^{\delta _h}\) and taking \(\delta _h\ge 1\) for \(1\le h\le k-1\) we have that

$$\begin{aligned}&\sum _{i=1}^m |\sigma ^{(m-1)}_{m-(k-1)}(\pi _i\lambda )|^2|V_{k-1}|^2 \\&\quad <\frac{1}{\delta _{k-1}}\biggl (|V_m|^2+\sum _{j=k+1}^{m-1}\sum _{i=1}^m|\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2|V_j|^2\\&\qquad +\sum _{i=1}^m|\sigma ^{(m-1)}_{m-k}(\pi _i\lambda )|^2|V_k|^2\biggl ) \\&\quad \le \frac{1}{\delta _{k-1}}(1+\delta _k)\sum _{i=1}^m|\sigma ^{(m-1)}_{m-k}(\pi _i\lambda )|^2|V_k|^2, \nonumber \\&\quad \sum _{i=1}^m |\sigma ^{(m-1)}_{m-(k-2)}(\pi _i\lambda )|^2|V_{k-2}|^2\\&\quad <\frac{1}{\delta _{k-2}}\biggl (|V_m|^2+\sum _{j=k+1}^{m-1}\sum _{i=1}^m|\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2|V_j|^2\\&\qquad +\sum _{i=1}^m|\sigma ^{(m-1)}_{m-k}(\pi _i\lambda )|^2|V_k|^2+ \sum _{i=1}^m|\sigma ^{(m-1)}_{m-(k-1)}(\pi _i\lambda )|^2|V_{k-1}|^2 \biggl )\\&\quad \le \frac{1}{\delta _{k-2}}\left(1+\delta _k+\frac{1}{\delta _{k-1}}(1+\delta _k)\right) \sum _{i=1}^m|\sigma ^{(m-1)}_{m-k}(\pi _i\lambda )|^2|V_k|^2,\\&\quad \le (1+\delta _k)\left(\frac{1}{\delta _{k-1}}+\frac{1}{\delta _{k-2}}\right)\sum _{i=1}^m|\sigma ^{(m-1)}_{m-k}(\pi _i\lambda )|^2|V_k|^2. \end{aligned}$$

By iteration one can easily prove the following bound

$$\begin{aligned} \sum _{i=1}^m |\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2|V_{j}|^2\le (1+\delta _k)\sum _{h=1}^{k-1}\frac{1}{\delta _h}\sum _{i=1}^m|\sigma ^{(m-1)}_{m-k}(\pi _i\lambda )|^2|V_k|^2, \end{aligned}$$
(34)

valid on the region \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}}\cap \cdots \cap \big (\Sigma _{k-1}^{\delta _{k-1}}\big )^\mathrm{{c}}\cap \Sigma _k^{\delta _k}\) for all \(j\) with \(1\le j\le k-1\).

It follows that

$$\begin{aligned}&\biggl |\sum _{j=1}^m B_jV_j\biggr |^2\prec \sum _{j=1}^m |B_j|^2|V_j|^2\prec \sum _{j=1}^m\sum _{i=1}^m |\sigma _{m-j}^{(m-1)}(\pi _i\lambda )|^2|V_j|^2\\&\quad \prec \sum _{j=k+1}^m\sum _{i=1}^m |\sigma _{m-j}^{(m-1)}(\pi _i\lambda )|^2|V_j|^2+\sum _{i=1}^m |\sigma _{m-k}^{(m-1)}(\pi _i\lambda )|^2|V_k|^2\\&\qquad +\sum _{j=1}^{k-1}\sum _{i=1}^m |\sigma _{m-j}^{(m-1)}(\pi _i\lambda )|^2|V_j|^2 \prec \sum _{i=1}^m |\sigma _{m-k}^{(m-1)}(\pi _i\lambda )|^2|V_k|^2. \end{aligned}$$

We now estimate the right-hand side of (33), making use of Lemma 4 and of the bound (34). We obtain

$$\begin{aligned}&\sum _{i=1}^m\biggl |\sum _{j=1}^m\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_j\biggr |^2\\&\quad \succ \sum _{i=1}^m\gamma _1\biggl |\sum _{j=k+1}^{m}\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_j +\sigma ^{(m-1)}_{m-k}(\pi _i\lambda )V_k\biggr |^2\\&\qquad -\gamma _2\sum _{i=1}^m\sum _{j=1}^{k-1}|\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2|V_j|^2\\&\quad \succ \gamma _1\sum _{i=1}^m|\sigma _{m-k}^{(m-1)}(\pi _i\lambda )|^2|V_k|^2-\gamma _2 (1+\delta _k)\sum _{h=1}^{k-1}\frac{1}{\delta _h}\sum _{i=1}^m|\sigma ^{(m-1)}_{m-k}(\pi _i\lambda )|^2|V_k|^2\\&\quad =\biggl (\gamma _1-\gamma _2(1+\delta _k)\sum _{h=1}^{k-1}\frac{1}{\delta _h}\biggr ) \sum _{i=1}^m|\sigma ^{(m-1)}_{m-k}(\pi _i\lambda )|^2|V_k|^2. \end{aligned}$$

Therefore, the estimate (33) holds in the region \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}}\cap \cdots \cap \big (\Sigma _{k-1}^{\delta _{k-1}}\big )^\mathrm{{c}}\cap \Sigma _k^{\delta _k}\) for any \(\delta _k>0\) choosing \(\delta _1, \delta _2,\ldots ,\delta _{k-1}\) big enough.
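The phrase "big enough" here can be quantified. The following is one possible explicit choice (a sketch, with \(\gamma _1,\gamma _2\) the constants appearing in the last estimate):

$$\begin{aligned} \delta _h\ge \max \biggl \{1,\frac{2(k-1)\gamma _2(1+\delta _k)}{\gamma _1}\biggr \},\quad h=1,\ldots ,k-1, \quad \Longrightarrow \quad \gamma _1-\gamma _2(1+\delta _k)\sum _{h=1}^{k-1}\frac{1}{\delta _h}\ge \frac{\gamma _1}{2}>0, \end{aligned}$$

since each summand then satisfies \(1/\delta _h\le \gamma _1/(2(k-1)\gamma _2(1+\delta _k))\).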

We conclude the proof by assuming \(V\in \big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}}\cap \cdots \cap \big (\Sigma _{m-1}^{\delta _{m-1}}\big )^\mathrm{{c}}\). Since

$$\begin{aligned} |V_m|^2+\sum _{j=h+1}^{m-1}\sum _{i=1}^m|\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2|V_j|^2> \delta _h\sum _{i=1}^m|\sigma ^{(m-1)}_{m-h}(\pi _i\lambda )|^2|V_h|^2, \end{aligned}$$

for \(1\le h\le m-1\), arguing as above and taking \(\delta _h\ge 1\) we obtain the estimate

$$\begin{aligned} \sum _{i=1}^m |\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2|V_{j}|^2\le \sum _{h=1}^{m-1}\frac{1}{\delta _h}|V_m|^2 \end{aligned}$$
(35)

valid on the region \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}}\cap \cdots \cap \big (\Sigma _{m-1}^{\delta _{m-1}}\big )^\mathrm{{c}}\) for all \(j\) with \(1\le j\le m-1\). Hence,

$$\begin{aligned} \biggl |\sum _{j=1}^m B_jV_j\biggr |^2\prec \sum _{j=1}^m |B_j|^2|V_j|^2\prec \sum _{j=1}^m\sum _{i=1}^m |\sigma _{m-j}^{(m-1)}(\pi _i\lambda )|^2|V_j|^2\prec |V_m|^2 \end{aligned}$$

and

$$\begin{aligned}&\sum _{i=1}^m\biggl |\sum _{j=1}^m\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_j\biggr |^2\succ \gamma _1|V_m|^2-\gamma _2\sum _{i=1}^m\biggl |\sum _{j=1}^{m-1} \sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_j\biggr |^2\\&\quad \succ \gamma _1|V_m|^2-\gamma _2\sum _{j=1}^{m-1}\sum _{i=1}^m|\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2|V_j|^2\\&\quad \succ \biggl (\gamma _1-\gamma _2(m-1)\sum _{h=1}^{m-1}\frac{1}{\delta _h}\biggr )|V_m|^2. \end{aligned}$$

This means that the inequality (33) holds on \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}}\cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}}\cap \cdots \cap \big (\Sigma _{m-1}^{\delta _{m-1}}\big )^\mathrm{{c}}\) for sufficiently large values of \(\delta _1,\delta _2,\ldots ,\delta _{m-1}\). \(\square \)

6 Well-posedness results

We are now ready to prove the well-posedness results given in Theorem 2. For the reader's convenience and the sake of clarity we reformulate Theorem 2 as the following Theorem 6, where we make use of the language and notations introduced in Theorem 5.

Theorem 6

Assume \(A_j\in {C}([0,T])\) for all \(j\). If the coefficients satisfy \(A_{(j)}\in C^\infty ([0,T])\), the characteristic roots are real and satisfy (6) and the entries of the matrix \(B\) of the lower order terms in (13) fulfil the conditions (29) for \(\xi \) away from \(0\), then the Cauchy problem (1) is well-posed in any Gevrey space. More precisely,

  1. (i)

    if \(A_{(j)}\in {C}^k([0,T])\) for some \(k\ge 2\) and \(g_j\in G^s(\mathbb R ^n)\) for \(j=1,\ldots ,m,\) then there exists a unique solution \(u\in C^m([0,T];G^s(\mathbb R ^n))\) for

    $$\begin{aligned} 1\le s<1+\frac{k}{2(m-1)}; \end{aligned}$$
  2. (ii)

    if \(A_{(j)}\in {C}^k([0,T])\) for some \(k\ge 2\) and \(g_j\in \mathcal E ^{\prime }_{(s)}(\mathbb R ^n)\) for \(j=1,\ldots ,m,\) then there exists a unique solution \(u\in C^m([0,T];\mathcal D ^{\prime }_{(s)}(\mathbb R ^n))\) for

    $$\begin{aligned} 1\le s\le 1+\frac{k}{2(m-1)}. \end{aligned}$$

Proof

As usual, the well-posedness in the case of \(s=1\) follows from the result of Bony and Schapira, so we may assume \(s>1\). By the finite propagation speed for hyperbolic equations it is not restrictive to take compactly supported initial data and, therefore, to have that the solution \(u\) is compactly supported in \(x\).

Combining the energy estimate (15) with the estimates of the first, second and third term in Sect. 4 we obtain the estimate

$$\begin{aligned} \partial _t E_\varepsilon (t,\xi )\le (K_\varepsilon (t,\xi )+C_2\varepsilon \langle \xi \rangle +C_3)E_\varepsilon (t,\xi ), \end{aligned}$$
(36)

where \(K_\varepsilon (t,\xi )\) defined in (19) in Sect. 4.1 has the property (20), i.e.

$$\begin{aligned} \int _0^T K_\varepsilon (t,\xi )\, dt\le C_1\varepsilon ^{-2(m-1)/k}, \end{aligned}$$

and \(C_1, C_2, C_3\) are positive constants. Thus, (36) holds for \(t\in [0,T]\) and \(|\xi |\ge R\), with the estimate for the third term provided by Theorem 5. A straightforward application of Gronwall’s lemma leads to

$$\begin{aligned} E_\varepsilon (t,\xi )&\le E_\varepsilon (0,\xi )\mathrm e ^{C_1\varepsilon ^{-2(m-1)/k}+C_2T\varepsilon \langle \xi \rangle +C_3T}\\&\le E_\varepsilon (0,\xi )C_T\mathrm e ^{C_T(\varepsilon ^{-2(m-1)/k}+\varepsilon \langle \xi \rangle )}. \end{aligned}$$
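The Gronwall step here is the scalar inequality \(\partial _t E\le K(t)E\Rightarrow E(t)\le E(0)\exp \int _0^t K\). A minimal numerical illustration, with model data chosen by us (the function name and the choice of \(K\) are ours, not from the text):

```python
import math

# Scalar model of the Gronwall step: if E'(t) <= K(t) E(t), then
# E(t) <= E(0) * exp(integral of K over [0, t]). We integrate
# E'(t) = K(t) E(t) with K(t) = 1 + cos(t) by forward Euler.
def euler_energy(K, E0, T, n=100_000):
    h = T / n
    E = E0
    for i in range(n):
        E += h * K(i * h) * E
    return E

K = lambda t: 1.0 + math.cos(t)
T, E0 = 1.0, 1.0
bound = E0 * math.exp(T + math.sin(T))  # exp of the exact integral of K
# the computed energy must respect the Gronwall bound (up to a small
# discretization tolerance)
assert euler_energy(K, E0, T) <= bound * (1 + 1e-3)
```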

Setting \(\varepsilon ^{-2(m-1)/k}=\varepsilon \langle \xi \rangle \) we get

$$\begin{aligned} E_\varepsilon (t,\xi )\le E_\varepsilon (0,\xi )C_T\mathrm e ^{C_T\langle \xi \rangle ^{\frac{1}{\sigma }}}, \end{aligned}$$

where \(\sigma =1+k/[2(m-1)]\). Finally, making use of the inequality (17) we arrive at

$$\begin{aligned} C_m^{-1}\varepsilon ^{2(m-1)}|V(t,\xi )|^2&\le E_\varepsilon (t,\xi )\le E_\varepsilon (0,\xi )C_T\mathrm e ^{C_T\langle \xi \rangle ^{\frac{1}{\sigma }}}\\&\le C_m|V(0,\xi )|^2C_T\mathrm e ^{C_T\langle \xi \rangle ^{\frac{1}{\sigma }}}, \end{aligned}$$

which implies

$$\begin{aligned} |V(t,\xi )|\le C\langle \xi \rangle ^{\frac{k}{2\sigma }}\mathrm e ^{C\langle \xi \rangle ^{\frac{1}{\sigma }}}|V(0,\xi )|, \end{aligned}$$
(37)

for some new constant \(C>0\), for \(t\in [0,T]\) and \(|\xi |\ge R\).
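The substitution \(\varepsilon ^{-2(m-1)/k}=\varepsilon \langle \xi \rangle \) and the resulting exponent \(1/\sigma \) amount to elementary exponent arithmetic. As a sanity check (illustrative only; the function name is ours), one can verify \(1/\sigma =2(m-1)/(k+2(m-1))\) with exact rational arithmetic:

```python
from fractions import Fraction

def gevrey_exponents(m, k):
    """Solve eps**(-2(m-1)/k) = eps*<xi> with eps = <xi>**(-p).

    Matching exponents of <xi> gives (2(m-1)/k) * p = 1 - p, i.e.
    p = k / (k + 2(m-1)); the common value is then <xi>**(1 - p).
    """
    p = Fraction(k, k + 2 * (m - 1))
    return p, 1 - p

# the exponent 1 - p must equal 1/sigma with sigma = 1 + k/(2(m-1))
for m, k in [(2, 2), (3, 4), (5, 7), (10, 3)]:
    p, growth = gevrey_exponents(m, k)
    sigma = 1 + Fraction(k, 2 * (m - 1))
    assert growth == 1 / sigma
```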

  1. (i)

    Recall that \(V(t,\xi )=\mathcal F _{x\rightarrow \xi }U(t,x)\), where \(U\) is the column vector of the \(u_j\)’s. If the initial data \(g_j\) belong to \(G^s_0(\mathbb R ^n)\), then from the Fourier transform characterisation of Gevrey functions ([11, Proposition 2.2]) we have that \(|V(0,\xi )|\le c\,\mathrm e ^{-\delta \langle \xi \rangle ^{\frac{1}{s}}}\) for some constants \(c>0\) and \(\delta >0\). Hence,

    $$\begin{aligned} |V(t,\xi )|\le C\langle \xi \rangle ^{\frac{k}{2\sigma }}\mathrm e ^{C\langle \xi \rangle ^{\frac{1}{\sigma }}}c\,\mathrm e ^{-\delta \langle \xi \rangle ^{\frac{1}{s}}} \end{aligned}$$

    for all \(t\in [0,T]\) and \(\xi \in \mathbb R ^n\). Let \(s<\sigma \). Then \(V(t,\xi )\) defines a tempered distribution in \(\mathcal S ^{\prime }(\mathbb R ^n)\) such that

    $$\begin{aligned} |V(t,\xi )|&\le Cc\langle \xi \rangle ^{\frac{k}{2\sigma }}\mathrm e ^{C\langle \xi \rangle ^{\frac{1}{\sigma }}} \mathrm e ^{-\frac{\delta }{2}\langle \xi \rangle ^{\frac{1}{s}}}\mathrm e ^{-\frac{\delta }{2}\langle \xi \rangle ^{\frac{1}{s}}}\\&\le Cc\langle \xi \rangle ^{\frac{k}{2\sigma }}\mathrm e ^{\langle \xi \rangle ^{\frac{1}{\sigma }} (C-\frac{\delta }{2}\langle \xi \rangle ^{\frac{1}{s}-\frac{1}{\sigma }})} \mathrm e ^{-\frac{\delta }{2}\langle \xi \rangle ^{\frac{1}{s}}}. \end{aligned}$$

    It follows that

    $$\begin{aligned} |V(t,\xi )|\le c^{\prime }\mathrm e ^{-\frac{\delta }{2}\langle \xi \rangle ^{\frac{1}{s}}}, \end{aligned}$$
    (38)

    for some \(c^{\prime },\delta >0\) and for \(|\xi |\) large enough. This is sufficient to prove that \(U(t,x)\) belongs to the Gevrey class \(G^s(\mathbb R ^n)\) for all \(t\in [0,T]\) and that the Cauchy problem (1) has a unique solution \(u\in C^m([0,T];G^s(\mathbb R ^n))\) for \(s<\sigma \) under the assumptions of case (i).

  2. (ii)

    If the initial data \(g_j\) are Gevrey-Beurling ultradistributions in \(\mathcal E ^{\prime }_{(s)}(\mathbb R ^n)\), then from the Fourier transform characterisation of ultradistributions ([11, Proposition 2.13]) we have that there exist \(\delta >0\) and \(c>0\) such that \(|V(0,\xi )|\le c\,\mathrm e ^{\delta \langle \xi \rangle ^{\frac{1}{s}}}\) for all \(\xi \in \mathbb R ^n\). Hence, taking \(s\le \sigma \), we obtain the estimate

    $$\begin{aligned} |V(t,\xi )|\le Cc\langle \xi \rangle ^{\frac{k}{2\sigma }}\mathrm e ^{C\langle \xi \rangle ^{\frac{1}{\sigma }}} \mathrm e ^{+\delta \langle \xi \rangle ^{\frac{1}{s}}}\le c^{\prime }\mathrm e ^{+\delta ^{\prime }\langle \xi \rangle ^{\frac{1}{s}}} \end{aligned}$$

    for some \(c^{\prime },\delta ^{\prime }>0\). This proves that the Cauchy problem (1) has a unique solution \(u\in C^m([0,T];\mathcal D ^{\prime }_{(s)}(\mathbb R ^n))\) for \(s\le \sigma \) under the assumptions of case (ii).

\(\square \)

We now turn to the case of analytic coefficients. We prove \(C^\infty \) and distributional well-posedness of the Cauchy problem (1), providing an extension of Theorem 1 in [13] to any space dimension. Our proof makes use of the following lemma on analytic functions, a parameter-dependent version of the statement (61)–(62) in [13].

Lemma 5

Let \(f(t,\xi )\) be an analytic function in \(t\in [0,T]\), continuous and homogeneous of order \(0\) in \(\xi \in \mathbb R ^n\). Then,

  1. (i)

    for all \(\xi \) there exists a partition \((\tau _{h(\xi )})\) of the interval \([0,T]\) such that

    $$\begin{aligned} 0=\tau _0<\tau _1<\cdots <\tau _{h(\xi )}<\cdots <\tau _{N(\xi )}=T \end{aligned}$$

    with \(\sup _{\xi \ne 0}N(\xi )<\infty \) and \(f(t,\xi )\ne 0\) in each open interval \((\tau _{h(\xi )},\tau _{h+1(\xi )})\);

  2. (ii)

    there exists \(C>0\) such that

    $$\begin{aligned} |\partial _t f(t,\xi )|\le C\biggl (\frac{1}{t-\tau _{h(\xi )}}+\frac{1}{\tau _{{h+1}(\xi )}-t}\biggr )|f(t,\xi )| \end{aligned}$$

    for all \(t\in (\tau _{h(\xi )},\tau _{{h+1}(\xi )}), \xi \in \mathbb R ^n\) with \(\xi \ne 0\) and \(0\le h(\xi )\le N(\xi )\).

Proof

Since the function \(f\) is homogeneous of order \(0\) in \(\xi \) we can assume \(|\xi |=1\). Excluding the trivial case \(f\equiv 0\) we have that \(f(t,\xi )\) has a finite number of zeros in \([0,T]\) and hence we can find a partition \((\tau _{h(\xi )})\) as in (i) such that \(f(t,\xi )\ne 0\) in each interval \((\tau _{h(\xi )},\tau _{h+1(\xi )})\), taking \(\tau _{h(\xi )}\), \(1\le h(\xi )\le N(\xi )-1\), to be the zeros of \(f(\cdot ,\xi )\).

Note that the function \(N(\xi )\) is locally bounded and, therefore, by homogeneity \(\sup _{\xi \ne 0}N(\xi )=\sup _{|\xi |=1}N(\xi )<\infty \). Indeed, if \(\sup _{|\xi |=1}N(\xi )=+\infty \) we can find a sequence of points \((\xi _l)_l\) with \(|\xi _l|=1\) and some \(\xi ^{\prime }\) with \(|\xi ^{\prime }|=1\) such that \(\xi _l\rightarrow \xi ^{\prime }\) and \(N(\xi _l)\rightarrow +\infty \) as \(l\rightarrow \infty \). It follows that \(f(t,\xi ^{\prime })\) must have infinitely many zeros in \(t\), in contradiction with the hypothesis of analyticity on \([0,T]\).

We now work on the interval \((0,\tau _1)\). By the analyticity in \(t\) we can write

$$\begin{aligned} f(t,\xi )=t^{\nu _0(\xi )}(\tau _1-t)^{\nu _1(\xi )}g(t,\xi ) \end{aligned}$$

where \(g(t,\xi )\) is an analytic function in \(t\), never vanishing on \([0,\tau _1]\), homogeneous of degree \(0\) in \(\xi \). Note that the functions \(\nu _0\) and \(\nu _1\) are positive and have local maxima at all points (perturbations in \(\xi \) in a sufficiently small neighborhood cannot increase the multiplicity). Arguing as in [13, p. 566] we write \(t|\partial _t f(t,\xi )|\) as

$$\begin{aligned} \biggl |f(t,\xi )\biggl (\nu _0(\xi )-\frac{\nu _1(\xi )t}{(\tau _1-t)}+\frac{t\partial _t g(t,\xi )}{g(t,\xi )}\biggr )\biggr |. \end{aligned}$$

Let us fix \(\xi _0\) with \(|\xi _0|=1\). Taking \(t\) in \([0,\tau _1/2]\) and \(\xi \) in a sufficiently small neighborhood of \(\xi _0\) we have that \(\nu _0(\xi )\le c_1, \nu _1(\xi )t/(\tau _1-t) \le c_2, |g(t,\xi )|\ge c_0>0\) and \(|{t\partial _t g(t,\xi )}/{g(t,\xi )}|\le c_3\). Hence,

$$\begin{aligned} t|\partial _t f(t,\xi )|\le C |f(t,\xi )| \end{aligned}$$

on \([0,\tau _1/2]\) for \(\xi \) in a neighborhood of \(\xi _0\). Similarly, one proves that

$$\begin{aligned} (\tau _1-t)|\partial _t f(t,\xi )|\le C |f(t,\xi )| \end{aligned}$$

on \([\tau _1/2,\tau _1]\) for \(\xi \) in a neighborhood of \(\xi _0\). The homogeneity in \(\xi \) combined with a standard compactness argument allows us to extend the inequality

$$\begin{aligned} |\partial _t f(t,\xi )|\le C\biggl (\frac{1}{t}+\frac{1}{\tau _{1}-t}\biggr )|f(t,\xi )| \end{aligned}$$

to \(\mathbb R ^n\!\setminus \!\{0\}\) for \(t\in (0,\tau _1)\). Analogously one obtains that

$$\begin{aligned} |\partial _t f(t,\xi )|\le C\biggl (\frac{1}{t-\tau _{h(\xi )}}+\frac{1}{\tau _{{h+1}(\xi )}-t}\biggr )|f(t,\xi )| \end{aligned}$$

when \(t\in (\tau _{h(\xi )},\tau _{{h+1}(\xi )})\) and \(\xi \ne 0\). \(\square \)
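The inequality of Lemma 5(ii) can be illustrated numerically on a model function of the form used in the proof; here we take \(f(t)=t^2(1-t)\), i.e. \(\nu _0=2\), \(\nu _1=1\), \(g\equiv 1\), \(\tau _1=1\) (an ad hoc choice for illustration, not taken from the text):

```python
# Model case of Lemma 5(ii): f(t) = t**nu_0 * (tau_1 - t)**nu_1 * g(t)
# with nu_0 = 2, nu_1 = 1, g == 1 and tau_1 = 1.
def f(t):
    return t ** 2 * (1 - t)

def df(t):
    # f'(t) = 2t(1-t) - t^2
    return 2 * t * (1 - t) - t ** 2

# Here |f'/f| = |2 - 3t| / (t(1-t)) <= 2 * (1/t + 1/(1-t)) on (0, 1),
# so the constant C = 2 works for this model function.
C = 2.0
for i in range(1, 1000):
    t = i / 1000
    assert abs(df(t)) <= C * (1 / t + 1 / (1 - t)) * abs(f(t)) + 1e-12
```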

In the case of analytic coefficients, Theorem 3 follows from the following Theorem 7.

Theorem 7

If \(A_j\in {C}([0,T])\) for all \(j\), and the coefficients \(A_{(j)}\) are analytic on \([0,T]\), the characteristic roots are real and satisfy (6), and the entries of the matrix \(B\) of the lower order terms in (13) fulfil the conditions (29) for \(\xi \) away from \(0\), then the Cauchy problem (1) is \(C^\infty \) and distributionally well-posed.

Proof

By the finite propagation speed for hyperbolic equations it is not restrictive to assume that the initial data \(g_j\) are compactly supported. If the coefficients \(a_j\) are analytic in \(t\) on \([0,T]\) then by construction the entries of the quasi-symmetriser \(Q_\varepsilon ^{(m)}\) are analytic as well. In particular, by Proposition 1

$$\begin{aligned} q_{\varepsilon ,ij}(t,\xi )=q_{0,ij}(t,\xi )+\varepsilon ^2 q_{1,ij}(t,\xi )+\cdots +\varepsilon ^{2(m-1)}q_{m-1,ij}(t,\xi ). \end{aligned}$$

We use the partition of the interval \([0,T]\) given by Lemma 5 (applied to any \(q_{\varepsilon ,ij}(t,\xi )\), or more precisely to any \(\widetilde{q}_{\varepsilon ,ij}(t,\xi )=q_{\varepsilon ,ij}(\lambda (t,\xi )/|\xi |)\), a homogeneous function of order \(0\) in \(\xi \) having the same zeros in \(t\) as \(q_{\varepsilon ,ij}(t,\xi )\)). Note that this partition can be chosen independent of \(\varepsilon \). Considering the first interval \([0,\tau _1]\) (\(\tau _1=\tau _1(\xi )\)) we define

$$\begin{aligned} E_\varepsilon (t,\xi )={\left\{ \begin{array}{ll} |V(t,\xi )|^2&\text{for } t\in [0,\varepsilon ]\cup [\tau _1-\varepsilon ,\tau _1],\\ (Q_\varepsilon (t,\xi )V(t,\xi ),V(t,\xi ))&\text{for } t\in [\varepsilon ,\tau _1-\varepsilon ], \end{array}\right.} \end{aligned}$$

as in [13, p. 567]. Hence

$$\begin{aligned} \partial _t E_\varepsilon (t,\xi )&\le |\partial _t E_\varepsilon (t,\xi )|\le |((A_1-A_1^*)V,V)|+|((B-B^*)V,V)|\\&\le \left(2\sup _{t\in [0,T]}\Vert A_1(t,\xi )\Vert +2\sup _{t\in [0,T]}\Vert B(t,\xi )\Vert \right)E_\varepsilon (t,\xi ) \end{aligned}$$

on \([0,\varepsilon ]\cup [\tau _1-\varepsilon ,\tau _1]\). It follows by the Gronwall inequality that there exists a constant \(\alpha >0\) such that

$$\begin{aligned} E_\varepsilon (t,\xi )\le {\left\{ \begin{array}{ll} \mathrm e ^{2\alpha \varepsilon \langle \xi \rangle }E_\varepsilon (0,\xi )&\text{for } t\in [0,\varepsilon ],\\ \mathrm e ^{2\alpha \varepsilon \langle \xi \rangle }E_\varepsilon (\tau _1-\varepsilon ,\xi )&\text{for } t\in [\tau _1-\varepsilon ,\tau _1]. \end{array}\right.} \end{aligned}$$
(39)

On the interval \([\varepsilon ,\tau _1-\varepsilon ]\) we proceed as in the proof for the Gevrey well-posedness under the conditions (29) on the lower order terms for \(|\xi |\ge R\). We have

$$\begin{aligned} \partial _t E_\varepsilon (t,\xi )\le \biggl (\frac{|(\partial _tQ_\varepsilon V,V)|}{(Q_\varepsilon (t,\xi ) V(t,\xi ), V(t,\xi ))}+C_2\varepsilon \langle \xi \rangle +C_3\biggr )E_\varepsilon (t,\xi ). \end{aligned}$$
(40)

Since the family \(Q_\varepsilon (\lambda )\) is nearly diagonal when the roots \(\lambda _l\) satisfy the condition (6) we have that \(Q_\varepsilon \ge c_0\text{ diag}\,Q_\varepsilon \), i.e.,

$$\begin{aligned} (Q_\varepsilon (t,\xi )V,V)\ge c_0\sum _{h=1}^m q_{\varepsilon ,hh}(t,\xi )|V_h|^2. \end{aligned}$$

This fact combined with the inequality

$$\begin{aligned} |q_{\varepsilon ,ij}||V_i||V_j|&= |(Q_\varepsilon e_i,e_j)||V_i||V_j|\le \sqrt{(Q_\varepsilon e_i,e_i)(Q_\varepsilon e_j,e_j)}|V_i||V_j|\\&\le \sqrt{q_{\varepsilon ,ii}q_{\varepsilon ,jj}}|V_i||V_j|\le \sum _{h=1}^m q_{\varepsilon ,hh}|V_h|^2 \end{aligned}$$

and Lemma 5 yields

$$\begin{aligned}&\int _{\varepsilon }^{\tau _1-\varepsilon }\frac{|(\partial _tQ_\varepsilon V,V)|}{(Q_\varepsilon (t,\xi ) V(t,\xi ), V(t,\xi ))}\, dt\le c_0^{-1}\int _{\varepsilon }^{\tau _1-\varepsilon }\sum _{i,j=1}^m \frac{|\partial _t q_{\varepsilon ,ij}(t,\xi )|}{|q_{\varepsilon ,ij}(t,\xi )|}\, dt\\&\quad \le C_1\int _\varepsilon ^{\tau _1-\varepsilon }\biggl (\frac{1}{t}+\frac{1}{\tau _1-t}\biggr )\, dt =2C_1\log \frac{\tau _1-\varepsilon }{\varepsilon }\le 2C_1\log \frac{T}{\varepsilon }, \end{aligned}$$

for some constant \(C_1\) independent of \(t\) and \(\xi \ne 0\). Going back to estimate (40) by Gronwall’s lemma we obtain
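The logarithmic bound is just the closed form of the last integral. As a quick numerical sanity check (with illustrative values of \(\varepsilon \) and \(\tau _1\) chosen by us):

```python
import math

def blowup_integral(eps, tau1, n=200_000):
    """Midpoint rule for the integral of 1/t + 1/(tau1 - t) over [eps, tau1 - eps]."""
    a, b = eps, tau1 - eps
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += 1.0 / t + 1.0 / (tau1 - t)
    return total * h

eps, tau1 = 1e-3, 0.7
exact = 2 * math.log((tau1 - eps) / eps)  # the closed form used in the proof
assert abs(blowup_integral(eps, tau1) - exact) < 1e-4
```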

$$\begin{aligned} E_\varepsilon (t,\xi )\le C_TE_\varepsilon (\varepsilon ,\xi )\mathrm e ^{C_T\log (1/\varepsilon )+C_T\varepsilon \langle \xi \rangle }, \end{aligned}$$
(41)

for \([\varepsilon ,\tau _1-\varepsilon ]\) and \(|\xi |\ge R\). Finally, putting together (39) with (41) we conclude that there exists a constant \(c>0\) such that

$$\begin{aligned} E_\varepsilon (t,\xi )\le cE_\varepsilon (0,\xi )\mathrm e ^{c(\log (1/\varepsilon )+\varepsilon \langle \xi \rangle )} \end{aligned}$$

for all \(t\in [0,\tau _1]\) and \(|\xi |\ge R\). Hence by applying (17) we have

$$\begin{aligned} |V(t,\xi )|\le c\varepsilon ^{-(m-1)}\mathrm e ^{C_T(\log (1/\varepsilon )+\varepsilon \langle \xi \rangle )}|V(0,\xi )| \end{aligned}$$

on \([0,\tau _1]\). An iteration of the same technique on the other subintervals of \([0,T]\) leads to

$$\begin{aligned} |V(t,\xi )|\le c\varepsilon ^{-N(\xi )(m-1)}\mathrm e ^{N(\xi )C_T(\log (1/\varepsilon )+\varepsilon \langle \xi \rangle )}|V(0,\xi )| \end{aligned}$$

on \([0,T]\) for \(|\xi |\ge R\). Now, setting \(\varepsilon =\langle \xi \rangle ^{-1}\) we get

$$\begin{aligned} |V(t,\xi )|\le c\langle \xi \rangle ^{N(\xi )(m-1)}\mathrm e ^{N(\xi )C_T}\langle \xi \rangle ^{N(\xi )C_T}|V(0,\xi )|. \end{aligned}$$

Recalling from Lemma 5 that the function \(N(\xi )\) is bounded in \(\xi \), we conclude that there exist some \(\kappa \in \mathbb N \) and \(C>0\) such that

$$\begin{aligned} |V(t,\xi )|\le C\langle \xi \rangle ^\kappa |V(0,\xi )| \end{aligned}$$
(42)

on \([0,T]\) for all \(|\xi |\ge R\). It is clear that the estimate (42) implies \(C^\infty \) and distributional well-posedness of the Cauchy problem (1). \(\square \)

Finally, given the energy estimates established above, the proof of Theorem 4 is simple:

Proof of Theorem 4

We observe that the estimates (38) and (42) imply that \(V(t,\xi )\) is bounded in \(\xi \) if the lower order terms \(A(\cdot ,\xi )\) are bounded on \([0,T]\). Coming back to the solution \(u\) of (1) and the definition of \(V\) we get that the solution \(u(t,x)\) is in the class \({C}^{m-1}([0,T])\) with respect to \(t\). Finally, from the equality \(D^m_t u=-\sum _{j=0}^{m-1} A_{m-j}(t,D_x)D_t^j u\) we see that the right-hand side is bounded in \(t\), implying that \(u(t,x)\) is in \(W^{m,\infty }([0,T])\) with respect to \(t\). \(\square \)

We conclude the paper with the following remarks on how the results change if we assume less than the Levi conditions (7). We thank T. Kinoshita for drawing our attention to this question.

We begin by noting that the matrix \(B\) of the lower order terms in (13) can be written as

$$\begin{aligned} B(t,\xi )=\sum _{l=0}^{m-1}B_{-l}(t,\xi ), \end{aligned}$$

with

$$\begin{aligned} B_{-l}=\left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 0&0&0&\ldots&0\\ 0&0&0&\ldots&0 \\ \ldots&\ldots&\ldots&\ldots&0 \\ B_{-l,1}&B_{-l,2}&\ldots&\ldots&B_{-l,m}\\ \end{array} \right) \end{aligned}$$

and

$$\begin{aligned} B_{-l,j}(t,\xi )={\left\{ \begin{array}{ll} -\sum _{|\gamma |=m-j-l} a_{m-j+1,\gamma }(t)\xi ^\gamma \langle \xi \rangle ^{j-m},&\text{ for} j\le m-l, \\ 0,&\text{ otherwise}, \end{array}\right.} \end{aligned}$$

for \(j=1,\ldots ,m\). We easily see that the matrix \(B_{-l}\) has entries of order \(-l\) and the last \(l\) entries in the bottom row are equal to \(0\). Making use of this decomposition of \(B\) we can write \(((Q_0^{(m)}B-B^*Q_0^{(m)})V,V)\) as

$$\begin{aligned} \sum _{l=0}^{m-1}((Q_0^{(m)}B_{-l}-B_{-l}^*Q_0^{(m)})V,V). \end{aligned}$$
(43)

Remark 2

Let \(0\le h\le m-2\). Let us assume the Levi conditions (7) in the form (29) only on the \(B_{-l}\)-matrices up to level \(h\), i.e., instead of (7) assume only that

$$\begin{aligned} \biggl |\sum _{l=0}^{\min (h,m-j)} B_{-l,j}\biggr |^2\prec \sum _{i=1}^m |\sigma _{m-j}^{(m-1)}(\pi _i\lambda )|^2, \end{aligned}$$
(44)

for \(j=1,\ldots ,m\). In other words, we impose Levi conditions on the coefficients of the equation corresponding to the matrices \(B_{-l}\) up to \(l=h\), leaving free the remaining lower order coefficients. Recall that

$$\begin{aligned} |((Q_0^{(m)} B_{-l}-B_{-l}^*Q_0^{(m)})V,V)|\le 2(m-1)!|\mathcal W B_{-l}V||\mathcal W V|, \end{aligned}$$

where the matrix \(\mathcal W \) is defined in Sect. 3. Under the assumption (44) and the bound (17) from below for the quasi-symmetriser we obtain for (43) the estimate

$$\begin{aligned}&\biggl |\sum _{l=0}^{m-1}((Q_0^{(m)}B_{-l}-B_{-l}^*Q_0^{(m)})V,V)\biggr |\\&\quad \le \biggl |\sum _{l=0}^{h}((Q_0^{(m)}B_{-l}-B_{-l}^*Q_0^{(m)})V,V)\biggr |+\biggl |\sum _{l=h+1}^{m-1}((Q_0^{(m)}B_{-l}-B_{-l}^*Q_0^{(m)})V,V)\biggr |\\&\quad \le C_3E_\varepsilon + 2(m-1)!\sum _{l=h+1}^{m-1}|\mathcal W B_{-l}V||\mathcal W V|\\&\quad \le C_3E_\varepsilon +C_4\langle \xi \rangle ^{-h-1}\varepsilon ^{-(m-1)}E_\varepsilon . \end{aligned}$$

This leads to the energy estimate

$$\begin{aligned} \partial _t E_\varepsilon (t,\xi )\le (C_1\varepsilon ^{-2(m-1)/k}+C_2\varepsilon \langle \xi \rangle +C_3+C_4\langle \xi \rangle ^{-h-1}\varepsilon ^{-(m-1)})E_\varepsilon (t,\xi ). \end{aligned}$$

The Gevrey well-posedness result of Theorem 2 still holds true under the relaxed Levi condition (44) provided that, for \(\varepsilon ^{-2(m-1)/k}=\varepsilon \langle \xi \rangle \), one has

$$\begin{aligned} \langle \xi \rangle ^{-h-1}\varepsilon ^{-(m-1)}\le \langle \xi \rangle ^{\frac{1}{\sigma }}, \end{aligned}$$

with \(\sigma =1+k/[2(m-1)]\), that is if

$$\begin{aligned} h+1\ge \frac{(m-1)(k-2)}{k+2(m-1)}. \end{aligned}$$
(45)

In other words, for any fixed \(k\in \mathbb N \), by involving sufficiently many matrices \(B_{-l}\) in the Levi condition (44) (how many depends on the equation order \(m\) and the regularity \(k\) of the coefficients) one can still obtain \(G^s\) well-posedness for

$$\begin{aligned} 1\le s<1+\frac{k}{2(m-1)}. \end{aligned}$$

More precisely, by rewriting (45) as

$$\begin{aligned} h\ge m-2-\frac{2m(m-1)}{k+2(m-1)}, \end{aligned}$$

we can take

$$\begin{aligned} h\ge h_0:=m-2-\biggl [\frac{2m(m-1)}{k+2(m-1)}\biggr ], \end{aligned}$$
(46)

for all \(m\ge 2\) and \(k\ge 2\).
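The passage from (45) to (46) is an integer computation. The following sketch (the function names are ours) checks that \(h_0\) always fulfils (45) and that the condition involves at most the matrices \(B_0,\ldots ,B_{-(m-2)}\):

```python
from fractions import Fraction

def levi_bound(m, k):
    """The exact lower bound on h from (45): h >= (m-1)(k-2)/(k+2(m-1)) - 1."""
    return Fraction((m - 1) * (k - 2), k + 2 * (m - 1)) - 1

def h0(m, k):
    """h_0 from (46): m - 2 - [2m(m-1)/(k+2(m-1))], [.] the integer part."""
    return m - 2 - (2 * m * (m - 1)) // (k + 2 * (m - 1))

for m in range(2, 10):
    for k in range(2, 15):
        assert h0(m, k) >= levi_bound(m, k)  # h_0 fulfils (45)
        assert h0(m, k) <= m - 2             # at most B_0, ..., B_{-(m-2)} involved
        # for second order equations (m = 2) the level h = 0 always suffices
        assert max(h0(2, k), 0) == 0
```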

We now focus on the case of second order equations.

Remark 3

From the definition of \(h_0\) in (46), we see that

$$\begin{aligned} h_0=-\biggl [\frac{4}{k+2}\biggr ]\le 0 \end{aligned}$$

if \(m=2\) and \(k\ge 2\). Since in any case \(h\ge 0\), this shows that in the case of second order equations under the hypothesis (6) it is enough to put Levi conditions on the matrix \(B_0\) to prove the Gevrey well-posedness of the Cauchy problem (1) for \(1\le s < 1+\frac{k}{2(m-1)}.\)

To conclude, let us consider the case of non-analytic but very regular coefficients.

Remark 4

Assume now that the equation coefficients are smooth and that \(m>2\). This implies that for any \(a>0\) we can take \(k\) large enough such that \(\varepsilon ^{-2(m-1)/k}\le \langle \xi \rangle ^a\). Hence, \(\varepsilon ^{-2(m-1)/k}\le \langle \xi \rangle ^{-h-1}\varepsilon ^{-(m-1)}\) with \(h=0\). Setting then \(\varepsilon \langle \xi \rangle =\langle \xi \rangle ^{-1}\varepsilon ^{-(m-1)}\) we get that under the Levi condition (44) with \(h=0\) the Cauchy problem (1) is well-posed in \(G^s\) with

$$\begin{aligned} 1\le s< 1+\frac{2}{m-2}. \end{aligned}$$

In terms of the Gevrey order this result is worse than the one stated in Theorem 2, but it is obtained with Levi conditions only on the coefficients appearing in the matrix \(B_0\). We note that it is still better than Bronstein's result, due to the extra assumption (6) and the Levi condition (44) with \(h=0.\)
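For completeness, the Gevrey index in this remark comes from the same exponent balancing as in the proof of Theorem 6, now between the terms \(\varepsilon \langle \xi \rangle \) and \(\langle \xi \rangle ^{-1}\varepsilon ^{-(m-1)}\):

$$\begin{aligned} \varepsilon \langle \xi \rangle =\langle \xi \rangle ^{-1}\varepsilon ^{-(m-1)}\ \Longrightarrow \ \varepsilon =\langle \xi \rangle ^{-2/m},\qquad \varepsilon \langle \xi \rangle =\langle \xi \rangle ^{\frac{m-2}{m}}=\langle \xi \rangle ^{\frac{1}{\sigma }},\qquad \sigma =\frac{m}{m-2}=1+\frac{2}{m-2}. \end{aligned}$$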

Let us give some explanation of the relaxed Levi conditions (44). In particular, using the definition of the terms \(B_{-l,j}\) and omitting the squares as in (10), condition (44) is equivalent to

$$\begin{aligned}&\sum _{l=0}^{\min (h_0,m-j)}\left|\sum _{|\gamma |= m-j-l} a_{m-j+1,\gamma }(t)\xi ^\gamma \right| \nonumber \\&\quad \le C\sum _{i=1}^m \left| \sum _{\begin{array}{c} 1\le l_1<\cdots <l_{m-j}\le m \\ l_h\not =i\; \forall h \end{array}} \lambda _{l_1}(t,\xi )\cdots \lambda _{l_{m-j}}(t,\xi ) \right|, \end{aligned}$$
(47)

where the powers of \(\langle \xi \rangle \) cancel. By comparison with the Levi conditions (7) and the identity (8), it is clear that we impose conditions on fewer terms of the equation.