Large Time Convergence of the Non-homogeneous Goldstein-Taylor Equation


The Goldstein-Taylor equations can be thought of as a simplified version of a BGK system, where the velocity variable is constricted to a discrete set of values. It is intimately related to turbulent fluid motion and the telegrapher’s equation. A detailed understanding of the large time behaviour of the solutions to these equations has been mostly achieved in the case where the relaxation function, measuring the intensity of the relaxation towards equally distributed velocity densities, is constant. The goal of the presented work is to provide a general method to tackle the question of convergence to equilibrium when the relaxation function is not constant, and to do so as quantitatively as possible. In contrast to the usual modal decomposition of the equations, which is natural when the relaxation function is constant, we define a new Lyapunov functional of pseudodifferential nature, one that is motivated by the modal analysis in the constant case, that is able to deal with full spatial dependency of the relaxation function. The approach we develop is robust enough that one can apply it to multi-velocity Goldstein-Taylor models, and achieve explicit rates of convergence. The convergence rate we find, however, is not optimal, as we show by comparing our result to those found in [8].


The object of this work is the large time analysis of the Goldstein-Taylor equations on the one-dimensional torus \({\mathbb {T}}\), i.e. on \([0,2\pi ]\) with periodic boundary conditions, and for \(t\in (0,\infty )\):

$$\begin{aligned} \begin{aligned} \partial _t f_+(x,t) + \partial _x f_+(x,t)&= \frac{\sigma (x)}{2}(f_-(x,t)-f_+(x,t)),\\ \partial _t f_-(x,t) - \partial _x f_-(x,t)&= - \frac{\sigma (x)}{2}(f_-(x,t)-f_+(x,t)),\\ f_{\pm }(x,0)&= f_{\pm ,0}(x), \end{aligned} \end{aligned}$$

where \(f_{\pm }(x,t)\) are the density functions of finding an element with a velocity \(\pm 1\) in a position \(x\in {\mathbb {T}}\) at time \(t>0\). The function \(\sigma \in L^{\infty }_+\left( {\mathbb {T}} \right) :=\big \{f\in L^\infty ({\mathbb {T}})\;\big |\;\text{ essmin } f>0\big \}\) is the relaxation coefficient, and \(f_{\pm ,0}\) are the initial conditions. Since (1) is mass conserving, its steady state is of the form

$$\begin{aligned} f_{\pm ,\infty }(x) := f_\infty \ ,\quad x\in {\mathbb {T}}\ ;\qquad f_\infty :=\frac{1}{2} (f_{+,0}+f_{-,0})_{\mathrm {avg}}, \end{aligned}$$

with the notation

$$\begin{aligned} h_{\mathrm {avg}}:=\frac{1}{2\pi }\int _{0}^{2\pi } h(x)dx. \end{aligned}$$

The Goldstein-Taylor model was originally considered as a diffusion process, resulting as a limit of a discontinuous random migration in 1D, where particles may change direction with rate \(\sigma \). It appeared in the context of turbulent fluid motion and the telegrapher’s equation, see [15, 22], respectively. (1) can also be seen as a special 1D case of a BGK-model (named after the three physicists Bhatnagar, Gross, and Krook [9]) with a discrete set of velocities. Such equations commonly appear in applications like gas and fluid dynamics as velocity discretisations of various kinetic models (e.g. the Boltzmann equation). The mathematical analysis of such discrete velocity models has a long standing tradition, see [10, 18] and references therein.

Although the Goldstein-Taylor equation is very simple, it still exhibits an interesting and mathematically rich structure. Hence, it has been attracting continuous interest over the last 20 years. Most of its mathematical analyses was devoted to the following three topics: scaling limits, asymptotic preserving (AP) numerical schemes, and large time behaviour. In a diffusive scaling, the Goldstein-Taylor model can be viewed as a hyperbolic approximation to the heat equation [21]. Various AP-schemes for this model in the stiff relaxation regime (i.e. for \(\sigma \rightarrow \infty \)) were constructed and analyzed in [4, 16, 17]. Since the large time convergence of solutions to (1) towards its unique steady state is also the topic of this work, we shall review the related literature in more detail:

Analytically, the main difficulty of (1) is with its hypocoercivity, as defined in [24]: More specifically, the relaxation operator on the r.h.s. is not coercive on \({\mathbb {T}}\times {\mathbb {R}}^2\). Hence, for each fixed x, the r.h.s. by itself would drive the system to its local equilibrium, generated by the kernel of the relaxation operator, \({\text {span}}\{\left( {\begin{array}{c}1\\ 1\end{array}}\right) \}\), but the local mass (density) might be different at different positions. Convergence to the global equilibrium \((f_\infty ,f_\infty )^T\) only arises due to the interplay between local relaxation and the transport operator on the l.h.s. of (1).

The Goldstein-Taylor model was also considered in the analysis of [5], if one chooses the velocity matrix to be \(V=\text{ diag }(1,-1)\) and the relaxation matrix \({\varvec{A}}(x)\) to be

$$\begin{aligned} {\varvec{A}}(x) = \frac{\sigma (x)}{2} \begin{pmatrix} 1&{} -1\\ -1&{} 1 \end{pmatrix}\ge 0. \end{aligned}$$

Exponential convergence to the steady state is then proved in the aforementioned work for the system (1) with inflow boundary conditions. Such boundary conditions make the problem significantly easier than in the periodic set-up envisioned here, in particular it allows for \(\sigma (x)\) to be zero on a subset of \({\mathbb {T}}\), an issue that proves to be far more difficult in our setting.

In [12] the authors proved polynomial decay towards the equilibrium, allowing \(\sigma (x)\) to vanish at finitely many points.

In [23] the author proved exponential decay for solutions to (1) for a more general \(\sigma (x)\ge 0\). That work is based on a (non-local in time) weak coercive estimate on the damping.

All of the papers mentioned so far did not focus on the optimality of the (exponential) decay rate. Using the equivalence between (1) and the telegrapher’s equation, the authors of [8] have shown that this optimal decay rate, \(\mu (\sigma )\), is the minimum of \(\sigma _{\mathrm {avg}}\) and the spectral gap of the telegrapher’s equation (excluding the case when some of those eigenvalues with real part equal to \(\mu (\sigma )\) are defective). The precise value of this spectral gap, however, is hardly accessible - even for simple non-constant relaxation functions \(\sigma (x)\) (see e.g. Appendix A). Moreover, it is based on the restrictive requirement \(f_{\pm ,0}\in H^1({\mathbb {T}})\), and cannot be extended to other discrete velocity models in 1D. The reason for the latter is that [8] heavily relies on the equivalence of (1) to the telegrapher’s equation.

The issues above motivated our subsequent analysis: We introduce a method for \(L^2\)–initial data that can be extended to other discrete velocity BGK-models (as illustrated below for a \(3-\)velocities system), and that yields sharp rates for constant \(\sigma \). Moreover, and most importantly, it is applicable in the general non-homogeneous \(\sigma \in L^{\infty }_+\left( {\mathbb {T}} \right) \) case and yields in these cases an explicit, quantitative lower bound for the decay rate. In this case, however, it will not achieve an optimal rate of convergenceFootnote 1 to the appropriate equilibrium of the system. The method to be derived here will use a Lyapunov function technique in the spirit of the earlier works [1, 2, 13, 24].

This paper is structured as follows: In §2 we give the analytical setting of the problem and present our main convergence result (Theorem  1). In §3 we recall some analytical results which will be needed in the analysis that will follow, and explore some properties of the entropy functional \(E_\theta \) and the anti-derivative of functions on \({\mathbb {T}}\), defined in (4) and (5), respectively. § 4 is devoted to the case where \(\sigma (x)=\sigma \) is constant, which will motivate our more general approach: Based on a modal decomposition of the Goldstein-Taylor system and its spectral analysis we derive the entropy functional \(E_\theta \), first on a modal level and then as a pseudo-differential operator in physical space. We conclude by proving part (a) of our main theorem. Continuing to §5, we will prove, using a perturbative approach to the problem, part (b) of our main theorem. The robustness of our method will be shown in §6 where we use it to obtain an explicit rate of convergence for a \(3-\)velocities Goldstein-Taylor model. Finally, in Appendix A we discuss a potential way to improve the technique from §5, and explicitly show the lack of optimality of it for a particular case of \(\sigma (x)\).

The Setting of the Problem and Main Results

To better understand the Goldstein-Taylor system, (1), one starts by recasting it in the macroscopic variables

$$\begin{aligned} u: = f_+ + f_-,\quad \quad v:=f_+-f_-, \end{aligned}$$

representing the spatial (mass) density and the flux density, respectively. The macroscopic variables yield the following system of equations on \({\mathbb {T}}\times (0,\infty )\):

$$\begin{aligned} \begin{aligned}&\partial _t u(x,t) +\partial _x v(x,t) =0,\\&\partial _t v(x,t) +\partial _x u(x,t) =- \sigma (x)v(x,t),\\&u(.,0)=u_0:=f_{+,0}+f_{-,0},\quad v(.,0)=v_0:=f_{+,0}-f_{-,0}\ , \end{aligned} \end{aligned}$$

whose theory of existence and uniqueness is straightforward (since the r.h.s. is a bounded perturbation of the transport operator; see §2 in [12] or, more generally, [20]). Moreover, when one tries to understand the qualitative behaviour of (3), one notices that the equation for u speaks of “total mass conservation” (upon integration over the spatial interval \((0,2\pi )\)), while the equation for v predicts a strong decay to zero for the function. This means, at least intuitively, that the difference between \(f_+\) and \(f_-\) should go to zero, and that their sum retains its mass. As the main driving force of the equation is a transport operation on the torus, we will not be surprised to learn that the large time behaviour of u (and since v should go to zero, of \(f_+\) and \(f_-\) as well) is convergence to a constant. All of this has been verified in several cases, most generally in [8].

We now set the framework that will assist us in the investigation of the large time behaviour of (3), in a relatively general case. The natural Hilbert space to consider this problem is \(L^2({\mathbb {T}})^{\otimes 2}\), with the standard inner product for each component:

$$\begin{aligned} \left\langle f_1,f_2\right\rangle :=\frac{1}{2\pi }\int _{0}^{2\pi } f_1(x)\overline{f_2(x)}dx, \end{aligned}$$

where the bar denotes complex conjugation. Since (1) and (3) are (only) hypocoercive, the symmetric part of their generators (i.e. the operators on their r.h.s.) are not coercive on \(L^2({\mathbb {T}})^{\otimes 2}\). Hence, the standard \(L^2\)–norm cannot serve as a usable Lyapunov functional. As is typical for hypocoercive equations (see [1, 13, 24]), a possible remedy to this problem is to consider a “twisted” norm (often also referred to as entropy functional), constructed in a way that this functional strictly decays along each trajectory (u(t), v(t)).

The following functional, which will be our entropy functional, is not an ansatz, and its origin will be derived in §4. Moreover, we will show that it will yield the sharp exponential decay for constant \(\sigma \), when one chooses \(\theta =\theta (\sigma )\) appropriately.

Definition 1

Let \(f,g\in L^2\left( {\mathbb {T}} \right) \) and let \(\theta >0\) be given. Then we define the entropy \(E_\theta (f,g)\) as

$$\begin{aligned} E_\theta (f,g):=\left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 -\frac{\theta }{2\pi }\int _{0}^{2\pi } {\text {Re}}\left( \partial _x^{-1}f (x)\overline{g(x)} \right) dx. \end{aligned}$$

Here, the anti-derivative of f is defined as

$$\begin{aligned} \partial _x^{-1}f (x):=\int _{0}^{x}f(y)dy - \left( \int _{0}^{x}f(y)dy \right) _{\mathrm {avg}}\ , \end{aligned}$$

with the average defined in (2). The normalization constant in (5) is chosen such that \((\partial _x^{-1}f)_{\mathrm {avg}}=0\).

Several recent studies (like [1, 13]) considered the Goldstein-Taylor system with constant \(\sigma \). This case can be investigated fairly easy as one is able to utilise Fourier analysis in this setting, and construct a Lyapunov functional as a sum of quadratic functionals of the Fourier modes. However, the moment we change \(\sigma (x)\) to a non-constant function - even to one that is natural in the Fourier setting, such as sine or cosine - the Fourier analysis becomes nigh impossible to solve.

The main idea that guided us in our approach was to re-examine the case where \(\sigma \) is constant and to recast the modal Fourier norm by using a pseudo-differential operator, without needing its modal decomposition. This functional, which is exactly \(E_\theta \) for particular choices of \(\theta =\theta (\sigma )\), can then be extended to the case where \(\sigma (x)\) is not constant, yielding quantitative estimates for the convergence. As the nature of this approach is perturbative, our decay rates are not optimal. The methodology itself, however, is fairly robust, and is viable in other cases, such as the multi-velocity Goldstein-Taylor model (as we shall see).

The main theorem we will show in this paper, with the use of the vector notation

$$\begin{aligned} f(t):=\left( {\begin{array}{c}f_+(t)\\ f_-(t)\end{array}}\right) \ ,\quad f_0:=\left( {\begin{array}{c}f_{+,0}\\ f_{-,0}\end{array}}\right) \ , \end{aligned}$$

is the following:

Theorem 1

Let \(u,v\in C([0,\infty );L^2\left( {\mathbb {T}} \right) )\) be mildFootnote 2 real valued solutions to (3) with initial datum \(u_0,\, v_0\in L^2\left( {\mathbb {T}} \right) \). Denoting by \(u_{\mathrm {avg}}=\left( u_0 \right) _{\mathrm {avg}}\) we have:

  1. a)

    \(\underline{\text {If }\sigma (x)=\sigma \text { is constant}}\) we have that: If \(\sigma \not =2\) then

    $$\begin{aligned} E_{\theta (\sigma )}\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le E_{\theta (\sigma )}\left( u_0-u_{\mathrm {avg}},v_0 \right) e^{-2\mu \left( \sigma \right) t} \end{aligned}$$


    $$\begin{aligned} \theta \left( \sigma \right) :={\left\{ \begin{array}{ll} \sigma , &{} 0<\sigma<2 \\ \frac{4}{\sigma }, &{} \sigma>2 \end{array}\right. }, \quad \mu \left( \sigma \right) :={\left\{ \begin{array}{ll} \frac{\sigma }{2}, &{} 0<\sigma <2 \\ \frac{\sigma }{2}-\sqrt{\frac{\sigma ^2}{4}-1}, &{} \sigma >2 \end{array}\right. }, \end{aligned}$$

    and if \(\sigma =2\) then for any \(0<\epsilon <1\)

    $$\begin{aligned} E_{\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}}\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le E_{\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}}\left( u_0-u_{\mathrm {avg}},v_0 \right) e^{-2\left( 1-\epsilon \right) t}.\end{aligned}$$

    Consequently if \(\sigma \not =2\)

    $$\begin{aligned} \begin{aligned} \Big \Vert f(t)-\left( {\begin{array}{c}f_\infty \\ f_\infty \end{array}}\right) \Big \Vert \le C_{\sigma } \Big \Vert f_0-\left( {\begin{array}{c}f_\infty \\ f_\infty \end{array}}\right) \Big \Vert e^{-\mu \left( \sigma \right) t}, \end{aligned} \end{aligned}$$


    $$\begin{aligned} C_{\sigma }:={\left\{ \begin{array}{ll} \sqrt{\frac{2+\sigma }{2-\sigma }}, &{} 0<\sigma <2 \\ \sqrt{ \frac{\sigma +2}{\sigma -2}}, &{} \sigma >2 \end{array}\right. }, \quad f_{\infty }=\frac{u_{\mathrm {avg}}}{2}\ , \end{aligned}$$

    and the decay rate \(\mu (\sigma )\) is sharp.

    For \(\sigma =2\) we have that

    $$\begin{aligned} \begin{aligned} \Big \Vert f(t)-\left( {\begin{array}{c}f_\infty \\ f_\infty \end{array}}\right) \Big \Vert \le \frac{\sqrt{2}}{\epsilon } \Big \Vert f_0-\left( {\begin{array}{c}f_\infty \\ f_\infty \end{array}}\right) \Big \Vert e^{-\left( 1-\epsilon \right) t}\ . \end{aligned} \end{aligned}$$
  2. b)

    \({\text {If }\sigma (x)\text { is non-constant}}\) such that

    $$\begin{aligned} 0<\sigma _{\mathrm {min}}:=\inf _{x\in {\mathbb {T}}}\sigma (x)< \sup _{x\in {\mathbb {T}}}\sigma (x)=:\sigma _{\mathrm {max}}<\infty , \end{aligned}$$

    then by defining

    $$\begin{aligned} \theta ^*:=\min \left( \sigma _{\mathrm {min}},\frac{4}{\sigma _{\mathrm {max}}} \right) \end{aligned}$$


    $$\begin{aligned} \alpha ^*:=\alpha ^*\left( \sigma _{\mathrm {min}},\sigma _{\mathrm {max}} \right) :={\left\{ \begin{array}{ll} \frac{\sigma _{\mathrm {min}}\left( 4+2\sqrt{4-\sigma _{\mathrm {min}}^2}-\sigma _{\mathrm {min}}\sigma _{\mathrm {max}} \right) }{4+2\sqrt{4-\sigma _{\mathrm {min}}^2}-\sigma _{\mathrm {min}}^2}, &{} \sigma _{\mathrm {min}}< \frac{4}{\sigma _{\mathrm {max}}}\\ \sigma _{\mathrm {max}}- \sqrt{\sigma _{\mathrm {max}}^2-4}, &{} \sigma _{\mathrm {min}}\ge \frac{4}{\sigma _{\mathrm {max}}} \end{array}\right. } \end{aligned}$$

    we have that

    $$\begin{aligned} E_{\theta ^*}\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le E_{\theta ^*}\left( u_0-u_{\mathrm {avg}},v_0 \right) e^{-\alpha ^*t}, \end{aligned}$$

    and as result

    $$\begin{aligned} \begin{aligned} \Big \Vert f(t)-\left( {\begin{array}{c}f_\infty \\ f_\infty \end{array}}\right) \Big \Vert \le \sqrt{\frac{2+\theta _*}{2-\theta _*}} \Big \Vert f_0-\left( {\begin{array}{c}f_\infty \\ f_\infty \end{array}}\right) \Big \Vert e^{-\frac{\alpha ^*}{2}t}\ , \end{aligned} \end{aligned}$$

    with \(f_\infty \) defined in (8).

Part (a) of this theorem will be proved in §4.4, and Part (b) in §5. In many of the proofs which will eventually lead to the proof of this theorem we will assume that (uv) is a classical solution, pertaining to \(u_0,\; v_0\) in the periodic Sobolev space \(H^1({\mathbb {T}})\). The general result will follow by a simple density argument.

Remark 1

It is simple to see that if \(\sigma (x)\) satisfies the conditions of (b), then, as \(\sigma _{\mathrm {min}}\) and \(\sigma _{\mathrm {max}}\) approach a positive constant \(\sigma \not =2\), we find that

$$\begin{aligned} \theta ^*\rightarrow \min \left( \sigma ,\frac{4}{\sigma } \right) ,\quad \text {and}\quad \alpha ^*\rightarrow {\left\{ \begin{array}{ll} \sigma -\sqrt{\sigma ^2-4}, &{} \sigma >2 \\ \sigma , &{} \sigma <2\end{array}\right. }\ , \end{aligned}$$

recovering the results of part (a) of the above theorem.

In addition, one should note that when \(\sigma _{\mathrm {min}}> \frac{4}{\sigma _{\mathrm {max}}} \) we have that

$$\begin{aligned} \alpha ^*\left( \sigma _{\mathrm {min}},\sigma _{\mathrm {max}} \right) =2\mu \left( \sigma _{\mathrm {max}} \right) , \end{aligned}$$

where \(\mu \left( \sigma \right) \) was defined in part (a) of the Theorem. This validates the intuition that, if \(\sigma _{\mathrm {max}}\) is “large enough”, the convergence rate of the solution can be estimated using the “worst convergence rate”, corresponding to \(\mu \left( \sigma _{\mathrm {max}} \right) \) of the \(\sigma (x)=\sigma \) case.

Lastly, one notices that when \(\sigma _{\mathrm {min}}=\frac{4}{\sigma _{\mathrm {max}}}\)

$$\begin{aligned} \frac{\sigma _{\mathrm {min}}\left( 4+2\sqrt{4-\sigma _{\mathrm {min}}^2}-\sigma _{\mathrm {min}}\sigma _{\mathrm {max}} \right) }{4+2\sqrt{4-\sigma _{\mathrm {min}}^2}-\sigma _{\mathrm {min}}^2} =\sigma _{\mathrm {max}}- \sqrt{\sigma _{\mathrm {max}}^2-4}, \end{aligned}$$

which shows the continuity of \(\alpha ^{*}\) on the curve that stitches the two formulas in (11).


In this short section we will remind the reader of a few simple properties of functions on the torus, as well as explore properties of the anti-derivative function, \(\partial _x^{-1} f\), and our functional \(E_{\theta }(f,g)\). Most of the simple proofs of this section will be deferred to Appendix B.

We begin with the well known Poincaré inequality:

Lemma 1

(Poincaré Inequality) Let \(f\in H^1_{per}\left( {\mathbb {T}} \right) \) with \(f_{\mathrm {avg}}=0\). Then

$$\begin{aligned} \left\Vert f\right\Vert \le \left\Vert f^\prime \right\Vert . \end{aligned}$$

Next we focus our attention on some simple, yet crucial, properties of the anti-derivative function which was defined in (5).

Lemma 2

Let \(f\in L^1\left( {\mathbb {T}} \right) \). Then:

  1. i)

    \(\left( \partial ^{-1}_x f \right) _{\mathrm {avg}}=0\).

  2. ii)

    \(\partial _x^{-1}f\) is differentiable a.e. on \([0,2\pi ]\) and \(\partial _x\left( \partial ^{-1}_x f \right) (x)=f(x)\) a.e.

  3. iii)

    If in addition f is differentiable we have that \(\partial ^{-1}_x\left( \partial _x f \right) (x)=f(x)-f_{\mathrm {avg}}\).

  4. iv)

    If, in addition, we have that \(f_{\mathrm {avg}}=0\), then \(\partial _x^{-1}f\) is a continuous function on the torus, and

    $$\begin{aligned} \widehat{\partial ^{-1}_x f}\left( k \right) = \left\{ \begin{array}{ll} \frac{{\widehat{f}}(k)}{ik}, &{} \, k\ne 0 \\ 0, &{} \, k=0 \\ \end{array} \right. \ . \end{aligned}$$

Remark 2

(ii), (iv), and the fact that f is a function on the torus, imply that if \(f_{\mathrm {avg}}=0\) we are allowed to use integration by parts with \(\partial _x^{-1}f(x)\) on this boundaryless manifold without qualms.

The last simple lemma in this revolves around our newly defined functional, \(E_{\theta }\).

Lemma 3

Let \(f,g\in L^2\left( {\mathbb {T}} \right) \) be such that \(f_{\mathrm {avg}}=0\) and let \(\theta \in {\mathbb {R}}\) be given. Then the entropy \(E_\theta (f,g)\), defined in (4), satisfies

$$\begin{aligned} E_\theta \left( f,g \right) \le \left( 1+\frac{\left|\theta \right|}{2} \right) \left( \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) \ . \end{aligned}$$

If in addition \(\left|\theta \right|<2\) we have that

$$\begin{aligned} E_\theta \left( f,g \right) \ge \left( 1-\frac{\left|\theta \right|}{2} \right) \left( \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) \ . \end{aligned}$$

In particular, if \(0\le \theta <2\) we have that

$$\begin{aligned} \left( 1-\frac{\theta }{2} \right) \left( \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) \le E_\theta \left( f,g \right) \le \left( 1+\frac{\theta }{2} \right) \left( \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) \ . \end{aligned}$$

Lastly, we shall prove the following theorem, which (finally) brings the system (3) into play, and on which we will rely on frequently in our future estimation.

Proposition 1

Let \(u,v\in C([0,\infty );L^2\left( {\mathbb {T}} \right) )\) be (real valued) mild solutions to (3) with initial datum \(u_0,\; v_0\in L^2\left( {\mathbb {T}} \right) \). Then for any \(\theta \in {\mathbb {R}}\)

$$\begin{aligned} \frac{d}{dt}E_{\theta }\left( u(t)-u_{\mathrm {avg}},v(t) \right)= & {} -\theta \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 + \frac{1}{2\pi } \int _0^{2\pi }(\theta -2\sigma (x))v(x,t)^2 dx\nonumber \\&+\frac{\theta }{2\pi }\int _0^{2\pi }\sigma (x) \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t)dx-\theta \left( v(t)_{\mathrm {avg}} \right) ^2\ ,\nonumber \\ \end{aligned}$$


$$\begin{aligned} u_{\mathrm {avg}}=\frac{1}{2\pi }\int _{0}^{2\pi }u_0(x)dx=\frac{1}{2\pi } \int _{0}^{2\pi }u(x,t)dx,\quad \forall t>0. \end{aligned}$$


We begin by noticing that the validity of (19) follows immediately from the fact that u is a mild solution and the conservation of mass property of the system (3). Moreover, one can see that replacing \(\left( u(t),v(t) \right) \) by \(\left( u(t)-u_{\mathrm {avg}},v(t) \right) \) yields an equivalent solution (up to a constant shift in the initial data) to the system of equations, with the additional condition that the average of the first component is zero for all \(t\ge 0\). With this observation in mind, we can assume without loss of generality that \(u_{\mathrm {avg}}=0\).

Using the Goldstein-Taylor equations we see that

$$\begin{aligned} \frac{d}{dt}\left\Vert u(t)\right\Vert ^2= & {} 2\left\langle u,\partial _t u\right\rangle =-2\left\langle u,\partial _x v\right\rangle .\\ \frac{d}{dt}\left\Vert v(t)\right\Vert ^2= & {} 2\left\langle v,\partial _t v\right\rangle =-2\left\langle v,\partial _x u+\sigma v\right\rangle . \end{aligned}$$


$$\begin{aligned} \left\langle u,\partial _x v\right\rangle +\left\langle v,\partial _x u\right\rangle =\frac{1}{2\pi }\int _{0}^{2\pi }\partial _x\left( uv \right) (x,t)dx=0\ ,\end{aligned}$$

we see that

$$\begin{aligned} \frac{d}{dt}\left( \left\Vert u(t)\right\Vert ^2+\left\Vert v(t)\right\Vert ^2 \right) =-\frac{1}{\pi }\int _{0}^{2\pi } \sigma (x)v(x,t)^2dx. \end{aligned}$$

We now turn our attention to the mixed term of \(E_\theta (u,v)\):

$$\begin{aligned}&\frac{d}{dt}\frac{\theta }{2\pi }\int _{0}^{2\pi } \partial _x^{-1}u (x,t)v(x,t)dx\\&\quad =\frac{\theta }{2\pi }\int _{0}^{2\pi } \partial _x^{-1}\left( \partial _t u \right) (x,t)v(x,t)dx +\frac{\theta }{2\pi }\int _{0}^{2\pi } \partial _x^{-1}u (x,t)\partial _t v(x,t)dx\\&\quad =-\frac{\theta }{2\pi }\int _{0}^{2\pi } \partial _x^{-1}\left( \partial _x v \right) (x,t)v(x,t)dx\\&\quad -\frac{\theta }{2\pi }\int _{0}^{2\pi } \partial _x^{-1}u (x,t)[\partial _x u(x,t)+\sigma (x)v(x,t)]dx. \end{aligned}$$

Using points (ii) and (iii) of Lemma 2, together with Remark 2, we find that the above equals

$$\begin{aligned}&-\frac{\theta }{2\pi }\int _{0}^{2\pi } \left( v(x,t)-v(t)_{\mathrm {avg}} \right) v(x,t) dx + \frac{\theta }{2\pi }\int _{0}^{2\pi } u(x,t)^2 dx \\&\quad -\frac{\theta }{2\pi }\int _{0}^{2\pi } \sigma (x) \partial _x^{-1}u(x,t)v(x,t)dx. \end{aligned}$$

Subtracting this from (20) (as there is a minus in definition (4)) yields (18). \(\square \)

Constant Relaxation Function

In recent years, the investigation of the Goldstein-Taylor model on \({\mathbb {T}}\) with constant relaxation function \(\sigma \) was frequently tackled with a modal decomposition in the Fourier space w.r.t. x. This approach allows for an extension to other discrete velocity models and even some continuous velocities models [1], but is not suitable for the non-homogeneous case.

Before beginning with our investigation we review a few recent results:

In [13, §1.4] exponential convergence to equilibrium was shown, but without the sharp rate. In [1, §4.1] a hypocoercive decay estimate of the form

$$\begin{aligned} \Big \Vert f(t)-\left( {\begin{array}{c}f_\infty \\ f_\infty \end{array}}\right) \Big \Vert _{L^2} \le c\,e^{-\mu t} \Big \Vert f_0-\left( {\begin{array}{c}f_\infty \\ f_\infty \end{array}}\right) \Big \Vert _{L^2}\ , \end{aligned}$$

with the vector notation from (6) and the sharp rate

$$\begin{aligned} \mu (\sigma )= \left\{ \begin{array}{ll} \frac{\sigma }{2}, &{} \,0< \sigma <2 \\ \frac{\sigma }{2}-\sqrt{\frac{\sigma ^2}{4}-1}, &{} \,\sigma >2 \\ \end{array} \right. \end{aligned}$$

was obtained (see also Fig. 1 below). A further study on the minimal constant c in the above was provided in [3, Th. 1.1].

Fig. 1

The exponential decay rate, \(\mu \left( \sigma \right) \), of the solution pair \((u(t)-u_{\mathrm {avg}},\,v(t))\) grows linearly until \(\sigma =2\) where the defectiveness appears (hence the circle). From that point onwards the decay rate decreases, and is of order \(O\left( \frac{1}{\sigma } \right) \)

With these results in mind, we turn our attention to the following (recast) Goldstein-Taylor equation with a constant relaxation rate:

$$\begin{aligned} \begin{aligned} \partial _t u(x,t)&= -\partial _x v(x,t),\\ \partial _t v(x,t)&= -\partial _x u(x,t) - \sigma v(x,t) \ . \end{aligned} \end{aligned}$$

In order to be able to discover our entropy functional, we shall consider the straightforward modal analysis in detail. This will allow us to obtain not only explicit decay rates for each Fourier mode, but also an “optimal Lyapunov functional” for such given mode, with which we will then be able to construct a non-modal entropy functional in terms of a pseudo-differential operator as defined in (4).

As was mentioned in §2, this will give us intuition to the large time behaviour of the equation in several cases even when \(\sigma (x)\) is not constant.

Fourier Analysis and the Spectral Gap

One natural way to understand the large time behaviour of (21) relies on a simple Fourier analysis together with a hypocoercivity technique that was developed by Arnold and Erb in [6]. We begin with the former, and focus on the latter from the next subsection onwards.

Using the Fourier transform on the torus (i.e. in the spatial variable), we see that (21) is equivalent to infinity many decoupled ODE systems:

$$\begin{aligned} \frac{d}{dt} \begin{pmatrix} {\widehat{u}}(k)\\ {\widehat{v}}(k) \end{pmatrix} = - \begin{pmatrix} 0&{} ik\\ ik&{} \sigma \end{pmatrix}\begin{pmatrix} {\widehat{u}}(k)\\ {\widehat{v}}(k) \end{pmatrix}:=-\mathbf{C}_k \begin{pmatrix} {\widehat{u}}(k)\\ {\widehat{v}}(k) \end{pmatrix} ,\qquad k\in {\mathbb {Z}}. \end{aligned}$$

The eigenvalues of the matrices \(\mathbf{C}_k\in {\mathbb {C}}^{2\times 2}\) are given by

$$\begin{aligned} \lambda _{\pm ,k}:= \frac{\sigma }{2} \pm \sqrt{\frac{\sigma ^2}{4}-k^2},\quad k\in {\mathbb {Z}}, \end{aligned}$$

and as such:

  • Invariant space: For \(k=0\) we find that \(\lambda _{-,0}=0\) and \(\lambda _{+,0}=\sigma \). In fact, as

    $$\begin{aligned} \mathbf{C}_0=\begin{pmatrix} 0&{} 0\\ 0&{} \sigma \end{pmatrix} \end{aligned}$$

    we can conclude immediately that \({\widehat{u}}(0,t)=\widehat{u_0}(0)\) and \({\widehat{v}}(0,t)=\widehat{v_0}(0)e^{-\sigma t}\), corresponding to the mass conservation of the original equation and the rapid decay of the difference between the masses of \(f_-\) and \(f_+\).

  • Case I: For \(0<|k|< \frac{\sigma }{2}\) one finds two real eigenvalues, whose minimum is

    $$\begin{aligned} \lambda _{-,k}=\frac{\sigma }{2} - \sqrt{\frac{\sigma ^2}{4}-k^2}=\frac{2k^2}{\sigma + \sqrt{\sigma ^2-4k^2}}\ , \end{aligned}$$

    i.e. the large time behaviour of \({\widehat{u}}(k)\) and \({\widehat{v}}(k)\) is controlled by \(e^{-\left( \frac{\sigma }{2} - \sqrt{\frac{\sigma ^2}{4}-k^2} \right) t}\).

  • Case II: For \(0<|k|=\frac{\sigma }{2}\in {\mathbb {N}}\) the two eigenvalues coincide and are equal to \(\frac{\sigma }{2}\). Moreover, that eigenvalue is defective (i.e. corresponds to a Jordan block of size 2) and the large time behaviour of \({\widehat{u}}(k)\) and \({\widehat{v}}(k)\) is controlled by \(\left( 1+t \right) e^{-\frac{\sigma }{2}t}\).

  • Case III: For \(|k|>\frac{\sigma }{2} \), one finds two complex conjugate eigenvalues, whose real part equals \(\frac{\sigma }{2}\). Thus the large time behaviour of \({\widehat{u}}(k)\) and \({\widehat{v}}(k)\) is controlled by \(e^{-\frac{\sigma }{2}t}\).

Fig. 2

The eigenvalues \(\lambda _{\pm ,k}\) of \(\mathbf{C}_k\), \(|k|\in {\mathbb {N}}\) for \(\sigma =5\). The spectral gap is \(\mu =(5-\sqrt{21})/2\)

From the observations above, we notice that as long as we subtract \({\widehat{u}}(0)\), i.e. as long as we remove the initial total mass from the original solution, all the modes converge exponentially to zero. Their rates have a sharp, and uniform-in-k lower bound that depends on \(\sigma \). This spectral gap of (21) will be denoted by \(\mu \left( \sigma \right) \).

Case I, i.e. \(0<|k|< \frac{\sigma }{2} \), is the most “difficult case” as the real part of the eigenvalues depends on k. However, one notices that the lower eigenvalue, \(\lambda _{-,k}\), increases with k, which implies that, if there are \(k-\)s such that \(0<|k|< \frac{\sigma }{2} \), the slowest possible convergence will be given by \(\lambda _{-,\pm 1}\). As we need to compare the decay rates of all modes simultaneously, we find that it is enough to consider the following possibilities:

  • \(0<\sigma <2\): We only have possibilities of Case III, implying that all modes are controlled by \(e^{-\frac{\sigma }{2}t}\).

  • \(\sigma =2\): We have possibilities of Case III, as well as defectiveness in \(k=\pm 1\) (Case II). This means that the modes are controlled by \(\left( 1+t \right) e^{-t}\). If one searches for a pure exponential control, the best rate one would find is \(e^{-\left( 1-\epsilon \right) t}\) for any given fixed \(\epsilon >0\).

  • \(\sigma >2\): We have possibilities from Cases I and III, and potentially Case II. All the modes that correspond to Case I are controlled by \(e^{-\left( \frac{\sigma }{2} - \sqrt{\frac{\sigma ^2}{4}-1} \right) t}\), while those that correspond to Case III are controlled by \(e^{-\frac{\sigma }{2}t}\). If Case II is realised, i.e. \(\frac{\sigma }{2}\in {\mathbb {N}}\setminus \left\{ 1\right\} \), we find that the modes \(k=\pm \frac{\sigma }{2}\) are controlled by \(\left( 1+t \right) e^{-\frac{\sigma }{2}t}\). In total, thus, all the modes are controlled by \(e^{-\left( \frac{\sigma }{2} - \sqrt{\frac{\sigma ^2}{4}-1} \right) t}\), a decay rate that is realised on the \(k=\pm 1\) modes, and the coefficient in the exponent is the spectral gap of the Goldstein-Taylor system (21).

An illustration of the eigenvalues of the matrices \(\mathbf{C}_k\) for \(\left|k\right|\in {\mathbb {N}}\) and \(\sigma =5\) can be viewed in Fig. 2.

Before we turn our attention to properly consider these cases and “uncover” our spatial entropy, we remind the reader of the hypocoercivity technique which will allow us to transform the spectral information of \(\mathbf{C}_k\) into a an appropriate, twisted norm with which we will show the desired decay of the k-th mode.

Hypocoercivity and Modal Lyapunov Functionals

In the previous subsection we have concluded that, barring the zero mode, all the Fourier modes of (22) decay exponentially (excluding potentially those with \(|k|=\frac{\sigma }{2}\) where a polynomial correction is required). The lack of positive definiteness of the governing matrix, \(\mathbf{C}_k\), stops us from seeing this behaviour in the Euclidean norm on \({\mathbb {C}}^2\). However, by modifying the norm with the help of another, closely related, positive definite matrix \({\mathbf {P}}_k\), one can construct a new Lyaponov functional, which is equivalent to the Euclidean norm, that decays with the expected exponential rate (at least for a non-defective \(\mathbf{C}_k\)).

This is exactly the idea that motivated Arnold and Erb, and which is expressed in the following theorem (see [6, 1, Lemma 2]):

Theorem 2

Let the matrix \(\mathbf{C}\in {\mathbb {C}}^{n\times n}\) be positive stable (i.e. have only eigenvalues with positive real parts). Let

$$\begin{aligned} \mu =\min \left\{ {\text {Re}}\lambda \;|\;\lambda \text { is an eigenvalue of }\mathbf{C}\right\} . \end{aligned}$$


  1. i)

    If all eigenvalues with real part equal to \(\mu \) are non-defective, there exists a Hermitian, positive definite matrix \({\mathbf {P}}\) such that

    $$\begin{aligned} \mathbf{C}^*{\mathbf {P}}+{\mathbf {P}}\mathbf{C}\ge 2\mu {\mathbf {P}}. \end{aligned}$$
  2. ii)

    If at least one eigenvalue with real part equal to \(\mu \) is defective, then for any \(\epsilon >0\), one can find a Hermitian, positive definite matrix \({\mathbf {P}}_{\epsilon }\) such that

    $$\begin{aligned} \mathbf{C}^*{\mathbf {P}}_{\epsilon }+{\mathbf {P}}_{\epsilon }\mathbf{C}\ge 2\left( \mu -\epsilon \right) {\mathbf {P}}_{\epsilon }\ , \end{aligned}$$

    where \(\mathbf{C}^*\) denotes the Hermitian transpose of \(\mathbf{C}\).

We remark that the matrices \({\mathbf {P}}\) and \({\mathbf {P}}_\epsilon \) are never unique.

One can utilise the theorem in the following way: Assuming the eigenvalues associated to \(\mathbf{C}\)’s spectral gap, \(\mu \), are non-defective, then by defining the norm

$$\begin{aligned} \left\Vert y\right\Vert ^2_{{\mathbf {P}}}:=\left\langle y,{\mathbf {P}}y\right\rangle =y^*{\mathbf {P}}y, \end{aligned}$$

one sees that, if y(t) solves the ODE \({\dot{y}}=-\mathbf{C}y\), then

$$\begin{aligned} \frac{d}{dt}\left\Vert y\right\Vert ^2_{{\mathbf {P}}}=-\left\langle y, \left( \mathbf{C}^*{\mathbf {P}}+{\mathbf {P}}\mathbf{C} \right) y\right\rangle \le -2 \mu \left\Vert y\right\Vert ^2_{{\mathbf {P}}}, \end{aligned}$$

resulting in the correct decay rate. The same approach works in the second case of Theorem 2.

Besides the general idea of this methodology, Arnold and Erb have given a recipe (one that was later extended in [7] to defective cases, using a time dependent matrix \({\mathbf {P}}\)) to finding the matrices \({\mathbf {P}}\, \mathrm{and}\, {\mathbf {P}}_{\epsilon }\):

Assuming that \(\mathbf{C}\) is diagonalisable, and letting \(\left\{ \omega _i\right\} _{i=1,\dots ,n}\) be the eigenvectors of \(\mathbf{C}^*\), the matrix \({\mathbf {P}}>0\) can be chosen to be

$$\begin{aligned} {\mathbf {P}}=\sum _{i=1}^n b_i \omega _i \otimes \omega _i^*, \end{aligned}$$

for any positive sequence \(\left\{ b_i\right\} _{i=1,\dots , n}\). The above formula remains true, for a particular choice of \(\left\{ b_i\right\} _{i=1,\dots , n}\), in the case where \(\mathbf{C}\) is not diagonalisable. In that case we also need to augment the eigenvectors with the generalised eigenvectors. We refer the interested reader to Lemma 4.3 in [6]. Moreover, for \(n=2\), the case we shall need below, and \(\mathbf{C}\) non-defective, all matrices \({\mathbf {P}}\) satisfying (24) are indeed of the form (27), see [3, Lemma 3.1].

We now turn our attention back to the Fourier transformed Goldstein-Taylor system (22) and determine the modal Lyapunov functionals using the above recipe. A short computation, where the weights \(b_1,\,b_2\) are chosen such that both diagonal elements of \({\mathbf {P}}\) are 1, finds the following matrices (For Case III we also require \(b_1=b_2\), as this minimises the number of the resulting admissible matrices \({\mathbf {P}}_k\) satisfying (24).):

  • Case I: \(0<|k|< \frac{\sigma }{2} \). In this case we have:

    $$\begin{aligned} {\mathbf {P}}_k^{(I)}:= \begin{pmatrix} 1&{} -\frac{2ki}{\sigma }\\ \frac{2ki}{\sigma }&{} 1 \end{pmatrix}, \end{aligned}$$
  • Case II: \(|k|= \frac{\sigma }{2}\in {\mathbb {N}}\). As this case fosters defective eigenvalues, we will only consider the case \(\sigma =2\) (as was mentioned beforehand), and state the matrix corresponding to \(k=\pm 1\) and a given fixed \(\epsilon >0\):

    $$\begin{aligned} {\mathbf {P}}^{(II)}_{\epsilon ,\pm 1}:= \begin{pmatrix} 1 &{} \mp \frac{i(2-\epsilon ^2)}{2+\epsilon ^2}\\ \pm \frac{i(2-\epsilon ^2)}{2+\epsilon ^2} &{} 1 \end{pmatrix} \end{aligned}$$
  • Case III: \(|k|> \frac{\sigma }{2} \). In this case we have:

    $$\begin{aligned} {\mathbf {P}}_k^{(III)}:= \begin{pmatrix} 1&{} -\frac{i\sigma }{2k}\\ \frac{i\sigma }{2k}&{} 1 \end{pmatrix} \end{aligned}$$

For each mode \(k\ne 0\), its modal Lyapunov functional will be given by \(\big \Vert \left( {\begin{array}{c}{\hat{u}} (k,t)\\ {\hat{v}} (k,t)\end{array}}\right) \big \Vert _{{\mathbf {P}}_k}^2\), where the matrix \({\mathbf {P}}_k\) is chosen according to the above three cases. In Case II, the parameter \(\epsilon >0\) can be chosen arbitrarily small.

Derivation of the Spatial Entropy \({E_\theta (u,v)}\)

The goal of this subsection is twofold: Finding a modal entropy to our system, and translating it to a spatial entropy that is modal-independent.

To begin with we shall define a modal entropy to quantify the exponential decay of solutions to (22) towards its steady state:

$$\begin{aligned} \widehat{u_\infty }(k) = \left\{ \begin{array}{ll} \widehat{u_0}(k=0)=(u_0)_{\mathrm {avg}}, &{} \, k=0 \\ 0, &{} \, k\ne 0 \\ \end{array} \right. \ ;\qquad \quad \widehat{v_\infty }(k)=0\ ,\, k\in {\mathbb {Z}}\ . \end{aligned}$$

Since the matrix \(\mathbf{C}_0\) from (23) has no spectral gap, the mode \(k=0\) plays a special role, and hence will be treated separately.

Once found, we will want to relate that modal-based entropy to the spatial entropy \(E_\theta \) from Definition 1, which is not based on a modal decomposition. To this end we already remark that the off-diagonal factors ik in (28) and 1/ik in (30) correspond in physical space, roughly speaking, to a first derivative and an anti-derivative, respectively.

As in §4.1 we shall distinguish three cases of \(\sigma \):

\({\varvec{0<\sigma <2}}:\) All modes \(k\ne 0\) satisfy \(|k|> \frac{\sigma }{2} \), and hence are of Case III. We recall from §4.1 that all modes decay here with the sharp rate \(\frac{\sigma }{2}\). For a modal entropy to reflect this decay, we hence have to use for each mode a Lyapunov functional \(\big \Vert \left( {\begin{array}{c}{\hat{u}} (k,t)\\ {\hat{v}} (k,t)\end{array}}\right) \big \Vert _{{\mathbf {P}}_k}^2\), where \({\mathbf {P}}_k\) satisfies the inequality (24) with \(\mu =\frac{\sigma }{2}\). \({\mathbf {P}}_k={\mathbf {P}}_k^{(III)}\) is the most convenient choice.

We define the modal entropy for any \(\left\{ {\widehat{u}}(k),{\widehat{v}}(k)\right\} _{k\in {\mathbb {Z}}}\) such that \({\widehat{u}}(0)=0\) as

$$\begin{aligned} {\mathscr {E}}\left( {\widehat{u}},{\widehat{v}} \right):= & {} \sum _{k\in {\mathbb {Z}}\setminus \left\{ 0\right\} }\left\Vert \begin{pmatrix} {\widehat{u}}(k)\\ {\widehat{v}}(k) \end{pmatrix}\right\Vert ^2_{{\mathbf {P}}_k^{(III)}}+\left\Vert \begin{pmatrix} {\widehat{u}}(0)\\ {\widehat{v}}(0) \end{pmatrix}\right\Vert ^2 \end{aligned}$$
$$\begin{aligned}= & {} \sum _{k\in {\mathbb {Z}}}\left( \left|{\widehat{u}}(k)\right|^2 -\sigma {\text {Re}}\left( \frac{{\widehat{u}}(k)}{ik}\overline{{\widehat{v}}(k)} \right) +\left|{\widehat{v}}(k)\right|^2 \right) \ , \end{aligned}$$

where we used the convention \(\frac{{\widehat{u}}(0)}{0}=0\). The mode \(k=0\) was included since \({\widehat{u}}(0,t)={\widehat{u}}(0)=0\) and \({\widehat{v}}(0,t)={\widehat{v}}(0)e^{-\sigma t}\). Using Plancherel’s equality, and (iv) from Lemma 2, we find that

$$\begin{aligned} {\mathscr {E}}\left( {\widehat{u}},{\widehat{v}} \right) =E_\sigma \left( u,v \right) , \end{aligned}$$

which shows why we consider the spatial entropy functional from Definition 1 in this case.

We note that, since \(u_{\mathrm {avg}}(t)\) is conserved, part (iv) of Lemma 2, explains why we have chosen to use the anti-derivative of u, and not of v.

\({\varvec{\sigma >2:}}\) This situation is more complicated than the previous one, as we have a mixture of at least two of the aforementioned three cases: finitely many \(k-\)s in \({\mathbb {Z}}\) for which \(0<|k|< \frac{\sigma }{2} \) (i.e. Case I), Case II for two \(k-\)s if \(\frac{\sigma }{2}\in {\mathbb {N}}\), while the rest satisfy \(|k|> \frac{\sigma }{2} \) (i.e. Case III). Following the above methodology to construct the modal entropy, we would need to use a combination of \({\mathbf {P}}_k^{(I)}\) and \({\mathbf {P}}_k^{(III)}\), given by (28) and (30), and potentially a matrix for the defective modes. This is feasible on the modal level, but does not easily translate back to the spatial variables. It would yield a complicated pseudo-differential operator “inside” the spatial entropy.

Recalling the discussion from §4.1 we see that the overall decay rate, \(\mu =\frac{\sigma }{2}-\sqrt{\frac{\sigma ^2}{4}-1}\) is only determined by the modes \(k=\pm 1\). Since all the other modes decay faster, we are not obliged to use “optimal” modal Lyapunov functionals for these higher modes. This gives some leeway for choosing the matrices \({\mathbf {P}}_k\), \(|k|>1\). Moreover, using these “optimal” functionals will result in worsening of (i.e. enlargement of) the multiplicative constant in the \(L^2\) hypocoercive estimation (7). Due to these reasons we will use the matrix

$$\begin{aligned} {\mathbf {P}}_k^{\mathrm {suff}}:={\mathbf {P}}_{k}^{(III)}\left( \sigma \rightarrow \frac{4}{\sigma } \right) =\begin{pmatrix} 1&{} -\frac{2i}{k\sigma }\\ \frac{2i}{k\sigma }&{} 1 \end{pmatrix} >0\ \end{aligned}$$

when \(k\ne 0\), which satisfies \({\mathbf {P}}_{\pm 1}^{\mathrm {suff}}={\mathbf {P}}_{\pm 1}^{(I)}\) for the crucial lowest modes. It also satisfies the following result, which implies exponential decay of all modal Lyapunov functionals \(\big \Vert \left( {\begin{array}{c}{\hat{u}} (k,t)\\ {\hat{v}} (k,t)\end{array}}\right) \big \Vert _{{\mathbf {P}}_k^{\mathrm {suff}}}^2\), \(k\ne 0\) with rate \(2\mu =\sigma -\sqrt{\sigma ^2-4}\).

Lemma 4

Let \(\sigma >2\). Then

$$\begin{aligned} \mathbf{C}^*_k {\mathbf {P}}^{\mathrm {suff}}_k+{\mathbf {P}}^{\mathrm {suff}}_k\mathbf{C}_k-2\mu {\mathbf {P}}^{\mathrm {suff}}_k\ge 0 \qquad \forall \,k\ne 0\ . \end{aligned}$$

The proof of this lemma is straightforwardFootnote 3. Proceeding like in (32) we define the modal entropy for any \(\left\{ {\widehat{u}}(k),{\widehat{v}}(k)\right\} _{k\in {\mathbb {Z}}}\) such that \({\widehat{u}}(0)=0\):

$$\begin{aligned} {\mathscr {E}}\left( {\widehat{u}},{\widehat{v}} \right) :=\sum _{k\in {\mathbb {Z}}\setminus \left\{ 0\right\} }\left\Vert \begin{pmatrix} {\widehat{u}}(k)\\ {\widehat{v}}(k) \end{pmatrix}\right\Vert ^2_{{\mathbf {P}}_k^{\mathrm {suff}}}+\left\Vert \begin{pmatrix} {\widehat{u}}(0)\\ {\widehat{v}}(0) \end{pmatrix}\right\Vert ^2\ . \end{aligned}$$

Due to (34) and (35) it is related to the spatial entropy functional from Definition 1 as

$$\begin{aligned} {\mathscr {E}}\left( {\widehat{u}},{\widehat{v}} \right) =E_{\frac{4}{\sigma }}\left( u,v \right) . \end{aligned}$$

\({\varvec{\sigma =2:}}\) Just like in the previous case, the lowest frequency modes \(k=\pm 1\) control the large time behaviour. However, the matrices \(\mathbf{C}_{\pm 1}\) are now defective, which leads to a (purely) exponential decay rate reduced by \(\epsilon \).

We proceed similarly to the case \(\sigma >2\) and define for some \(\epsilon >0\):

$$\begin{aligned} {\mathbf {P}}_{\epsilon ,k}^{\mathrm {suff}}= {\mathbf {P}}^{(III)}_{k}\left( \sigma \rightarrow \frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2} \right) =\begin{pmatrix} 1&{} -\frac{i\left( 2-\epsilon ^2 \right) }{k\left( 2+\epsilon ^2 \right) }\\ \frac{i\left( 2-\epsilon ^2 \right) }{k\left( 2+\epsilon ^2 \right) }&{} 1 \end{pmatrix}>0\ , \end{aligned}$$

which satisfies \({\mathbf {P}}_{\epsilon ,\pm 1}^{\mathrm {suff}}={\mathbf {P}}^{(II)}_{\epsilon ,\pm 1}\) for the crucial lowest model. It also satisfies the following result, which implies exponential decay of all modal Lyapunov functionals \(\big \Vert \left( {\begin{array}{c}{\hat{u}} (k,t)\\ {\hat{v}} (k,t)\end{array}}\right) \big \Vert _{{\mathbf {P}}_{\epsilon ,k}^{\mathrm {suff}}}^2\), \(k\ne 0\) with rate of at least \(2\mu =2(1-\epsilon )\).

Lemma 5

Let \(\sigma =2\). Then

$$\begin{aligned} \mathbf{C}^*_k {\mathbf {P}}^{\mathrm {suff}}_{\epsilon ,k}+{\mathbf {P}}^{\mathrm {suff}}_{\epsilon ,k}\mathbf{C}_k-2\mu {\mathbf {P}}^{\mathrm {suff}}_{\epsilon ,k}>0 \qquad \forall \,k\ne 0\ . \end{aligned}$$

Proceeding like in (32) we define the modal entropy for any \(\left\{ {\widehat{u}}(k),{\widehat{v}}(k)\right\} _{k\in {\mathbb {Z}}}\) such that \({\widehat{u}}(0)=0\):

$$\begin{aligned} {\mathscr {E}}\left( {\widehat{u}},{\widehat{v}} \right) :=\sum _{k\in {\mathbb {Z}}\setminus \left\{ 0\right\} }\left\Vert \begin{pmatrix} {\widehat{u}}(k)\\ {\widehat{v}}(k) \end{pmatrix}\right\Vert ^2_{{\mathbf {P}}_{\epsilon ,k}^{\mathrm {suff}}}+\left\Vert \begin{pmatrix} {\widehat{u}}(0)\\ {\widehat{v}}(0) \end{pmatrix}\right\Vert ^2\ . \end{aligned}$$

Due to (34) and (36) it is related to the spatial entropy functional from Definition 1 as

$$\begin{aligned} {\mathscr {E}}\left( {\widehat{u}},{\widehat{v}} \right) =E_{\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}}\left( u,v \right) . \end{aligned}$$

The Evolution of the Spatial Entropy \(E_\theta \)

In the previous subsection we have shown how, depending on the value of \(\sigma \), the entropies \(E_{\sigma }\), \(E_{\frac{4}{\sigma }}\) and \(E_{\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}}\) are the correct candidates to show the exponential convergence to equilibrium. A closer look at (26) shows that each modal Lyapunov functional \(\big \Vert \left( {\begin{array}{c}{\hat{u}} (k,t)\\ {\hat{v}} (k,t)\end{array}}\right) \big \Vert _{{\mathbf {P}}_k}^2\) decays exponentially, and hence also the spatial entropy \(E_\theta \). Recalling the decay rates presented in §4.3 for the three regimes of \(\sigma \), confirms that we have actually already proved most of part (a) of Theorem 1. However, as our main goal is to consider these functionals in the spatial variable alone (i.e. without a modal decomposition), we shall show how one achieves the correct convergence result following a direct calculation. This will also serve as a preparation for §5.

Theorem 3

Under the same conditions of Theorem 1 with \(\sigma (x)=\sigma \), one has that

  1. i)

    If \(0<\sigma <2\) then

    $$\begin{aligned} E_{\sigma }\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le E_{\sigma }\left( u_{0}-u_{\mathrm {avg}},v_0 \right) e^{- \sigma t}. \end{aligned}$$
  2. ii)

    If \(\sigma >2\) then

    $$\begin{aligned} E_{\frac{4}{\sigma }}\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le E_{\frac{4}{\sigma }}\left( u_{0}-u_{\mathrm {avg}},v_0 \right) e^{- \left( \sigma -\sqrt{\sigma ^2-4} \right) t}. \end{aligned}$$
  3. iii)

    If \(\sigma =2\) then for any \(0<\epsilon <1\)

    $$\begin{aligned} E_{\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}}\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le E_{\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}}\left( u_{0}-u_{\mathrm {avg}},v_0 \right) e^{- 2(1-\epsilon ) t}. \end{aligned}$$


In order to prove this theorem we shall obtain differential inequalities for \(E_\theta \), from which we will conclude the desired result by a simple application of Gronwall’s inequality. Using Proposition 1 we find that:

\({\text {If }0<\sigma <2:}\)

$$\begin{aligned} \frac{d}{dt}E_{\sigma }\left( u(t)-u_{\mathrm {avg}},v(t) \right)= & {} -\sigma \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 - \sigma \left\Vert v(t)\right\Vert ^2\\&+\frac{\sigma ^2}{2\pi }\int _0^{2\pi } \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t) dx-\sigma \left( v(t)_{\mathrm {avg}} \right) ^2 \\= & {} -\sigma E_{\sigma }\left( u(t)-u_{\mathrm {avg}},v(t) \right) -\sigma \left( v(t)_{\mathrm {avg}} \right) ^2 \le \\&-\sigma E_{\sigma }\left( u(t)-u_{\mathrm {avg}},v(t) \right) . \end{aligned}$$

Note that, since \(v_{\mathrm {avg}}(t)=(v_{0})_{\mathrm {avg}}\,e^{-\sigma t}\), we can compute \(E_{\theta }\left( u(t)-u_{\mathrm {avg}},v(t) \right) \) explicitly.

\({\text {If }\sigma >2:}\)

$$\begin{aligned} \frac{d}{dt}E_{\frac{4}{\sigma }}\left( u(t)-u_{\mathrm {avg}},v(t) \right)= & {} -\frac{4}{\sigma } \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 -\left( 2\sigma -\frac{4}{\sigma } \right) \left\Vert v(t)\right\Vert ^2\\&+\frac{4}{2\pi }\int _0^{2\pi } \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t)dx-\frac{4}{ \sigma } \left( v(t)_{\mathrm {avg}} \right) ^2\\\le & {} -\left( \sigma -\sqrt{\sigma ^2-4} \right) E_{\frac{4}{\sigma }}\left( u(t)-u_{\mathrm {avg}},v(t) \right) \\&+\left( \sigma -\sqrt{\sigma ^2-4}-\frac{4}{\sigma } \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2\\&+\left( \frac{4}{\sigma }-\sigma -\sqrt{\sigma ^2-4} \right) \left\Vert v(t)\right\Vert ^2\\&+\frac{4}{2\pi }\left( 1-\frac{\sigma -\sqrt{\sigma ^2-4}}{\sigma } \right) \int _0^{2\pi } \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t) dx. \end{aligned}$$

The desired inequality, \(\frac{d}{dt}E_{\frac{4}{\sigma }}\le -\big (\sigma -\sqrt{\sigma ^2-4}\,\big ) E_{\frac{4}{\sigma }}\), is valid if and only if

$$\begin{aligned} \begin{aligned}&\frac{4}{2\pi }\int _0^{2\pi } \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t)dx \\&\quad \le \left( \sigma -\sqrt{\sigma ^2-4} \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2+\left( \sigma +\sqrt{\sigma ^2-4} \right) \left\Vert v(t)\right\Vert ^2. \end{aligned} \end{aligned}$$

Cauchy-Schwarz inequality, together with Poincaré inequality (Lemma 1) and Lemma 2, imply that

$$\begin{aligned}&\frac{4}{2\pi }\int _0^{2\pi } \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t) dx\\&\quad \le 4\left\Vert u(t)-u_{\mathrm {avg}}\right\Vert \left\Vert v(t)\right\Vert \\&\quad =2\left( \sqrt{\sigma -\sqrt{\sigma ^2-4}}\left\Vert u(t)-u_{\mathrm {avg}}\right\Vert \right) \left( \sqrt{\sigma +\sqrt{\sigma ^2-4}}\left\Vert v(t)\right\Vert \right) \ . \end{aligned}$$

Together with the fact that \(2\left|ab\right|\le a^2+b^2\) this shows (37), concluding the proof in this case.

\(\underline{\text {If }\sigma =2:}\)

$$\begin{aligned} \begin{aligned}&\frac{d}{dt}E_{\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}} \left( u(t)-u_{\mathrm {avg}},v(t) \right) \\&\quad =- \frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}\left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 -\frac{2\left( 2+3\epsilon ^2 \right) }{2+\epsilon ^2}\left\Vert v(t)\right\Vert ^2\\&\qquad +\frac{1}{2\pi }\cdot \frac{4\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}\int _0^{2\pi } \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t) dx-\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2} \left( v(t)_{\mathrm {avg}} \right) ^2\\&\quad \le -2\left( 1-\epsilon \right) E_{\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}}\left( u(t)-u_{\mathrm {avg}},v(t) \right) -2\epsilon \left( 1-\frac{2\epsilon }{2+\epsilon ^2} \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2\\&\qquad -2\epsilon \left( 1+\frac{2\epsilon }{2+\epsilon ^2} \right) \left\Vert v(t)\right\Vert ^2+\frac{1}{2\pi }\cdot \frac{4\epsilon \left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}\int _0^{2\pi } \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t) dx. \end{aligned} \end{aligned}$$

Like before, the desired inequality will follow if

$$\begin{aligned}&\frac{1}{2\pi }\cdot \frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2} \int _0^{2\pi } \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t)dx \\&\quad \le \left( 1-\frac{2\epsilon }{2+\epsilon ^2} \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 +\left( 1+\frac{2\epsilon }{2+\epsilon ^2} \right) \left\Vert v(t)\right\Vert ^2. \end{aligned}$$

This is valid since

$$\begin{aligned}&\frac{1}{2\pi }\cdot \frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2} \int _0^{2\pi }\partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t)dx \\&\quad \le \frac{2\sqrt{4+\epsilon ^4}}{2+\epsilon ^2}\left\Vert u(t)-u_{\mathrm {avg}}\right\Vert \left\Vert v(t)\right\Vert \\&\quad = 2 \left( \sqrt{1-\frac{2\epsilon }{2+\epsilon ^2}}\left\Vert u(t)-u_{\mathrm {avg}}\right\Vert \right) \left( \sqrt{1+\frac{2\epsilon }{2+\epsilon ^2}}\left\Vert v(t)\right\Vert \right) \\&\quad \le \left( 1-\frac{2\epsilon }{2+\epsilon ^2} \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 +\left( 1+\frac{2\epsilon }{2+\epsilon ^2} \right) \left\Vert v(t)\right\Vert ^2, \end{aligned}$$

where we used Cauchy-Schwarz inequality, Poincaré inequality, and Lemma 2 again.

The theorem is now complete. \(\square \)

As the last part of this section, we finally prove part (a) of Theorem 1:

Proof of part (a) of Theorem 1

The decay estimates of \(E_{\theta (\sigma )}\) are already shown in Theorem 3. To show (7) and (9) we recall that

$$\begin{aligned} f_+ = \frac{u+v}{2},\quad f_-=\frac{u-v}{2}\ , \end{aligned}$$


$$\begin{aligned} \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2\le \frac{2}{2-\theta }E_{\theta }\left( f,g \right) ,\quad E_{\theta }\left( f,g \right) \le \frac{2+\theta }{2}\left( \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) \end{aligned}$$

for \(0<\theta <2\) and \(f_{\mathrm {avg}}=0\), according to Lemma 3. Thus, using the definition of \(f_\infty \) from (8) we see that

$$\begin{aligned}&\left\Vert f_+(t)-f_\infty \right\Vert ^2 + \left\Vert f_-(t)-f_\infty \right\Vert ^2 \\&\quad =\frac{1}{2}\left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2+\frac{1}{2}\left\Vert v(t)\right\Vert ^2 \le \frac{1}{2-\theta }E_\theta \left( u(t)-u_{\mathrm {avg}},v(t) \right) \\&\quad \le \frac{1}{2-\theta }E_\theta \left( u_0-u_{\mathrm {avg}},v_0 \right) e^{-2\mu (\sigma )t} \le \frac{1}{2}\cdot \frac{2+\theta }{2-\theta }\left( \left\Vert u_0-u_{\mathrm {avg}}\right\Vert ^2 +\left\Vert v_0\right\Vert ^2 \right) e^{-2\mu (\sigma )t} \\&\quad =\frac{2+\theta }{2-\theta } \left( \left\Vert f_{+,0}-f_\infty \right\Vert ^2+\left\Vert f_{-,0}-f_\infty \right\Vert ^2 \right) e^{-2\mu (\sigma )t}, \end{aligned}$$

which shows the result for the appropriate choices of \(\theta (\sigma )\) and \(\mu (\sigma )\). For \(\sigma =2\) we choose

$$\begin{aligned} \theta (2)=\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2},\quad \mu (2)=1-\epsilon \ . \end{aligned}$$

The sharpness of the decay rate for \(\sigma \ne 2\) can be verified easily on the first mode, e.g. for \(u_0=0\), \(v_0=e^{ix}\). \(\square \)

With the constant case fully behind us, we can now focus on the case where \(\sigma (x)\) is a non-constant function.

\(x-\)Dependent Relaxation Function

The large time behaviour of solutions to the Goldstein-Taylor equation (1), or equivalently its recast form (3), becomes increasingly harder to understand, if the relaxation function, \(\sigma (x)\), is not a constant. However, as shown in §4, we have managed to find a potential spatial entropy that captures the exact behaviour of the decay to equilibrium. The idea that we will employ in this section is to use the same type of entropy to try and estimate the convergence rate even when \(\sigma (x)\) is not constant. This is, as mentioned in the introduction, a perturbative approach - yet the methodology, and ideas, are robust enough to deal with more complicated systems, as will be shown in the next section.

A fundamental theorem to establish our main result, Theorem 1 (b), is the following:

Theorem 4

Let \(u,v\in C([0,\infty );L^2\left( {\mathbb {T}} \right) )\) be mild solutions to (3) with initial datum \(u_0,\; v_0\in L^2\left( {\mathbb {T}} \right) \). Denoting by \(u_{\mathrm {avg}}=\left( u_0 \right) _{\mathrm {avg}}\) we have that for any given \(0<\alpha ,\theta <2 \) the conditions

$$\begin{aligned} \begin{aligned} \alpha< \theta ,\quad \theta +\alpha < 2\sigma _{\mathrm {min}}\end{aligned} \end{aligned}$$


$$\begin{aligned} \begin{aligned} \sup _{x\in {\mathbb {T}}}\left( \theta ^2\left( \sigma (x)-\alpha \right) ^2-4\left( \theta -\alpha \right) \left( 2\sigma (x)-\theta -\alpha \right) \right) \le 0, \end{aligned} \end{aligned}$$

imply that

$$\begin{aligned} E_{\theta }\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le E_{\theta }\left( u_0-u_{\mathrm {avg}},v_0 \right) e^{-\alpha t},\quad t\ge 0. \end{aligned}$$


Using (18) from Proposition 1, and the fact that \(\theta \left( v(t)_{\mathrm {avg}} \right) ^2 \ge 0\), we find that

$$\begin{aligned} \begin{aligned} \frac{d}{dt}E_{\theta }\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le&-\alpha E_{\theta }\left( u(t)-u_{\mathrm {avg}},v(t) \right) -\left( \theta -\alpha \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 \\&- \frac{1}{2\pi } \int _0^{2\pi }( 2\sigma (x)-\theta -\alpha )v(x,t)^2 dx\\&+\frac{\theta }{2\pi }\int _0^{2\pi }\left( \sigma (x)-\alpha \right) \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t) dx. \end{aligned} \end{aligned}$$

The proof of the theorem will follow from the above inequality if we can show that

$$\begin{aligned} \begin{aligned}&\frac{\theta }{2\pi }\int _0^{2\pi }\left( \sigma (x)-\alpha \right) \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t) dx \\&\quad \le \left( \theta -\alpha \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 +\frac{1}{2\pi } \int _0^{2\pi }( 2\sigma (x)-\theta -\alpha )v(x,t)^2 dx. \end{aligned} \end{aligned}$$

Due to condition (38) we have that

$$\begin{aligned} \inf _{x\in {\mathbb {T}}}\left( 2\sigma (x)-\theta -\alpha \right) =2\sigma _{\mathrm {min}}-\theta -\alpha >0. \end{aligned}$$

Hence, we obtain with Cauchy-Schwarz, Young’s inequality \(\left|ab\right| \le \frac{a^2}{\theta }+\frac{\theta b^2}{4}\), and the Poincaré inequality, (13), that

$$\begin{aligned} \begin{aligned}&\left|\frac{\theta }{2\pi }\int _0^{2\pi }\left( \sigma (x)-\alpha \right) \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t) dx \right|\\&\quad \le \frac{\theta }{2\pi }\int _0^{2\pi }\sqrt{2\sigma (x)-\theta -\alpha } \left|v(x,t)\right| \frac{\left|\sigma (x)-\alpha \right|}{\sqrt{2\sigma (x)-\theta -\alpha }}\left|\partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) \right|dx \\&\quad \le \frac{\theta }{2\pi }\left( \int _0^{2\pi }\left( 2\sigma (x)-\theta -\alpha \right) v(x,t)^2dx \right) ^{\frac{1}{2}}\\&\qquad \left( \int _{0}^{2\pi }\frac{\left( \sigma (x) -\alpha \right) ^2}{2\sigma (x)-\theta -\alpha } \left( \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) \right) ^2dx \right) ^{\frac{1}{2}} \\&\quad \le \frac{1}{2\pi } \int _0^{2\pi }( 2\sigma (x)-\theta -\alpha )v(x,t)^2 dx \\&\qquad + \frac{1}{2\pi } \int _0^{2\pi }\frac{\theta ^2\left( \sigma (x) -\alpha \right) ^2}{4\left( 2\sigma (x)-\theta -\alpha \right) }\left( \partial _x^{-1}\left( u(x,t) -u_{\mathrm {avg}} \right) \right) ^2 dx \\&\quad \le \frac{1}{2\pi } \int _0^{2\pi }( 2\sigma (x)-\theta -\alpha )v(x,t)^2 dx + \sup _{x\in {\mathbb {T}}}\left( \frac{\theta ^2\left( \sigma (x)-\alpha \right) ^2}{4\left( 2\sigma (x) -\theta -\alpha \right) } \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 . \end{aligned} \end{aligned}$$

The above implies that (42) will be valid when

$$\begin{aligned} \sup _{x\in {\mathbb {T}}}\frac{\theta ^2\left( \sigma (x)-\alpha \right) ^2}{4\left( 2\sigma (x)-\theta -\alpha \right) } \le \theta -\alpha , \end{aligned}$$

which is equivalent, due to the positivity of the denominator, to (39). The proof is thus complete. \(\square \)

Remark 3

It is worth to note that the conditions expressed in (38) are crucial in our estimation. Indeed, they tell us that

$$\begin{aligned} \left( \theta -\alpha \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2\quad \text {and}\quad \int _0^{2\pi }( 2\sigma (x)-\theta -\alpha )v(x,t)^2 dx \end{aligned}$$

are non-negative. If one part of the condition would not be true, we would be able to “cook” initial data such that the mixed uv–term in (42) is zero, and the above terms add up to something strictly negative - breaking the functional inequality we are aiming to attain.

The next step towards proving part (b) in Theorem 1 is to look for \(\theta \) and \(\alpha \) such that conditions (38) and (39) are satisfied.

We recall the definition of \(\theta ^*\) from Theorem 1:

$$\begin{aligned} \theta ^*:=\min \left( \sigma _{\mathrm {min}},\frac{4}{\sigma _{\mathrm {max}}} \right) , \end{aligned}$$

which in a sense captures the “worst possible” behaviour when comparing \(\sigma (x)\) to the constant case (with \(\sigma \not =2\)). We show the following:

Lemma 6

Assume that \(0<\sigma _{\mathrm {min}}<\sigma _{\mathrm {max}}<\infty \), where \(\sigma _{\mathrm {min}}\) and \(\sigma _{\mathrm {max}}\) were defined in Theorem 1. Then

$$\begin{aligned} \alpha ^*:=\alpha ^*\left( \sigma _{\mathrm {min}},\sigma _{\mathrm {max}} \right) :={\left\{ \begin{array}{ll} \frac{\sigma _{\mathrm {min}}\left( 4+2\sqrt{4-\sigma _{\mathrm {min}}^2}-\sigma _{\mathrm {min}}\sigma _{\mathrm {max}} \right) }{4+2\sqrt{4-\sigma _{\mathrm {min}}^2} -\sigma _{\mathrm {min}}^2}, &{} \sigma _{\mathrm {min}}< \frac{4}{\sigma _{\mathrm {max}}}\\ \sigma _{\mathrm {max}}- \sqrt{\sigma _{\mathrm {max}}^2-4}, &{} \sigma _{\mathrm {min}}\ge \frac{4}{\sigma _{\mathrm {max}}} \end{array}\right. } \end{aligned}$$

is such that \(\theta ^*\) and \(\alpha ^*\) satisfy conditions (38) and (39).


Clearly, since

$$\begin{aligned} \theta ^*\le {\left\{ \begin{array}{ll} \sigma _{\mathrm {min}}, &{} \sigma _{\mathrm {min}}<\sigma _{\mathrm {max}}\le 2 \\ \frac{4}{\sigma _{\mathrm {max}}}, &{} \sigma _{\mathrm {max}}>2 \end{array}\right. } \end{aligned}$$

we always have that \(0<\theta ^*<2\).

We continue by considering condition (39), and finding appropriate parameters which will give condition (38) automatically. Denoting by

$$\begin{aligned} f\left( \alpha ,\theta ,y \right) :=\theta ^2\left( y-\alpha \right) ^2-4\left( \theta -\alpha \right) \left( 2y-\theta -\alpha \right) \end{aligned}$$

for \(\left( \alpha ,\theta \right) \) that satisfy condition (38) and \(y\in \left[ \sigma _{\mathrm {min}},\sigma _{\mathrm {max}} \right] \), we find that for fixed \(\alpha \) and \(\theta \), f is an upward parabola in y whose non-positive part lies between its roots

$$\begin{aligned} y_{\pm }\left( \alpha ,\theta \right) :=\alpha + \frac{2\left( \theta -\alpha \right) }{\theta ^2}\left( 2\pm \sqrt{4-\theta ^2} \right) . \end{aligned}$$

Thus, condition (39) is satisfied if and only if

$$\begin{aligned} y_{-}\left( \alpha ,\theta \right) \le \sigma _{\mathrm {min}},\quad \text {and}\quad \sigma _{\mathrm {max}}\le y_{+}\left( \alpha ,\theta \right) . \end{aligned}$$

A simple calculation shows that for \(0<\theta <2\)

$$\begin{aligned} y_{-}\left( \alpha ,\theta \right)\le & {} \sigma _{\mathrm {min}}\;\;\Leftrightarrow \;\; \alpha \le \frac{\theta \left( 2\sqrt{4-\theta ^2}-\left( 4-\sigma _{\mathrm {min}}\theta \right) \right) }{2\sqrt{4-\theta ^2}-\left( 4-\theta ^2 \right) }=:\gamma _{\mathrm {min}}\left( \theta \right) , \\ \sigma _{\mathrm {max}}\le & {} y_{+}\left( \alpha ,\theta \right) \;\;\Leftrightarrow \;\; \alpha \le \frac{\theta \left( 2\sqrt{4-\theta ^2}+\left( 4-\sigma _{\mathrm {max}}\theta \right) \right) }{2\sqrt{4-\theta ^2}+\left( 4-\theta ^2 \right) }=:\gamma _{\mathrm {max}}\left( \theta \right) . \end{aligned}$$

This means that, if we choose \(\alpha \left( \theta \right) \) for a fixed \(\theta \) so that condition (39) is valid, we must have that

$$\begin{aligned} \alpha \left( \theta \right) \le \min \left( \gamma _{\mathrm {min}}\left( \theta \right) ,\gamma _{\mathrm {max}}\left( \theta \right) \right) . \end{aligned}$$

One can continue and show that (see Appendix B):

  1. (i)

    For \(\theta \le \sigma _{\mathrm {min}}\) and \(0<\theta <2\) we have that \(\gamma _{\mathrm {max}}\left( \theta \right) \le \gamma _{\mathrm {min}}\left( \theta \right) \).

  2. (ii)

    For \(\theta \le \frac{4}{\sigma _{\mathrm {max}}}\) and \(0<\theta <\sigma _{\mathrm {max}}\) we have that \(0<\gamma _{\mathrm {max}}\left( \theta \right) < \theta \).

With these observations we deduce that for any

$$\begin{aligned} \theta \in (0,\theta ^*]= \left( 0,\min \left( \sigma _{\mathrm {min}},\frac{4}{\sigma _{\mathrm {max}}} \right) \right] \cap (0,2) \end{aligned}$$

we have \(\theta <\sigma _{\mathrm {max}}\) and hence

$$\begin{aligned} \gamma _{\mathrm {max}}(\theta )=\min \left( \gamma _{\mathrm {min}}(\theta ), \gamma _{\mathrm {max}}(\theta ) \right) \quad \text {and} \quad \gamma _{\max }\left( \theta \right) <\theta . \end{aligned}$$

Hence, the pair \(\left( \theta ,\alpha =\gamma _{\mathrm {max}}(\theta ) \right) \) satisfies not only condition (39) but also

$$\begin{aligned} \gamma _{\mathrm {max}}(\theta )+\theta< 2\theta \le 2\theta ^*\le 2\sigma _{\mathrm {min}}\quad \text {and}\quad \gamma _{\mathrm {max}}\left( \theta \right) < \theta , \end{aligned}$$

i.e. condition (38). We conclude that \(\theta \) and \(\alpha =\gamma _{\mathrm {max}}\left( \theta \right) \) satisfy both desired conditions, for any \(\theta \in (0,\theta ^*]\).

Noticing that

$$\begin{aligned} \gamma _{\mathrm {max}}\left( \theta ^* \right) = \left\{ \begin{aligned}&\frac{\sigma _{\mathrm {min}}\left( 2\sqrt{4-\sigma _{\mathrm {min}}^2}+\left( 4-\sigma _{\mathrm {max}}\sigma _{\mathrm {min}} \right) \right) }{2\sqrt{4-\sigma _{\mathrm {min}}^2} +\left( 4-\sigma _{\mathrm {min}}^2 \right) },&\sigma _{\mathrm {min}}< \frac{4}{\sigma _{\mathrm {max}}} \\&\frac{\frac{8}{\sigma _{\mathrm {max}}}\sqrt{4-\frac{16}{\sigma _{\mathrm {max}}^2}}}{2\sqrt{4-\frac{16}{\sigma _{\mathrm {max}}^2}}+ 4-\frac{16}{\sigma _{\mathrm {max}}^2}},&\sigma _{\mathrm {min}}>\frac{4}{\sigma _{\mathrm {max}}} \end{aligned}\right\} =\alpha ^*\left( \sigma _{\mathrm {min}},\sigma _{\mathrm {max}} \right) , \end{aligned}$$

we conclude the proof. \(\square \)

Remark 4

The choice of \(\alpha ^*\left( \sigma _{\mathrm {min}},\sigma _{\mathrm {max}} \right) =\gamma _{\mathrm {max}}\left( \theta ^* \right) \) is not accidental. Indeed, one can easily show that

$$\begin{aligned} \frac{d}{d\theta }\gamma _{\mathrm {max}}\left( \theta \right) =\frac{8-2\sigma _{\mathrm {max}}\theta }{\left( 4-\theta ^2 \right) ^{\frac{3}{2}}}, \end{aligned}$$

and as such

$$\begin{aligned} \max _{\theta \in (0,\theta ^*]}\gamma _{\mathrm {max}}\left( \theta \right) = \gamma _{\mathrm {max}}\left( \theta ^* \right) . \end{aligned}$$

As the parameter \(\alpha ^*=\gamma _{\mathrm {max}}\left( \theta ^* \right) \) corresponds to the decay rate of our entropy according to Theorem 4, our choice of \(\alpha ^*\left( \sigma _{\mathrm {min}},\sigma _{\mathrm {max}} \right) \) was motivated by maximising the decay rate that is possible with our methodology.

We now posses all the tools which are required to prove part (b) of Theorem 1.

Proof of part (b) of Theorem 1

The convergence estimation for \(E_{\theta ^*}\left( u(t)-u_{\mathrm {avg}},v(t) \right) \) follows immediately from Theorem 4 and Lemma 6. To obtain (12) we use Lemma 3 in a similar fashion to the way we proved part (a). \(\square \)

Convergence to Equilibrium in a \(3-\)Velocity Goldstein-Taylor Model

The Goldstein-Taylor model can be thought of as a simplification of the BGK equation [1, 9]

$$\begin{aligned}&\partial _{t}f(x,v,t)+ v\cdot \nabla _x f(x,v,t)- \nabla _x V(x)\cdot \nabla _{v}f(x,v,t)\\&\qquad =M(v)\int f(x,v,t)dv - f(x,v,t), \end{aligned}$$

where the variable v is now in the discrete velocity space \(\left\{ v_1,\dots ,v_n\right\} \), the variable x is in the torus \({\mathbb {T}}\), and the potential V(x) is zero. The r.h.s. of the above BGK equation corresponds to a projection onto the Maxwellian M(v); in the discrete velocity case this Maxwellian is replaced by a constant matrix that determines the large time behaviour of the new model. Under the natural physical assumption of symmetry in the velocities (i.e. \(\sum _{i=1}^n v_i=0\)) and the expectation that the solutions will converge towards a state that is equally distributed in v and constant in xFootnote 4, we find one potential multi-velocity extension of the Goldstein-Taylor model on \({\mathbb {T}}\times (0,\infty )\):

$$\begin{aligned} \partial _{t}\begin{pmatrix} f_1(x,t) \\ \vdots \\ f_n(x,t) \end{pmatrix} +{\mathscr {V}} \begin{pmatrix} f_1(x,t) \\ \vdots \\ f_n(x,t) \end{pmatrix} = \sigma (x)\left( \begin{pmatrix} \frac{1}{n} \\ \vdots \\ \frac{1}{n} \end{pmatrix}\otimes \left( 1,\dots ,1 \right) -\mathbf{I} \right) \begin{pmatrix} f_1(x,t) \\ \vdots \\ f_n(x,t) \end{pmatrix}, \end{aligned}$$

with the the diagonal matrix \({\mathscr {V}}:=\text{ diag }[v_1,\dots ,v_n]\), and the discrete velocities

$$\begin{aligned} \left\{ v_1,\dots ,v_n\right\} ={\left\{ \begin{array}{ll} \left\{ -k+\frac{1}{2},\dots ,-\frac{1}{2},\frac{1}{2},\dots , k-\frac{1}{2}\right\} , &{} n=2k \\ \left\{ -k,\dots , -1,0,1,\dots , k \right\} ,&{} n=2k-1\end{array}\right. },\quad n\in {\mathbb {N}},\; n\ge 2. \end{aligned}$$

The matrix on the r.h.s. of (44) takes the form

$$\begin{aligned} {\varvec{Q}}=\frac{1}{n}\begin{pmatrix} 1-n &{} 1 &{} \dots &{} 1 \\ 1 &{} 1-n &{} \dots &{} 1 \\ \vdots &{} \vdots &{} \vdots &{} \vdots \\ 1 &{} 1 &{} \dots &{} 1-n \end{pmatrix} \end{aligned}$$

which has \(\left( 1,1,\dots , 1 \right) ^T\) in its kernel, and \({\mathscr {A}}=\left\{ \left( \xi _1,\dots , \xi _n \right) ^T\in {\mathbb {R}}^n\;|\; \sum _{i=1}^n \xi _i=0\right\} \) as its \(n-1\) dimensional eigenspace corresponding to the eigenvalue \(\lambda =-1\). This corresponds to the conservation of total mass, and the fact that differences such as \(\left\{ f_i-f_j\right\} _{i,j=1,\dots , n}\) converge to zero. For more information we refer the interested reader to [1].

In this section we will consider a simple \(3-\)velocity Goldstein-Taylor model, which is governed by the following system of equations on \({\mathbb {T}}\times (0,\infty )\)

$$\begin{aligned} \begin{aligned}&\partial _t f_1(x,t) + \partial _x f_1(x,t)=\frac{\sigma (x)}{3}\left( f_2(x,t)+f_3(x,t)-2f_1(x,t) \right) ,\\&\partial _t f_2(x,t) =\frac{\sigma (x)}{3}\left( f_1(x,t)+f_3(x,t)-2f_2(x,t) \right) ,\\&\partial _t f_3(x,t) - \partial _x f_3(x,t)=\frac{\sigma (x)}{3}\left( f_1(x,t)+f_2(x,t)-2f_3(x,t) \right) . \end{aligned} \end{aligned}$$

Much like our Goldstein-Taylor equation, (1), we can recast the above with the variables

$$\begin{aligned} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}=\begin{pmatrix} \frac{1}{\sqrt{3}} &{} \frac{1}{\sqrt{3}} &{} \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{2}} &{} 0 &{} -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{6}} &{} -\frac{2}{\sqrt{6}} &{} \frac{1}{\sqrt{6}} \end{pmatrix} \begin{pmatrix} f_1 \\ f_2 \\ f_3 \end{pmatrix}\ , \end{aligned}$$

which yields the following set of equations:

$$\begin{aligned} \begin{aligned}&\partial _t u_1(x,t) + \sqrt{\frac{2}{3}}\partial _x u_2(x,t)=0,\\&\partial _t u_2(x,t) +\sqrt{\frac{2}{3}} \partial _x u_1(x,t)+\frac{1}{\sqrt{3}}\partial _x u_3(x,t)=-\sigma (x) u_2(x,t),\\&\partial _t u_3(x,t) + \frac{1}{\sqrt{3}}\partial _x u_2(x,t)= - \sigma (x) u_3(x). \end{aligned} \end{aligned}$$

The orthogonal transformation (46) has a strong geometrical reasoning behind it, as it diagonalises the appropriate “interaction matrix”, \({\varvec{Q}}\). It is also worth to mention that much like (3), this transformations brings us to the macroscopic variables. Indeed, up to some scaling \(u_1\) is the mass, \(u_2\) is the flux, and \(u_3\) is a linear combination of the kinetic energy and the mass.

Following our intuition we expect that by denoting

$$\begin{aligned} u_\infty := \frac{1}{2\sqrt{3}\pi }\int _{{\mathbb {T}}}\left( f_{1,0}(x)+f_{2,0}(x)+f_{3,0}(x) \right) dx, \end{aligned}$$

we will find that

$$\begin{aligned} u_1(t,x){\mathop {\longrightarrow }\limits ^{t\rightarrow \infty }} u_\infty ,\quad u_2(t,x){\mathop {\longrightarrow }\limits ^{t\rightarrow \infty }} 0,\quad u_3(t,x){\mathop {\longrightarrow }\limits ^{t\rightarrow \infty }} 0. \end{aligned}$$

To prove this result we shall introduce an appropriate Lyapunov functional. To find this functional, we have two options, even for the simple case of constant \(\sigma \) (which is our base case): Proceeding as in § 4.2, we could use a modal decomposition of (47) and the (optimal) positive definite matrices \({\mathbf {P}}_k\) to construct an entropy functional with sharp decay, and then rewrite it in physical space, using pseudo-differential operators. This construction, which is analogous to the construction of \(E_\theta (f,g)\) from (4), can become extremely cumbersome in dimension 3 and higher.

As a simpler alternative we shall hence rather follow the strategy from [1, §4.3] and [2, §2.3]: In Fourier space, the system matrix of (47) reads as

$$\begin{aligned} \mathbf{C}_k=\begin{pmatrix} 0 &{} \sqrt{\frac{2}{3}}ik &{} 0 \\ \sqrt{\frac{2}{3}}ik &{} \sigma &{} {\frac{1}{\sqrt{3}}}ik \\ 0 &{} {\frac{1}{\sqrt{3}}}ik &{} 0 \end{pmatrix}\ . \end{aligned}$$

We note that, for \(k\ne 0\), the hypocoercivity indexFootnote 5 of \(\mathbf{C}_k\), as well as of (47) is one, since this index is always bounded from above by the kernel dimension of the symmetric part of the generator, cf. [2]. For such index-1 problems, Theorem 2.6 from [2] shows that the choice

$$\begin{aligned} {\mathbf {P}}_k=\begin{pmatrix} 1 &{} \frac{\lambda }{ik} &{} 0 \\ -\frac{\lambda }{ik} &{} 1 &{} 0 \\ 0 &{} 0 &{} 1 \end{pmatrix}\ \quad k\ne 0, \end{aligned}$$

with an appropriate \(\lambda \in {\mathbb {R}}\), always yields a (simple) Lyapunov functional for (47), typically with a sub-optimal decay rate. Much like in § 5, this guides us to the definition of our functional, expressed in the following theorem:

Theorem 5

Let \(u_1,u_2,u_3\in C([0,\infty );L^2\left( {\mathbb {T}} \right) )\) be mild real valued solutions to (47) with initial datum \(u_{1,0},u_{2,0},u_{3,0}\in L^2\left( {\mathbb {T}} \right) \). Denoting by

$$\begin{aligned} {\mathfrak {E}}_{\theta }\left( f,g,h \right) :=\left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2+\left\Vert h\right\Vert ^2 - \frac{\theta }{2\pi }\int _{0}^{2\pi }\left( \partial _x^{-1}f(x) \,{g(x)} \right) dx, \end{aligned}$$

we have that

$$\begin{aligned} {\mathfrak {E}}_{\theta }\left( u_1(t)-u_{\infty },u_2(t),u_3(t) \right) \le {\mathfrak {E}}_{\theta }\left( u_{1,0}-u_{\infty },u_{2,0},u_{3,0} \right) e^{-\alpha t}\ ,\quad t\ge 0, \end{aligned}$$

for any \(\theta >0\) and \(\alpha >0\) such that

$$\begin{aligned} \sqrt{\frac{2}{3}}\theta +\alpha < 2\sigma _{\mathrm {min}}, \quad \alpha \le \sqrt{\frac{2}{3}}\theta , \end{aligned}$$


$$\begin{aligned} \left( \sup _{x\in {\mathbb {T}}}\frac{\theta ^2\left( \sigma (x)-\alpha \right) ^2}{8\sigma (x) -4\sqrt{\frac{2}{3}}\theta -4\alpha } \right) +\left( \sup _{x\in {\mathbb {T}}}\frac{\theta ^2}{12\left( 2\sigma (x)-\alpha \right) } \right) \le \sqrt{\frac{2}{3}}\theta -\alpha . \end{aligned}$$

Remark 5

For \(0<\theta <2\), \({\mathfrak {E}}_{\theta }\left( f,g,h \right) \) is equivalent to \(\left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2+\left\Vert h\right\Vert ^2\). Indeed, following Lemma 3 we see that

$$\begin{aligned} \left( 1-\frac{\left|\theta \right|}{2} \right) \left( \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) +\left\Vert h\right\Vert ^2\le {\mathfrak {E}}_{\theta }\left( f,g,h \right) \le \left( 1+\frac{\left|\theta \right|}{2} \right) \left( \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) +\left\Vert h\right\Vert ^2. \end{aligned}$$

Proof of Theorem 5

We start by noticing that the transformation

$$\begin{aligned} u_1\rightarrow u_1-u_\infty ,\quad u_2\rightarrow u_2,\quad u_3\rightarrow u_3 \end{aligned}$$

keeps (47) invariant, so we may assume, without loss of generality, that \(u_\infty =0\). This, together with the equation for \(u_1(x,t)\) implies that

$$\begin{aligned} \left( u_1(t) \right) _{\mathrm {avg}}=\left( u_{1,0} \right) _{\mathrm {avg}}=u_{\infty }=0. \end{aligned}$$

Next, we compute the time derivatives of the \(L^2\) norms and obtain:

$$\begin{aligned} \begin{aligned} \frac{d}{dt}\left( \left\Vert u_1(t)\right\Vert ^2 + \left\Vert u_2(t)\right\Vert ^2+\left\Vert u_3(t)^2\right\Vert \right) =&-\frac{1}{\pi }\int _0^{2\pi }\sigma (x)u_2(x,t)^2dx\\&-\frac{1}{\pi }\int _0^{2\pi }\sigma (x)u_3(x,t)^2dx. \end{aligned} \end{aligned}$$

Continuing, we see that

$$\begin{aligned} \begin{aligned} \frac{d}{dt}\int _{0}^{2\pi } \partial _x^{-1}u_1(x,t) u_2(x,t)dx&=2\pi \sqrt{\frac{2}{3}}\left( \left( u_2(t)_{\mathrm {avg}} \right) ^2-\left\Vert u_2(t)\right\Vert ^2 \right) + \frac{2\sqrt{2}\pi }{\sqrt{3}}\left\Vert u_1(t)\right\Vert ^2 \\&+\frac{1}{\sqrt{3}}\int _{0}^{2\pi } u_1(x,t)u_3(x,t)dx\\&-\int _{0}^{2\pi } \sigma (x)\partial _x^{-1}u_1(x,t) u_2(x,t)dx, \end{aligned} \end{aligned}$$

where we used Lemma 2. As such, together with (51), we conclude that

$$\begin{aligned} \begin{aligned}&\frac{d}{dt}{\mathfrak {E}}_{\theta }\left( u_1(t),u_2(t),u_3(t) \right) \\&\quad = -\frac{1}{2\pi }\int _0^{2\pi }\left( 2\sigma (x)-\sqrt{\frac{2}{3}}\theta \right) u_2(x,t)^2dx\\&\qquad -\frac{1}{\pi }\int _0^{2\pi }\sigma (x)u_3(x,t)^2dx-\sqrt{\frac{2}{3}} \theta \left\Vert u_1(t)\right\Vert ^2-\sqrt{\frac{2}{3}}\theta \left( u_2(t)_{\mathrm {avg}} \right) ^2\\&\qquad -\frac{\theta }{2\sqrt{3}\pi }\int _{0}^{2\pi } u_1(x,t)u_3(x,t)dx+\frac{\theta }{2\pi }\int _{0}^{2\pi } \sigma (x)\partial _x^{-1}u_1(x,t) u_2(x,t)dx. \end{aligned} \end{aligned}$$


$$\begin{aligned} \frac{d}{dt}{\mathfrak {E}}_{\theta }\left( u_1(t),u_2(t),u_3(t) \right) =-\alpha {\mathfrak {E}}_{\theta }\left( u_1(t),u_2(t),u_3(t) \right) +R_{\theta ,\alpha ,\sigma }(t) \end{aligned}$$


$$\begin{aligned} \begin{aligned} R_{\theta ,\alpha ,\sigma }(t):=&-\frac{1}{2\pi }\int _0^{2\pi }\left( 2\sigma (x) -\sqrt{\frac{2}{3}}\theta -\alpha \right) u_2(x,t)^2dx\\&-\frac{1}{2\pi }\int _0^{2\pi }\left( 2\sigma (x)-\alpha \right) u_3(x,t)^2dx -\left( \sqrt{\frac{2}{3}}\theta -\alpha \right) \left\Vert u_1(t)\right\Vert ^2\\&-\sqrt{\frac{2}{3}}\theta \left( u_2(t)_{\mathrm {avg}} \right) ^2-\frac{\theta }{2\sqrt{3}\pi }\int _{0}^{2\pi } u_1(x,t)u_3(x,t)dx\\&+\frac{\theta }{2\pi }\int _{0}^{2\pi } \left( \sigma (x)-\alpha \right) \partial _x^{-1}u_1(x,t) u_2(x,t)dx. \end{aligned} \end{aligned}$$

To conclude the proof it is enough to show that under conditions (49) and (50) we have that \(R_{\theta ,\alpha ,\sigma }(t) \le 0\). We will, in fact, show the stronger statement:

$$\begin{aligned} \begin{aligned}&\left|-\frac{\theta }{2\sqrt{3}\pi }\int _{0}^{2\pi } u_1(x,t)u_3(x,t)dx+\frac{\theta }{2\pi }\int _{0}^{2\pi } \left( \sigma (x)-\alpha \right) \partial _x^{-1}u_1(x,t) u_2(x,t)dx\right|\\&\quad \le \frac{1}{2\pi }\int _0^{2\pi }\left( 2\sigma (x)-\sqrt{\frac{2}{3}} \theta -\alpha \right) u_2(x,t)^2dx\\&\qquad +\frac{1}{2\pi }\int _0^{2\pi }\left( 2\sigma (x)-\alpha \right) u_3(x,t)^2dx +\left( \sqrt{\frac{2}{3}}\theta -\alpha \right) \left\Vert u_1(t)\right\Vert ^2. \end{aligned} \end{aligned}$$

Similarly to the techniques we have used in the proof of part (b) of Theorem 1, and using the positivity of the coefficients in the last two terms (which follows from (49)), we see that

$$\begin{aligned}&\left|\frac{\theta }{2\pi }\int _{0}^{2\pi } \left( \sigma (x)-\alpha \right) \partial _x^{-1}u_1(x,t) u_2(x,t)dx\right| \\&\quad \le \frac{\theta }{2\pi }\int _{0}^{2\pi }\frac{\left|\sigma (x) -\alpha \right|}{\sqrt{2\sigma (x)-\sqrt{\frac{2}{3}}\theta -\alpha }}\left|\partial _x^{-1}u_1(x,t)\right| \cdot \sqrt{2\sigma (x)-\sqrt{\frac{2}{3}}\theta -\alpha }\left|u_2(x,t)\right|dx \\&\quad \le \left( \sup _{x\in {\mathbb {T}}}\frac{\theta ^2\left( \sigma (x)-\alpha \right) ^2}{8\sigma (x) -4\sqrt{\frac{2}{3}}\theta -4\alpha } \right) \left\Vert u_1(t)\right\Vert ^2+ \frac{1}{2\pi }\int _0^{2\pi }\left( 2\sigma (x)-\sqrt{\frac{2}{3}}\theta -\alpha \right) u_2(x,t)^2dx, \end{aligned}$$

and that

$$\begin{aligned}&\left|\frac{\theta }{2\sqrt{3}\pi }\int _{0}^{2\pi } u_1(x,t)u_3(x,t)dx\right| \le \frac{\theta }{2\pi }\int _{0}^{2\pi } \frac{\left|u_1(x,t)\right|}{\sqrt{6\sigma (x)-3\alpha }}\sqrt{2\sigma (x)-\alpha } \left|u_3(x,t)\right|dx \\&\quad \le \left( \sup _{x\in {\mathbb {T}}}\frac{\theta ^2}{12\left( 2\sigma (x)-\alpha \right) } \right) \left\Vert u_1(t)\right\Vert ^2+\frac{1}{2\pi }\int _0^{2\pi }\left( 2\sigma (x)-\alpha \right) u_3(x,t)^2dx. \end{aligned}$$

Thus, one sees that (54) holds when

$$\begin{aligned} \left( \sup _{x\in {\mathbb {T}}}\frac{\theta ^2\left( \sigma (x)-\alpha \right) ^2}{8\sigma (x) -4\sqrt{\frac{2}{3}}\theta -4\alpha } \right) +\left( \sup _{x\in {\mathbb {T}}}\frac{\theta ^2}{12\left( 2\sigma (x)-\alpha \right) } \right) \le \sqrt{\frac{2}{3}}\theta -\alpha , \end{aligned}$$

which is (50). The proof is complete. \(\square \)

While we have elected not to optimise the choice of \(\alpha \) (as in §5), we can still infer the following, simpler yet far from optimal, corollary:

Corollary 1

Let \(\theta >0\) and \(\alpha >0\) be such that

$$\begin{aligned} \sqrt{\frac{2}{3}}\theta +\alpha < 2\sigma _{\mathrm {min}}, \quad \alpha \le \sqrt{\frac{2}{3}}\theta \end{aligned}$$


$$\begin{aligned} \frac{\theta ^2\sigma _{\mathrm {max}}^2}{8\sigma _{\mathrm {min}}-4\sqrt{\frac{2}{3}}\theta -4\alpha } +\frac{\theta ^2}{12\left( 2\sigma _{\mathrm {min}}-\alpha \right) } \le \sqrt{\frac{2}{3}}\theta -\alpha . \end{aligned}$$


$$\begin{aligned} {\mathfrak {E}}_{\theta }\left( u_1(t)-u_{\infty },u_2(t),u_3(t) \right) \le {\mathfrak {E}}_{\theta }\left( u_{1,0}-u_{\infty },u_{2,0},u_{3,0} \right) e^{-\alpha t}. \end{aligned}$$

In particular, for

$$\begin{aligned} \alpha := \min \left( \frac{\sigma _{\mathrm {min}}}{2},\frac{3\sigma _{\mathrm {min}}}{9\sigma _{\mathrm {max}}^2+1} \right) \end{aligned}$$

we have that \({\mathfrak {E}}_{\sqrt{6}\alpha }\) decays exponentially to zero with rate \(\alpha \).


Since \(\alpha <2\sigma _{\mathrm {min}}\le \sigma _{\mathrm {max}}+\sigma _{\mathrm {min}}\) we see that

$$\begin{aligned} \alpha - \sigma _{\mathrm {max}}< \sigma _{\mathrm {min}}\le \sigma (x) <\sigma _{\mathrm {max}}+ \alpha , \end{aligned}$$

implying that \(\left( \sigma (x)-\alpha \right) ^2 \le \sigma _{\mathrm {max}}^2\) for any \(x\in {\mathbb {T}}\). Using this with additional elementary estimation on the denominator of the expressions that appear in (50), we see that (49) and (50) are valid. As such the first statement of the corollary follows from Theorem 5.

To show the second part of the corollary we notice that with the choice \(\theta _{\alpha }:=\sqrt{6}\alpha \) and \(\alpha \le \frac{\sigma _{\mathrm {min}}}{2}\)

$$\begin{aligned} \sqrt{\frac{2}{3}}\theta _\alpha +\alpha =3\alpha < 2\sigma _{\mathrm {min}},\quad \alpha \le 2\alpha =\sqrt{\frac{2}{3}}\theta _{\alpha }, \end{aligned}$$

giving us (49). Using the inequalities

$$\begin{aligned} 8\sigma _{\mathrm {min}}-4\sqrt{\frac{2}{3}}\theta _\alpha -4\alpha \ge 2\sigma _{\mathrm {min}},\quad \text {and}\quad 2\sigma _{\mathrm {min}}-\alpha \ge \frac{3}{2}\sigma _{\mathrm {min}}\end{aligned}$$

for the l.h.s. of (55), we see that

$$\begin{aligned} \frac{\theta _\alpha ^2\sigma _{\mathrm {max}}^2}{8\sigma _{\mathrm {min}}-4\sqrt{\frac{2}{3}}\theta _\alpha -4\alpha }+\frac{\theta _\alpha ^2}{12\left( 2\sigma _{\mathrm {min}}-\alpha \right) } \le \left( 9\sigma _{\mathrm {max}}^2+1 \right) \frac{\alpha ^2}{3\sigma _{\mathrm {min}}}. \end{aligned}$$

Thus, since \(\sqrt{\frac{2}{3}}\theta _\alpha -\alpha =\alpha \), the desired condition (55) is valid when

$$\begin{aligned} \alpha \le \frac{3\sigma _{\mathrm {min}}}{9\sigma _{\mathrm {max}}^2+1}, \end{aligned}$$

which concludes the proof. \(\square \)


  1. 1.

    At least compared to the \(H^1\)-result in [8].

  2. 2.

    We use mild solution in the termonology of semigroup theory [20].

  3. 3.

    In a sense, the same computation that shows this inequality is embedded in the proof of the exponential decay of \(E_\theta \) in the next subsection.

  4. 4.

    If one wants to approximate the BGK equation with a Maxwellian relaxation function, then the column vector \((\frac{1}{n},\dots ,\frac{1}{n})^T\) inside the relaxation matrix would have to be replaced by a discrete Maxwellian, as was done [2, §4.2].

  5. 5.

    This index characterises the degree of degeneracy of ODE or PDE-evolution equations, cf. [2].

  6. 6.

    Note that by definition, and by Lemma 1, \(C_\omega \le \sqrt{\left\Vert \omega \right\Vert _{\infty }}\), and as such we will automatically get an improvement to condition (39).

  7. 7.

    This process was dealt with numerically.

  8. 8.

    at least for \(H^1\)-initial data


  1. 1.

    Achleitner, F., Arnold, A., Carlen, E.A.: On linear hypocoercive BGK models. In: From particle systems to partial differential equations. III, Springer Proc. Math. Stat., vol. 162, pp. 1–37. Springer, [Cham] (2016).

  2. 2.

    Achleitner, F., Arnold, A., Carlen, E.A.: On multi-dimensional hypocoercive BGK models. Kinet. Relat. Models 11(4), 953–1009 (2018).

    MathSciNet  Article  MATH  Google Scholar 

  3. 3.

    Achleitner, F., Arnold, A., Signorello, B.: On optimal decay estimates for ODEs and PDEs with modal decomposition. In: Stochastic dynamics out of equilibrium, Springer Proc. Math. Stat., vol. 282, pp. 241–264. Springer, Cham (2019).

  4. 4.

    Albi, G., Herty, M., Jörres, C., Pareschi, L.: Asymptotic preserving time-discretization of optimal control problems for the Goldstein-Taylor model. Numer. Methods Partial Differ. Equ. 30(6), 1770–1784 (2014).

    MathSciNet  Article  MATH  Google Scholar 

  5. 5.

    Arnold, A., Carrillo, J.A., Tidriri, M.D.: Large-time behavior of discrete kinetic equations with non-symmetric interactions. Math. Models Methods Appl. Sci. 12(11), 1555–1564 (2002).

    MathSciNet  Article  MATH  Google Scholar 

  6. 6.

    Arnold, A., Erb, J.: Sharp entropy decay for hypocoercive and non-symmetric fokker–planck equations with linear drift (2014). arXiv:1409.5425

  7. 7.

    Arnold, A., Jin, S., Wöhrer, T.: Sharp decay estimates in local sensitivity analysis for evolution equations with uncertainties: from ODEs to linear kinetic equations. J. Differ. Equ. 268(3), 1156–1204 (2020).

    ADS  MathSciNet  Article  MATH  Google Scholar 

  8. 8.

    Bernard, E., Salvarani, F.: Optimal estimate of the spectral gap for the degenerate Goldstein-Taylor model. J. Stat. Phys. 153(2), 363–375 (2013). Erratum to appear (2020)

  9. 9.

    Bhatnagar, P.L., Gross, E.P., Krook, M.: A model for collision processes in gases. i. small amplitude processes in charged and neutral one-component systems. Phys. Rev. 94, 511–525 (1954).

    ADS  Article  MATH  Google Scholar 

  10. 10.

    Cercignani, C., Illner, R., Shinbrot, M.: A boundary value problem for discrete-velocity models. Duke Math. J. 55(4), 889–900 (1987).

    MathSciNet  Article  MATH  Google Scholar 

  11. 11.

    Cox, S., Zuazua, E.: The rate at which energy decays in a damped string. Comm. Partial Differ. Equ. 19(1–2), 213–243 (1994).

    MathSciNet  Article  MATH  Google Scholar 

  12. 12.

    Desvillettes, L., Salvarani, F.: Asymptotic behavior of degenerate linear transport equations. Bull. Sci. Math. 133(8), 848–858 (2009).

    MathSciNet  Article  MATH  Google Scholar 

  13. 13.

    Dolbeault, J., Mouhot, C., Schmeiser, C.: Hypocoercivity for linear kinetic equations conserving mass. Trans. Am. Math. Soc. 367(6), 3807–3828 (2015).

    MathSciNet  Article  MATH  Google Scholar 

  14. 14.

    Evans, L.C.: Partial differential equations, Graduate Studies in Mathematics, vol. 19, second edn. American Mathematical Society, Providence, RI (2010).

  15. 15.

    Goldstein, S.: On diffusion by discontinuous movements, and on the telegraph equation. Quart. J. Mech. Appl. Math. 4, 129–156 (1951).

    MathSciNet  Article  MATH  Google Scholar 

  16. 16.

    Gosse, L., Toscani, G.: An asymptotic-preserving well-balanced scheme for the hyperbolic heat equations. C. R. Math. Acad. Sci. Paris 334(4), 337–342 (2002).

    MathSciNet  Article  MATH  Google Scholar 

  17. 17.

    Jin, S.: Efficient asymptotic-preserving (AP) schemes for some multiscale kinetic equations. SIAM J. Sci. Comput. 21(2), 441–454 (1999).

    MathSciNet  Article  MATH  Google Scholar 

  18. 18.

    Kawashima, S.: Existence and stability of stationary solutions to the discrete Boltzmann equation. Japan J. Indust. Appl. Math. 8(3), 389–429 (1991).

    MathSciNet  Article  MATH  Google Scholar 

  19. 19.

    Lebeau, G.: Équations des ondes amorties. In: Séminaire sur les Équations aux Dérivées Partielles, 1993–1994, pp. Exp. No. XV, 16. École Polytech., Palaiseau (1994)

  20. 20.

    Pazy, A.: Semigroups of linear operators and applications to partial differential equations, Applied Mathematical Sciences, vol. 44. Springer-Verlag, New York (1983).

  21. 21.

    Salvarani, F.: Diffusion limits for the initial-boundary value problem of the Goldstein-Taylor model. Rend. Sem. Mat. Univ. Politec. Torino 57(3), 209–220 (2002) (1999)

  22. 22.

    Taylor, G.I.: Diffusion by Continuous Movements. Proc. London Math. Soc. (2) 20(3), 196–212 (1921).

    MathSciNet  Article  MATH  Google Scholar 

  23. 23.

    Tran, M.B.: Convergence to equilibrium of some kinetic models. J. Differential Equations 255(3), 405–440 (2013).

    ADS  MathSciNet  Article  MATH  Google Scholar 

  24. 24.

    Villani, C.: Hypocoercivity. Mem. Amer. Math. Soc. 202(950), iv+141 (2009).

    MathSciNet  Article  MATH  Google Scholar 

Download references


Open Access funding provided by University of Graz.

Author information



Corresponding author

Correspondence to Amit Einav.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors gratefully acknowledge the support of the Hausdorff Research Institute for Mathematics (Bonn), through the Junior Trimester Program on Kinetic Theory. The first and third authors were partially supported by the FWF (Austrian Science Fund) funded SFB #F65, the third and forth authors by the FWF-doctoral school W 1245.

Communicated by Eric A. Carlen.


Appendix A: Lack of Optimality

In this appendix we will briefly discuss the lack of optimality of our decay rate for non-homogeneous \(\sigma (x)\) in comparison to that given in [8]. We will even go one step further and show how one can improve our general methodology in simple cases, though even this improvement will fall short of the optimal convergence rate.

As one simple example we will explore the following relaxation function:

$$\begin{aligned} \sigma (x):={\left\{ \begin{array}{ll} 1, &{} 0< x \le \pi \\ 4, &{} \pi <x\le 2\pi \end{array}\right. }, \end{aligned}$$

which is motivated by the fact that for this function \(\sigma _{\mathrm {min}}=\frac{\sigma _{\mathrm {max}}}{4}\), and so the choice of \(\theta ^*=1\) in our main Theorem 1 (b) comes “from both directions”.

Before we start with a more structured discussion, we would like to explain how one can improve the technique we developed in §5. A crucial point in the investigation of the behaviour of \(E_{\theta ^*}\) was to find, and close, a linear differential inequality for this entropy, as can be seen in the proof of Theorem 4. One of the final steps in this proof, appearing in (43), was to combine Poincaré inequality with an \(L^\infty \) estimation on the mixed term of \(\partial _x^{-1}\left( u-u_{\mathrm {avg}} \right) \) and v, to show the non-positivity of an appropriate “remainder”. The use of these two inequalities is somewhat crude (yet due to that, quite general), and one can imagine that replacing these two estimations with an inequality that is more \(L^2\) based would improve the range of validity of the theorem. One idea that comes to mind is a weighted Poincaré inequality, i.e. an inequality of the form

$$\begin{aligned} \int _{0}^{2\pi } \left( f(x)-f_{\mathrm {avg}} \right) ^2 \omega (x)dx \le C^2\int _{0}^{2\pi } \left( f^\prime (x) \right) ^2dx. \end{aligned}$$

for a given weight \(\omega (x)\ge 0\) and constant C. Denoting byFootnote 6

$$\begin{aligned} C_{\omega }:=\inf \left\{ C>0\;\Big |\;\int _{0}^{2\pi } \left( f(x)-f_{\mathrm {avg}} \right) ^2 \omega (x)dx \le C^2\int _{0}^{2\pi } \left( f^\prime (x) \right) ^2dx,\quad \forall f\in H^1({\mathbb {T}})\right\} , \end{aligned}$$

we can replace condition (39) of Theorem 4 with the improved condition

$$\begin{aligned} \frac{\theta ^2}{4}C^2_{\omega } \le \theta -\alpha ,\quad \text {where}\quad \omega (x)=\frac{\left( \sigma (x)-\alpha \right) ^2}{2\sigma (x)-\theta -\alpha }. \end{aligned}$$

(58) will be explicitly derived in §1.

From this point onwards the appendix will proceed as follows: First we will show how one can find the optimal weighted Poincaré constant, and compute it in some simple cases, which we will then use in the case were \(\sigma (x)\) is given by (56) to obtain an improvement of our current rate of convergence to equilibrium. Next we will compute the optimal rate given by [8], and conclude a lack of optimality by comparing the rate we achieved in our main theorem, the improved rate we have found, and the optimal rate of [8].

Weighted Poincaré Inequality

The problem of finding a weighted Poincaré inequality and its associated sharp constant can be recast as a constrained variational problem. We define the functional

$$\begin{aligned} {\mathscr {F}}:\;\;\;\;{\mathscr {D}}:=H^1\left( {\mathbb {T}} \right) \rightarrow {\mathbb {R}}, \end{aligned}$$

where \(H^1\left( {\mathbb {T}} \right) \) is the Sobolev space of real valued periodic functions, by

$$\begin{aligned} {\mathscr {F}}(u):=\int _0^{2\pi }\left( u^\prime (x) \right) ^2dx, \end{aligned}$$

and denote by

$$\begin{aligned} c_{\mathrm {min}}:=\inf \left\{ {\mathscr {F}}(u)\;\Big |\;u\in {\mathscr {D}},\;\int _{0}^{2\pi }u(x)^2 \omega (x)dx=1,\;\int _0^{2\pi }u(x)dx=0\right\} . \end{aligned}$$

Even though the minimization set is not convex, standard techniques from Calculus of Variation (see for instance [14, Sect. 8]) show that if \(\omega \) is bounded then the infimum is attained (the conditions on \(\omega \) can be weakened).

One can easily check that in that case

$$\begin{aligned} C^2_{\omega }=\frac{1}{c_{\mathrm {min}}}. \end{aligned}$$

Finding a minimiser to the problem (59) amounts to solving the following constrained Euler-Lagrange equation on \({\mathbb {T}}\)

$$\begin{aligned} u^{\prime \prime }(x)+\lambda u(x)\omega (x)-\tau =0, \end{aligned}$$

considered in weak form, with two Lagrange multipliers \(\lambda >0\) and \(\tau \in {\mathbb {R}}\). Integrating (60) against u shows that

$$\begin{aligned} \lambda ={\mathscr {F}}(u), \end{aligned}$$

which we will use shortly.

Since \(\omega \in L^\infty ({\mathbb {T}})\), we find that \(u\in H^2({\mathbb {T}})\hookrightarrow C^1({\mathbb {T}})\). When \(\omega \) is piecewise constant, the ODE (60), now in strong form, can be solved explicitly. This shows that in these cases the minimiser of \({\mathscr {F}}\) is actually unique.

As we shall see in §1 below, the relevant weight functions we require for our improved study are closely related to \(\sigma (x)\). With (56) in mind, we shall consider weights of the form:

$$\begin{aligned} \omega (x):={\left\{ \begin{array}{ll} \omega _1, &{} 0< x\le \pi \\ \omega _2, &{} \pi <x\le 2\pi \end{array}\right. }. \end{aligned}$$

Hence, the solution to the Euler-Lagrange equation (60) is given by

$$\begin{aligned} \begin{aligned}&u(x)={\left\{ \begin{array}{ll} c_1\sin \left( \sqrt{\lambda \omega _1}x \right) +c_2\cos \left( \sqrt{\lambda \omega _1}x \right) + \frac{\tau }{\lambda \omega _1}, &{} 0<x<\pi \\ c_3\sin \left( \sqrt{\lambda \omega _2}x \right) +c_4\cos \left( \sqrt{\lambda \omega _2}x \right) + \frac{\tau }{\lambda \omega _2},&{} \pi<x<2\pi \end{array}\right. }\\&\quad =:{\left\{ \begin{array}{ll} u_1(x), &{} 0<x<\pi \\ u_2(x), &{} \pi<x<2\pi \end{array}\right. }, \end{aligned} \end{aligned}$$

and it satisfies the following \(C^1\)-matching conditions and constraints:

$$\begin{aligned} \begin{aligned}&u_1(0)=u_2\left( 2\pi \right) ,\\&u_1(\pi )=u_2(\pi ),\\&u_1^\prime (0)=u_2^\prime (2\pi ),\\&u_1^\prime (\pi )=u_2^\prime (\pi ),\\&\int _{0}^{\pi } u_1(x)dx+\int _{\pi }^{2\pi } u_2(x)dx=0,\\&\int _{0}^{\pi } \omega _1u_1(x)^2dx+\int _{\pi }^{2\pi } \omega _2 u_2^2(x)dx=1. \end{aligned} \end{aligned}$$

The first five equations correspond to the linear set of equations:

$$\begin{aligned} {\varvec{M}}(\lambda )\begin{pmatrix}c_1 \\ c_2 \\ c_3 \\ c_4 \\ \tau \end{pmatrix}=\begin{pmatrix}0\\ 0\\ 0\\ 0 \\ 0\end{pmatrix}\ ,\end{aligned}$$

where the matrix \({\varvec{M}}\left( \lambda \right) \) is

$$\begin{aligned} \tiny \begin{pmatrix} 0 &{} 1 &{} -\sin \left( 2\pi \sqrt{\lambda \omega _2} \right) &{} -\cos \left( 2\pi \sqrt{\lambda \omega _2} \right) &{} \frac{\omega _2-\omega _1}{\lambda \omega _1\omega _2} \\ \sin \left( \pi \sqrt{\lambda \omega _1} \right) &{} \cos \left( \pi \sqrt{\lambda \omega _1} \right) &{} -\sin \left( \pi \sqrt{\lambda \omega _2} \right) &{} -\cos \left( \pi \sqrt{\lambda \omega _2} \right) &{} \frac{\omega _2-\omega _1}{\lambda \omega _1\omega _2} \\ \sqrt{\omega _1} &{} 0 &{} -\sqrt{\omega _2}\cos \left( 2\pi \sqrt{\lambda \omega _2} \right) &{} \sqrt{\omega _2}\sin \left( 2\pi \sqrt{\lambda \omega _2} \right) &{} 0\\ \sqrt{\omega _1}\cos \left( \pi \sqrt{\lambda \omega _1} \right) &{} -\sqrt{\omega _1}\sin \left( \pi \sqrt{\lambda \omega _1} \right) &{} -\sqrt{\omega _2}\cos \left( \pi \sqrt{\lambda \omega _2} \right) &{} \sqrt{\omega _2}\sin \left( \pi \sqrt{\lambda \omega _2} \right) &{} 0\\ \frac{1-\cos \left( \pi \sqrt{\lambda \omega _1} \right) }{\sqrt{\omega _1}} &{} \frac{\sin \left( \pi \sqrt{\lambda \omega _1} \right) }{\sqrt{\omega _1}} &{} \frac{\cos \left( \pi \sqrt{\lambda \omega _2} \right) -\cos \left( 2\pi \sqrt{\lambda \omega _2} \right) }{\sqrt{\omega _2}} &{} \frac{\sin \left( 2\pi \sqrt{\lambda \omega _2} \right) -\sin \left( \pi \sqrt{\lambda \omega _2} \right) }{\sqrt{\omega _2}} &{} \frac{\pi \left( \omega _2+\omega _1 \right) }{\sqrt{\lambda }\omega _1\omega _2} \end{pmatrix}. \end{aligned}$$

As we are looking for a non-zero solution to the above equation, we must have that \(\text {det}\left( {\varvec{M}}(\lambda ) \right) =0\). In (63), the last condition on \(u_1\) and \(u_2\) merely acts as normalisation, and doesn’t help in finding \(\lambda \). Hence, due to (61), we find that

$$\begin{aligned} c_{\mathrm {min}}\left( \omega _1,\omega _2 \right) =\min \left\{ \lambda >0\;|\;\text {det} \left( {\varvec{M}}(\lambda ) \right) =0\right\} . \end{aligned}$$

This is how we can find \(c_{\mathrm {min}}\), and consequently \(C_{\omega }\), explicitly (numerically in many cases).

Improved Methodology

We return now to the proof of the differential inequality (41) that governs the evolution of \(E_{\theta }\), which is essentially based on the estimate (43). Choosing \(\theta ^*=1\) and \(\sigma (x)\) as in (56) and using the weight

$$\begin{aligned} \omega _{\alpha }(x):=\frac{\left( \sigma (x)-\alpha \right) ^2}{2\sigma (x)-1-\alpha }={\left\{ \begin{array}{ll} 1-\alpha &{} 0< x\le \pi \\ \frac{\left( 4-\alpha \right) ^2}{7-\alpha } &{} \pi <x\le 2\pi \end{array}\right. }, \end{aligned}$$

which appears in the penultimate line of (43), we see that by using the previously discussed weighted Poincaré inequality instead of the last step of (43), we obtain from (41) and (43):

$$\begin{aligned} \begin{aligned} \frac{d}{dt}E_1\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le&- \alpha E_1\left( u(t)-u_{\mathrm {avg}},v(t) \right) \\&-\left( 1-\alpha -\frac{C^2_{\omega _{\alpha }}}{4} \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2. \end{aligned} \end{aligned}$$

We will maximise the decay rate \(\alpha \), satisfying

$$\begin{aligned} 0<\alpha \le 1-\frac{C^2_{\omega _\alpha }}{4}<1 \end{aligned}$$

(so that the second term in (64) is non-positive) by a processes of iteration: Guessing the starting value \(\alpha _0:=\alpha ^{*}\left( 1,4 \right) =2\left( 2-\sqrt{3} \right) \) (the rate one obtains from our main Theorem 1, cf. (11)) we follow the process described in the previous subsection and find the weighted Poincaré constant \(C^2_{\omega _{\alpha _0}}=1.12013...\), which indeed satisfies (65).

We proceed and create a sequence \(\left\{ \alpha _n\right\} _{n\in {\mathbb {N}}_0}\), defined recursively, so that each \(\alpha _n\) improves upon the previous step by taking its “optimal” value, i.e.

$$\begin{aligned} \alpha _{n}:=1-\frac{C^2_{\omega _{\alpha _{n-1}}}}{4},\quad \quad n\in {\mathbb {N}}, \end{aligned}$$

as long as (65) is still satisfied for this choice. A change of \(\alpha \) implies a change of our weight function \(\omega _\alpha (x)\), yet these new weights are still of the form given in our previous subsection. As such we are able to compute the appropriate \(C_{\omega _{\alpha _n}}\;'\)s, and to show that this sequence converges to the improved decay rateFootnote 7

$$\begin{aligned} \alpha _{\mathrm {max}}\approx 0.7234. \end{aligned}$$

Comparison of Convergence Rates

The optimal rateFootnote 8 of exponential convergence to the Goldstein-Taylor equation, (1), was found by Bernard and Salvarani in [8]. Taking into account the different scaling of the torus \({\mathbb {T}}\) in our paper, this convergence rate is given by

$$\begin{aligned} \alpha _{\mathrm {BS}}:=\frac{1}{\pi }\min \left( \left\Vert {\widetilde{\sigma }}\right\Vert _{L^1 \left( \frac{{\mathbb {T}}}{2\pi } \right) }, {\tilde{D}}(0) \right) \end{aligned}$$


$$\begin{aligned} {\widetilde{\sigma }}(\xi ):=\pi \sigma \left( 2\pi \xi \right) ,\quad \quad \xi \in \frac{{\mathbb {T}}}{2\pi }, \end{aligned}$$

and \({\tilde{D}}(0)\) is the spectral gap of the telegrapher’s equation, see [8, Proposition 3.5], [19, Theorem 2]. More precisely,

$$\begin{aligned} {\tilde{D}}(0):=\inf \left\{ {\text {Re}}\lambda _j\,\big |\,\lambda _j\in \Big ( \text{ spectrum } \text{ of } {\varvec{A}}_{{\tilde{\sigma }}}=\begin{pmatrix} 0 &{} -1\\ -\partial _{xx} &{} 2{\tilde{\sigma }} \end{pmatrix} \text{ in } H^2\oplus H^1\Big ) \setminus \{0\} \right\} . \end{aligned}$$

We note that, for \({\tilde{\sigma }}\) constant and in Fourier space, the matrix \(\begin{pmatrix} 0 &{} -1\\ k^2 &{} 2{\tilde{\sigma }} \end{pmatrix}\) is related to \(\mathbf{C}_k\) from (22) by a simple similarity transformation.

Following on our choice for \(\sigma (x)\) from (56), we see that

$$\begin{aligned} {\widetilde{\sigma }}(\xi )={\left\{ \begin{array}{ll} \sigma _1:=\pi , &{} 0< \xi \le \frac{1}{2} \\ \sigma _2:=4\pi , &{} \frac{1}{2}<\xi \le 1 \end{array}\right. }, \end{aligned}$$

and as such \(\left\Vert {\widetilde{\sigma }}\right\Vert _{L^1\left( \frac{{\mathbb {T}}}{2\pi } \right) }=\frac{5\pi }{2}\).

The calculation of \({\tilde{D}}(0)\) is more involved. According to [19], the spectrum of \({\varvec{A}}_{{\widetilde{\sigma }}}\), besides potentially \(\left\{ 0\right\} \), is discrete and the real part of its eigenvalues must lie in \((0,2\left\Vert {\widetilde{\sigma }}\right\Vert _{\infty }]\). A more detailed investigation of the spectrum can be found in [11].

The eigenvalue problem

$$\begin{aligned} {\varvec{A}}_{{\widetilde{\sigma }}}\begin{pmatrix}u \\ v \end{pmatrix}= \gamma \begin{pmatrix}u \\ v \end{pmatrix}, \end{aligned}$$

with \(\gamma \not =0\), is equivalent to the set of equations

$$\begin{aligned} v^{\prime \prime }(\xi )=\gamma \left( \gamma - 2{\widetilde{\sigma }}(\xi ) \right) v(\xi ), \quad v(\xi )=-\gamma u(\xi ). \end{aligned}$$

To find \({{\tilde{D}}}(0)\) it is sufficient to consider only eigenvalues with \({\text {Re}}\gamma \in (0,2\sigma _1)=(0,2\pi )\), since this complex strip already includes one (real) eigenvalue as we shall see below. With the notation \(\tau _{1,2}(\gamma ):=\sqrt{\gamma (2\sigma _{1,2}-\gamma )}\), which may have to be considered as a complex root, the solution of the ODE is of the form

$$\begin{aligned} v(\xi )={\left\{ \begin{array}{ll} A_1 \cos (\tau _1(\gamma )\,\xi ) +B_1\sin (\tau _1(\gamma )\,\xi ), &{} 0< \xi \le \frac{1}{2} \\ A_2 \cos (\tau _2(\gamma )\,\xi ) +B_2\sin (\tau _2(\gamma )\,\xi ), &{} \frac{1}{2}<\xi \le 1 \end{array}\right. }. \end{aligned}$$

With \(C^1\)–matching conditions at \(\xi =0\) and \(\xi =\frac{1}{2}\), the coefficients satisfy the following system of linear equations:

$$\begin{aligned} {\varvec{M}}(\gamma )\begin{pmatrix}A_1 \\ B_1 \\ A_2 \\ B_2\end{pmatrix}=\begin{pmatrix}0\\ 0\\ 0\\ 0\end{pmatrix} \end{aligned}$$

where the matrix \({\varvec{M}}(\gamma )\) is given by

$$\begin{aligned} \left( \begin{array}{cccc} 1 &{} 0 &{} -\cos \left( \tau _2(\gamma ) \right) &{} -\sin \left( \tau _2(\gamma ) \right) \\ 0 &{} 1 &{} \frac{\tau _2(\gamma )}{\tau _1(\gamma )}\sin \left( \tau _2(\gamma ) \right) &{} -\frac{\tau _2(\gamma )}{\tau _1(\gamma )}\cos \left( \tau _2(\gamma ) \right) \\ \cos \left( \frac{\tau _1(\gamma )}{2} \right) &{} \sin \left( \frac{\tau _1(\gamma )}{2} \right) &{} -\cos \left( \frac{\tau _2(\gamma )}{2} \right) &{} -\sin \left( \frac{\tau _2(\gamma )}{2} \right) \\ \sin \left( \frac{\tau _1(\gamma )}{2} \right) &{} -\cos \left( \frac{\tau _1(\gamma )}{2} \right) &{} -\frac{\tau _2(\gamma )}{\tau _1(\gamma )}\sin \left( \frac{\tau _2(\gamma )}{2} \right) &{} \frac{\tau _2(\gamma )}{\tau _1(\gamma )}\cos \left( \frac{\tau _2(\gamma )}{2} \right) \end{array} \right) . \end{aligned}$$

The requirement that

$$\begin{aligned} \text {det}\left( {\varvec{M}}(\gamma ) \right)= & {} -\sin \left( \frac{\tau _1(\gamma )}{2} \right) \sin \left( \frac{\tau _2(\gamma )}{2} \right) \left( 1+\left( \frac{\tau _2(\gamma )}{\tau _1(\gamma )} \right) ^2 \right) \nonumber \\&+ 2\frac{\tau _2(\gamma )}{\tau _1(\gamma )} \left( \cos \left( \frac{\tau _1(\gamma )}{2} \right) \cos \left( \frac{\tau _2(\gamma )}{2} \right) -1 \right) =0 \end{aligned}$$

yields the wanted eigenvalues \(\gamma \in {\mathbb {C}}\) with \({\text {Re}}\gamma \in (0,2\pi )\).

In our case, i.e. when \({{\tilde{\sigma }}}(x)\) is given by (66), we find (numerically) that the minimal real part of the non-zero eigenvalues found from (67) is approximately 2.72831, which implies that \({\tilde{D}}(0)\approx 2.72831\). Thus, the optimal decay rate given by [8] is

$$\begin{aligned} \alpha _{\mathrm {BS}}\approx \frac{1}{\pi }\min \left( \frac{5\pi }{2},2.72831 \right) \approx 0.86845 .\end{aligned}$$

Summarising, we now have three convergence rates for the case

$$\begin{aligned} \sigma (x)={\left\{ \begin{array}{ll} 1, &{} 0\le x \le \pi \\ 4, &{} \pi <x\le 2\pi \end{array}\right. }\quad : \end{aligned}$$
  • Rate from our main Theorem 1: \(\alpha ^{*}=4-\sqrt{12}\approx 0.5359\).

  • Rate from our improved technique in §1: \(\alpha _{\mathrm {max}}\approx 0.7234\).

  • Rate from the work of Bertrand and Salvarani: \(\alpha _{\mathrm {BS}}\approx 0.86845\).

This shows, as expected, the lack of optimality in our technique.

Appendix B: Deferred Proofs

Proof of Lemma 1

While the proof is standard, we show it here for completion and to fix the sharp constant. Denoting the \(k-\)th Fourier coefficient of f by

$$\begin{aligned} {\widehat{f}}(k):=\frac{1}{2\pi }\int _{0}^{2\pi }f(x)e^{-i kx}dx \end{aligned}$$

we find that

$$\begin{aligned} \widehat{f^\prime }(k)=ik{\widehat{f}}(k) \end{aligned}$$

for all \(k\in {\mathbb {Z}}\) (including \(k=0\)). The condition \(f_{\mathrm {avg}}=0\) is equivalent to \({\widehat{f}}(0)=0\) and as such, using Plancherel’s equality, we find that

$$\begin{aligned} \left\Vert f\right\Vert ^2= & {} \sum _{k\in {\mathbb {Z}}}\left|{\widehat{f}}(k)\right|^2=\sum _{k\in {\mathbb {Z}}\setminus \left\{ 0\right\} } \left|{\widehat{f}}(k)\right|^2 \\= & {} \sum _{k\in {\mathbb {Z}}\setminus \left\{ 0\right\} }\frac{\left|\widehat{f^\prime }(k)\right|^2}{k^2} \le \sum _{k\in {\mathbb {Z}}\setminus \left\{ 0\right\} }\left|\widehat{f^\prime }(k)\right|^2=\sum _{k\in {\mathbb {Z}}} \left|\widehat{f^\prime }(k)\right|^2=\left\Vert f^\prime \right\Vert ^2, \end{aligned}$$

completing the proof. \(\square \)

Proof of Lemma 2

Since for any function \(h\in L^1\left( {\mathbb {T}} \right) \)

$$\begin{aligned} \left( h-h_{\mathrm {avg}} \right) _{\mathrm {avg}}=0\ , \end{aligned}$$

we conclude (i) from the definition of \(\partial _x^{-1}f(x)\). To show (ii) we invoke the fundamental theorem of calculus (the version from Lebesgue theory), and to show (iii) we notice that if f is differentiable

$$\begin{aligned} \partial ^{-1}_x\left( \partial _x f \right) (x)= & {} \int _{0}^{x} \partial _y f(y)dy -\frac{1}{2\pi }\int _{0}^{2\pi } \left( \int _{0}^{x} \partial _y f(y)dy \right) dx \\= & {} f(x)-f(0)-\frac{1}{2\pi }\int _{0}^{2\pi } \left( f(x)-f(0) \right) dx=f(x)-f_{\mathrm {avg}}. \end{aligned}$$

Lastly, we notice that the continuity of \(\partial _x^{-1}f(x)\) as a function on the interval \([0,2\pi ]\) is a standard result from Analysis. To conclude the continuity on the torus, though, we must also show that \(\partial _x^{-1}f (0)=\partial _x^{-1} f\left( 2\pi \right) \). This is equivalent to

$$\begin{aligned} 0=\int _{0}^{0}f(x)dx = \int _{0}^{2\pi }f(x)dx=2\pi f_{\mathrm {avg}}, \end{aligned}$$

which is exactly the additional assumption. In addition, (14) for \(k\not = 0\) follows immediately from (68) and (ii). For \(k=0\) we use

$$\begin{aligned} \widehat{\partial ^{-1}_x f}\left( 0 \right) =\left( \partial ^{-1}_x f \right) _{\mathrm {avg}}=0, \end{aligned}$$

according to (i). The proof is thus complete. \(\square \)

Proof of Lemma 3

We will establish that

$$\begin{aligned} \left|\frac{\theta }{2\pi }\int _{0}^{2\pi } \partial _x^{-1}f (x)\overline{g(x)}dx\right| \le \frac{\left|\theta \right|}{2}\left( \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) \end{aligned}$$

from which (15), (16) and (17) all follow. Indeed, using the Cauchy-Schwarz inequality, the Poincaré inequality – which is valid since \((\partial ^{-1}_x f)_{\mathrm {avg}}=0\) - and part (ii) of Lemma 2 we conclude that

$$\begin{aligned}&\left|\frac{\theta }{2\pi }\int _{0}^{2\pi } \partial _x^{-1}f (x)\overline{g(x)}dx\right| \le \left|\theta \right|\left\Vert \partial _x^{-1}f\right\Vert \left\Vert g\right\Vert \le \frac{\left|\theta \right|}{2}\left( \left\Vert \partial _x^{-1}f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) \\&\quad \le \frac{\left|\theta \right|}{2}\left( \left\Vert \partial _x\left( \partial _x^{-1}f \right) \right\Vert ^2 +\left\Vert g\right\Vert ^2 \right) =\frac{\left|\theta \right|}{2}\left( \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) \ .\end{aligned}$$

The proof is thus completed. \(\square \)

Proof of (i) and (ii) within the proof of Lemma 6

To show (i) we notice that \(\gamma _{\mathrm {max}}\left( \theta \right) \le \gamma _{\mathrm {min}}\left( \theta \right) \) if and only if

$$\begin{aligned} \frac{2\sqrt{4-\theta ^2}+\left( 4-\sigma _{\mathrm {max}}\theta \right) }{2+\sqrt{4-\theta ^2}} \le \frac{2\sqrt{4-\theta ^2}-\left( 4-\sigma _{\mathrm {min}}\theta \right) }{2-\sqrt{4-\theta ^2}} \end{aligned}$$

which is equivalent to

$$\begin{aligned} 2\left( 8-\theta \left( \sigma _{\mathrm {min}}+\sigma _{\mathrm {max}} \right) \right) + \sqrt{4-\theta ^2}\;\theta \left( \sigma _{\mathrm {max}}-\sigma _{\mathrm {min}} \right) \le 4\left( 4-\theta ^2 \right) , \end{aligned}$$


$$\begin{aligned} \frac{2\left( \sigma _{\mathrm {max}}+\sigma _{\mathrm {min}}- 2\theta \right) }{\sigma _{\mathrm {max}}-\sigma _{\mathrm {min}}} \ge \sqrt{4-\theta ^2}. \end{aligned}$$

The above inequality is easily satisfied when \(\theta \le \min \left( \sigma _{\mathrm {min}},2 \right) \) as in this case

$$\begin{aligned} \frac{2\left( \sigma _{\mathrm {max}}+\sigma _{\mathrm {min}}- 2\theta \right) }{\sigma _{\mathrm {max}}-\sigma _{\mathrm {min}}} \ge \frac{2\left( \sigma _{\mathrm {max}}-\sigma _{\mathrm {min}} \right) }{\sigma _{\mathrm {max}}-\sigma _{\mathrm {min}}}=2\ge \sqrt{4-\theta ^2}. \end{aligned}$$

(ii) is trivial, since \(\theta <\sigma _{\mathrm {max}}\) holds if and only if \(\gamma _{\mathrm {max}}(\theta )<\theta \). \(\square \)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Arnold, A., Einav, A., Signorello, B. et al. Large Time Convergence of the Non-homogeneous Goldstein-Taylor Equation. J Stat Phys 182, 41 (2021).

Download citation


  • BGK equation
  • Hypocoercivity
  • Large time behaviour
  • Exponential decay
  • Lyapunov functional

Mathematics Subject Classification

  • 82C40
  • 35B40
  • 35Q82
  • 35S05