# Estimation pitfalls when the noise is not i.i.d.

• Liudas Giraitis
• Masanobu Taniguchi
Perspectives on data science for advanced statistics

## Abstract

This paper extends Whittle estimation to linear processes with a general stationary ergodic martingale difference noise. We show that such estimation is valid for standard parametric time series models with smooth bounded spectral densities, e.g., ARMA models. Furthermore, we clarify the impact of the hidden dependence in the noise on such estimation. We show that although the asymptotic normality of the Whittle estimates may still hold, the presence of dependence in the noise impacts the limit variance. Hence, the standard errors and confidence intervals valid under i.i.d. noise may not be applicable and thus require correction. The goal of this paper is to raise awareness of the impact of non-i.i.d. noise in applied work.

## Keywords

Whittle estimation · Asymptotic normality · Quadratic form · Martingale difference noise

62E20 60F05

## 1 Introduction

A variety of stationary models known in statistical and econometric literature can be expressed as a linear/moving average time series. Under minor restrictions a linear stationary time series can be represented, by the Wold decomposition, as a moving average of infinite order,
\begin{aligned}&X_t= \sum _{j=0}^\infty a_j \eta _{t-j},\quad t\in {\mathbb Z}, \end{aligned}
(1.1)
where $$\{\eta _t\}$$ is an uncorrelated noise with zero mean and weights $$\{a_j\}$$ satisfy $$\sum _{j=0}^\infty a_j^2<\infty$$. We have $$a_j=a_j(\theta )$$ and $$E\eta _t^2=\sigma ^2$$. The goal is to estimate $$\theta$$ and $$\sigma ^2$$.

In applications, the noise $$\{\eta _t\}$$ is often assumed to be i.i.d. or a stationary martingale difference (m.d.) sequence. Estimation of such models is mainly carried out under the assumption of i.i.d. noise, but this assumption may be too restrictive and hard to verify in practice, and it excludes ARCH-type conditionally heteroskedastic noises $$\{\eta _t\}$$, which are stationary ergodic m.d. processes. A typical example of a linear process (1.1) with m.d. noise is $$X_t=r_t^2-Er_t^2,$$ where $$r_t$$ is a GARCH(p, q) process. It is commonly used for modeling squared returns of assets in financial econometrics, see e.g., the review in Giraitis et al. (2007).

In empirical work, parameters of such time series are often estimated using techniques suitable for i.i.d. noises $$\{\eta _t\}$$ but without proper theoretical validation, see e.g., Wu and Shieh (2007).

In this paper, we examine the validity of a standard Whittle estimation procedure for a linear process (1.1) with a martingale difference noise $$\{\eta _t\}$$, and we analyze how this m.d. noise impacts the asymptotic behavior of the estimates.

In his seminal work, Hannan (1973) showed that the parametric Whittle estimates $$(\hat{\sigma }_n^2, \hat{\theta }_n)$$ given in (2.6) are consistent estimators of the true value of the parameter $$(\sigma _0^2, \theta _0)$$ for a large class of ergodic time series $$\{X_t\}$$. He established the asymptotic normality of $$\hat{\theta }_n$$ for linear processes with smooth bounded spectral densities $$f_{\sigma ^2,\theta }$$ for a special class of m.d. noises $$\{\eta _t\}$$. Hannan assumed that the conditional variance $$E[\eta _t^2|\mathcal{F}_{t-1}]=\sigma ^2$$ is a constant a.s.

Fox and Taqqu (1986), Giraitis and Surgailis (1990), Giraitis et al. (2001) and others extended the parametric Whittle estimation technique to linear long memory time series $$\{X_t\}$$ with unbounded spectral densities and i.i.d. noise $$\{\eta _t\}$$; see Chapter 8 in Giraitis et al. (2012). Hosoya and Taniguchi (1982) showed that Whittle estimation remains valid for linear processes with uncorrelated noise $$\{\eta _t\}$$ whose fourth-order cumulants are summable and whose conditional moments satisfy some regularity conditions.

Our aim in this paper is to extend Whittle estimation to linear processes with a stationary ergodic m.d. noise $$\{\eta _t\}$$. We shall show that such estimation is valid for standard parametric time series models with smooth bounded spectral densities, e.g., ARMA models. Furthermore, we shall clarify the impact of the dependence structure of the noise in such estimation. The proof of the asymptotic normality relies on the normal approximation results for quadratic forms in stationary ergodic m.d. noise $$\{\eta _t\}$$ obtained in Giraitis et al. (2016).

We show that differences in inference between modeling with i.i.d. noise and modeling with m.d. noise cannot be ignored. The goal of this paper is thus to raise awareness in applied work that, although the asymptotic normality of the estimates may still hold, the limit variance may be affected. Hence, the standard errors and confidence intervals valid under i.i.d. noise may not be applicable and thus need to be corrected.

The main results are Theorem 2.1 (consistency), Theorem 2.2 (asymptotic normality) and Theorem 3.1 (asymptotic normality of quadratic forms). More generally, Sect. 2 examines the impact of m.d. noise on parametric Whittle estimation. We establish asymptotic normality for the parametric Whittle estimator $$\hat{\theta }_n$$ under weak conditions on the noise. This requires deriving the asymptotic normality of quadratic forms for linear processes with m.d. noise, which is done in Sect. 3. Section 4 contains auxiliary results. Section 5 deals with applications.

Throughout the paper, by $$\rightarrow _p$$ and $$\rightarrow _D$$ we denote convergence in probability and distribution, respectively, while C denotes generic constants.

## 2 Parametric Whittle estimation

Denote by $$\{\eta _t\}$$ a stationary ergodic martingale difference (m.d.) sequence with respect to the natural filtration $$\mathcal{F}_t$$ defined below, namely $$E[\eta _t|\mathcal{F}_{t-1}]=0$$, with moments
\begin{aligned} E\eta _t=0, \quad E\eta _t^2=\sigma ^2\,\,\mathrm{and}\,\,E\eta _t^4<\infty . \end{aligned}
(2.1)
By $$\mathcal{F}_t$$ we denote the $$\sigma$$-field generated by $$(\eta _{t}, \eta _{t-1},\ldots )$$ or, more generally, by some underlying noise $$(\varepsilon _{t}, \varepsilon _{t-1},\ldots )$$ such that $$\eta _t=f(\varepsilon _{t}, \varepsilon _{t-1},\ldots )$$ is a measurable function of $$\varepsilon _t$$'s. Clearly, $$E\eta _t\eta _s=0$$ for $$t\ne s$$. Indeed, if $$s<t$$, then $$E\eta _t\eta _s=E[E[\eta _t|\mathcal{F}_{t-1}]\eta _s]=0$$.
In this section, we study the parametric Whittle estimation for a linear process
\begin{aligned}&X_t:= \sum _{k=0}^\infty a_k(\theta )\eta _{t-k}, \quad a_0=a_0(\theta )=1, \quad t\in {\mathbb Z}, \end{aligned}
(2.2)
with
\begin{aligned}&\sum _{k=0}^\infty a_k^2(\theta )<\infty , \,\, \, \theta \in \Theta . \end{aligned}
The real-valued coefficients $$a_k(\theta ), k=0,1,\ldots$$ are parameterized by the parameter $$\theta \in \Theta$$ taking values in a compact set $$\Theta \subset {\mathbb R}^q$$. Throughout this paper, $$\sigma ^2_0,\, \theta _0$$ denote the true parameter values of $$\sigma ^2,\, \theta$$, respectively. In this paper, we prove asymptotic normality of the Whittle estimator of $$\theta _0$$ and consistency of the estimator of $$\sigma ^2$$.
The spectral density of the process $$\{X_t\}$$ has a parametric form
\begin{aligned}&f(u)\equiv f_{\sigma ^2,\theta }(u) = \frac{\sigma ^2}{2\pi }s_\theta (u), \end{aligned}
(2.3)
where
\begin{aligned} s_\theta (u):=\Big |\sum _{k=0}^\infty a_k(\theta ) {\hbox {e}}^{\i k u}\Big |^2,\quad u\in \Pi , \qquad \theta \in \Theta , \end{aligned}
where $$\Pi =[-\pi ,\,\pi )$$. By Wold decomposition, the class of stationary processes having linear representation (2.2) with an uncorrelated noise $$\eta _t$$ is very large. In this paper the class of possible noises is reduced by supposing that they form stationary ergodic m.d. sequences.
Assume that observations $$X_1, X_2,\ldots , X_n$$ are from the linear process (2.2). Denote the periodogram based on $$X_1,X_2,\ldots , X_n$$ by
\begin{aligned} I_n(u):= & {} \frac{1}{2\pi n} \Big |\sum _{j=1}^nX_j {\hbox {e}}^{\i j u}\Big |^2. \end{aligned}
(2.4)
Parametric Whittle inference procedures involve the integrated weighted periodogram
\begin{aligned} Q_n(\theta ):= \int _\Pi \frac{I_n(u)}{s_\theta (u)} {\hbox {d}}u \end{aligned}
(2.5)
with weight function $$1/s_\theta (u)$$. In view of (2.4), $$Q_n(\theta )$$ equals the quadratic form
\begin{aligned} Q_n(\theta ) = \frac{1}{n}\sum _{j, k=1}^n b_{j-k}(\theta )X_j X_k, \end{aligned}
where
\begin{aligned} b_j(\theta ):=\frac{1}{2\pi } \int _\Pi \frac{{\hbox {e}}^{\i j u}}{s_\theta (u)} {\hbox {d}}u, \quad j\in {\mathbb Z}. \end{aligned}
Whittle estimates of $$\sigma _0^2,\,\theta _0$$ based on $${X}_1, X_2,\ldots , X_n$$ are defined as
\begin{aligned} \hat{\sigma }^2_n=Q_n(\hat{\theta }_n), \quad \hat{\theta }_n=\text{ argmin }_{\theta \in \Theta } \, Q_n(\theta ). \end{aligned}
(2.6)
These estimators were introduced by Whittle (1953) and are obtained by minimizing the approximate Gaussian log-likelihood. The approximate Gaussian log-likelihood is known as “the Whittle Gaussian log-likelihood”.
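As a minimal sketch of (2.4)–(2.6), consider an AR(1) model $$X_t=\theta X_{t-1}+\eta _t$$ with $$|\theta |<1$$, for which $$a_k(\theta )=\theta ^k$$ and $$1/s_\theta (u)=1-2\theta \cos u+\theta ^2$$. The AR(1) choice, the function names and the use of NumPy are assumptions made for this illustration only and are not taken from the paper. For this model $$b_0(\theta )=1+\theta ^2$$, $$b_{\pm 1}(\theta )=-\theta$$ and $$b_j(\theta )=0$$ otherwise, so the quadratic form can be evaluated in the time domain and compared with the frequency-domain integral (2.5).

```python
import numpy as np

def whittle_objective_freq(x, theta):
    """Q_n(theta) of (2.5) for the illustrative AR(1) model, via the periodogram."""
    n = len(x)
    # Zero-pad to 2n frequencies: the integrand is a trigonometric polynomial
    # of degree <= n, so the Riemann sum below reproduces the integral.
    m = 2 * n
    u = 2 * np.pi * np.arange(m) / m
    periodogram = np.abs(np.fft.fft(x, m)) ** 2 / (2 * np.pi * n)  # I_n(u), (2.4)
    inv_s = 1 - 2 * theta * np.cos(u) + theta ** 2                 # 1 / s_theta(u)
    return (2 * np.pi / m) * np.sum(periodogram * inv_s)

def whittle_objective_time(x, theta):
    """Same Q_n(theta), written as the quadratic form with b_0 = 1 + theta^2, b_1 = -theta."""
    n = len(x)
    return ((1 + theta ** 2) * np.sum(x ** 2)
            - 2 * theta * np.sum(x[1:] * x[:-1])) / n

def whittle_ar1(x):
    """Whittle estimates (2.6) for AR(1): Q_n is quadratic in theta, so the
    unconstrained minimizer is available in closed form."""
    theta_hat = np.sum(x[1:] * x[:-1]) / np.sum(x ** 2)
    sigma2_hat = whittle_objective_time(x, theta_hat)
    return theta_hat, sigma2_hat
```

The closed-form minimizer is the lag-one sample autocorrelation; it coincides with the estimate in (2.6) whenever it falls inside the compact parameter set $$\Theta$$, which is assumed here.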

We shall first address the consistency. Consider the following assumption.

### Assumption (a0)

The parameter space $$\Theta$$ is compact, parameter $$(\sigma ^2,\theta )$$ determines the spectral density
\begin{aligned} f_{\sigma ^2, \theta }(u)=\sigma ^2s_{\theta }(u)/2\pi \end{aligned}
uniquely. The function $$s_{\theta }(u)$$ is continuous in $$(u,\theta )\in \Pi \times \Theta$$ and for some $$c_1>0, c_2>0$$,
\begin{aligned}0<c_1\le s_{\theta }(u)\le c_2<\infty , \quad (u,\theta )\in \Pi \times \Theta . \end{aligned}

Thus the spectral density is bounded away from zero and infinity.

### Theorem 2.1

(Consistency of Whittle estimators). Suppose an observable linear process $$\{X_t\}$$ of (2.2) has the spectral density
\begin{aligned} f(u)=\frac{\sigma ^2_0}{2\pi }s_{\theta _0}(u), \end{aligned}
and suppose the functions $$s_\theta$$, $$\theta \in \Theta ,$$ satisfy Assumption (a0). Then, as $$n\rightarrow \infty ,$$
\begin{aligned}&\hat{\theta }_n \rightarrow \theta _0,\qquad \hat{\sigma }^2_n \rightarrow \sigma ^2_0,\quad a.\,s. \end{aligned}
(2.7)

### Proof

By assumption, $$\{\eta _t\}$$ is a stationary ergodic sequence. Thus, Theorem 3.5.8 in Stout (1974) implies that the sequence $$\{X_t\}$$ in (2.2) is also stationary ergodic. Hence (2.7) follows from Theorem 1 in Hannan (1973). For more details of the proof, see Theorem 8.2.1 in Giraitis et al. (2012). $$\square$$

In general, the asymptotic normality of the Whittle estimates requires stronger modeling assumptions on $$\{X_{t}\}$$ in (2.2). We introduce the following conditions on the functions $$s_{\theta }(u)$$ and the weights $$a_k(\theta _0)$$ of $$\{X_t\}$$.

Denote by $$\nabla _{\theta }$$ the partial derivative operator with respect to a vector $$\theta$$ and by $$^{\prime }$$ the transposition operator. Set
\begin{aligned}&W_{\theta _0}= \int _{\Pi }\nabla _{\theta } \log s_{\theta _{0}}(u)\nabla _{\theta }^{\prime }\log s_{\theta _{0}}(u){\hbox {d}}u,\nonumber \\&V_{\theta _0,\eta }=4E\left[ \left( \sum _{k=1}^\infty \beta _{k,\theta _0}\eta _{-k}\right) \left( \sum _{k=1}^\infty \beta _{k,\theta _0}\eta _{-k}\right) ^\prime \eta _0^2\right] ,\nonumber \\&\beta _{k,\theta _0}:=(2\pi )^{-1}\int _\Pi {\hbox {e}}^{\i k u}\nabla _{\theta }\log s_{\theta _{0}}(u){\hbox {d}}u, \quad k\in {\mathbb Z}. \end{aligned}
(2.8)
Noting that $$\beta _{k,\theta _0}=\beta _{-k,\theta _0}$$ and $$\beta _{0,\theta _0}=0$$, Parseval’s identity implies
\begin{aligned} W_{\theta _0}= 2\pi \sum _{k\in {\mathbb Z}}\beta _{k,\theta _0}\beta _{k,\theta _0}^\prime =4\pi \sum _{k=1}^\infty \beta _{k,\theta _0}\beta _{k,\theta _0}^\prime . \end{aligned}
(2.9)
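The quantities in (2.8) and (2.9) can be checked numerically in a simple case. The sketch below assumes a scalar AR(1) parameterization, $$s_\theta (u)=1/(1-2\theta \cos u+\theta ^2)$$, which is an illustration and not part of the paper. In that case $$\nabla _\theta \log s_\theta (u)=2(\cos u-\theta )/(1-2\theta \cos u+\theta ^2)$$, and a direct calculation gives $$\beta _{k,\theta }=\theta ^{|k|-1}$$ for $$k\ne 0$$ and $$W_\theta =4\pi /(1-\theta ^2)$$. The integrals are approximated by Riemann sums on an equispaced grid; the grid size and the truncation lag are arbitrary choices.

```python
import numpy as np

def grad_log_s_ar1(u, theta):
    """Derivative of log s_theta(u) in theta for the illustrative AR(1) model."""
    return 2 * (np.cos(u) - theta) / (1 - 2 * theta * np.cos(u) + theta ** 2)

def beta_and_w(theta, k_max=50, m=4096):
    """Riemann-sum approximations of beta_k (k = 0..k_max) and W in (2.8)."""
    u = -np.pi + 2 * np.pi * np.arange(m) / m      # grid on Pi = [-pi, pi)
    h = grad_log_s_ar1(u, theta)
    # beta_k = (2 pi)^{-1} int e^{iku} h(u) du; h is even, so only cos(ku) contributes.
    beta = np.array([np.mean(np.cos(k * u) * h) for k in range(k_max + 1)])
    w = 2 * np.pi * np.mean(h ** 2)                # W = int h(u)^2 du
    return beta, w

theta = 0.5
beta, w = beta_and_w(theta)
# Parseval check of (2.9): W should match 4 pi * sum_{k>=1} beta_k^2.
parseval_rhs = 4 * np.pi * np.sum(beta[1:] ** 2)
```

In this example `beta[0]` is numerically zero, in line with $$\beta _{0,\theta _0}=0$$, and `w` agrees with `parseval_rhs` up to the truncation at `k_max`.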

### Assumption (a1)

(i) The true value of the parameter $$(\sigma _0^2,\theta _0)$$ lies in the interior of $$(0,\infty )\times \Theta$$. The weights $$a_k(\theta _0)$$ in (2.2) satisfy $$\sum _{k=0}^\infty ka_k^2(\theta _0)<\infty .$$

(ii) The partial derivatives $$\nabla _{\theta }s_\theta (u),\nabla _{u}s_{\theta }(u ),$$ $$\nabla _{u}\nabla _{\theta }s_{\theta }(u )$$, $$\nabla _{\theta }\nabla _{\theta }^{\prime }s_{\theta }(u )$$ exist and are bounded and continuous functions of $$u\in \Pi$$ and $$\theta \in \Theta .$$

(iii) The matrix $$W_{\theta _0}$$ is positive definite.

The following theorem establishes the asymptotic normality of the Whittle estimate $$\hat{\theta }_n$$. Remarkably, besides (2.1), no additional conditions on the stationary ergodic martingale difference noise $$\{\eta _t\}$$ are needed.

Hidden dependence of the noise variables $$\eta _t$$, however, will have an impact on the asymptotic variance matrix of the Whittle estimate $$\hat{\theta }_n$$ in (2.10). By hidden dependence we mean, for example, a situation where the $$\eta _t$$ are uncorrelated but their squares $$\eta _t^2$$ are correlated. The matrix has the form
\begin{aligned} \Omega _{\theta _0,\eta }=4\pi ^2 \sigma _0^{-4}W^{-1}_{\theta _0}V_{\theta _0,\eta }W^{-1}_{\theta _0}. \end{aligned}
We show that for i.i.d. noise $$\{\eta _j\}$$, this matrix reverts to the standard asymptotic variance matrix of the Whittle estimate $$\Omega _{\theta _0}=4\pi W^{-1}_{\theta _0}$$ given in Theorem 2 in Hannan (1973), which does not depend on $$\{\eta _t\}$$.

### Theorem 2.2

(Asymptotic normality of Whittle estimators). Let $$\{X_t\}$$ be a linear process (2.2) having parametric spectral density $$f=\sigma ^2_0\,s_{\theta _0}/2\pi$$, with $$s_\theta$$ satisfying Assumptions (a0) and (a1). Then
\begin{aligned}&n^{1/2}(\hat{\theta }_n-\theta _0)\rightarrow _D \mathcal{N}(0,\Omega _{\theta _0,\eta }), \quad \Omega _{\theta _0,\eta }= 4\pi ^2 \sigma _0^{-4} W^{-1}_{\theta _0}V_{\theta _0,\eta }W^{-1}_{\theta _0}. \end{aligned}
(2.10)
Moreover,
(i) If $$\{\eta _t\}$$ are i.i.d. random variables, then
\begin{aligned} \Omega _{\theta _0,\eta }=4\pi W_{\theta _0}^{-1}, \quad V_{\theta _0,\eta }=\pi ^{-1}\sigma _0^4\, W_{\theta _0}, \end{aligned}
(2.11)
and
\begin{aligned} n^{1/2}(\hat{\theta }_n-\theta _0)\rightarrow _D \mathcal{N}(0,\Omega _{\theta _0}). \end{aligned}

(ii) If the m.d. noise $$\{\eta _t\}$$ is such that $$E[\eta _0^2\eta _k\eta _s]=0$$ for any $$s<k<0$$, then
\begin{aligned} \Omega _{\theta _0,\eta }= & {} 4\pi W_{\theta _0}^{-1}+\Omega _{\theta _0,\eta }^*, \quad \Omega _{\theta _0,\eta }^*:=4\pi ^2 \sigma _0^{-4} W^{-1}_{\theta _0}\Delta _{\eta } W^{-1}_{\theta _0}, \nonumber \\ V_{\theta _0,\eta }= & {} \pi ^{-1}\sigma _0^4\, W_{\theta _0}+\Delta _{\eta },\quad \Delta _{\eta }:=4\sum _{k=1}^\infty \beta _{k,\theta _0} \beta _{k,\theta _0}^\prime \mathrm{cov}(\eta ^2_0, \eta _{-k}^2). \end{aligned}
(2.12)
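To see how (2.11) and (2.12) translate into numbers, the following sketch evaluates both variances for a scalar parameter. It assumes that the sequences $$\beta _{k,\theta _0}$$ and $$\mathrm{cov}(\eta _0^2,\eta _{-k}^2)$$ are supplied for $$k=1,\ldots ,K$$ and that truncation at $$K$$ lags is harmless; the function name and interface are hypothetical. The formula (2.12) applies only under the moment condition stated in part (ii).

```python
import numpy as np

def whittle_avar_scalar(beta, cov_sq, sigma2):
    """Asymptotic variances of the Whittle estimate for scalar theta.

    beta[k-1]   : beta_{k,theta_0}, k = 1..K (truncated; an assumption)
    cov_sq[k-1] : cov(eta_0^2, eta_{-k}^2), k = 1..K
    sigma2      : noise variance sigma_0^2
    Returns (omega_iid, omega_md) following (2.11) and (2.12).
    """
    beta = np.asarray(beta, dtype=float)
    cov_sq = np.asarray(cov_sq, dtype=float)
    w = 4 * np.pi * np.sum(beta ** 2)                 # W via Parseval, (2.9)
    delta = 4 * np.sum(beta ** 2 * cov_sq)            # Delta_eta in (2.12)
    v = sigma2 ** 2 * w / np.pi + delta               # V in (2.12)
    omega_iid = 4 * np.pi / w                         # (2.11)
    omega_md = 4 * np.pi ** 2 * v / (sigma2 ** 2 * w ** 2)  # (2.10)
    return omega_iid, omega_md

# Illustrative inputs: AR(1)-type weights beta_k = theta^(k-1) and squares that
# are correlated at lag 1 only, with cov(eta_0^2, eta_{-1}^2) = 2 * sigma2^2.
theta, sigma2, K = 0.5, 1.7, 200
beta = theta ** np.arange(K)
cov_sq = np.zeros(K)
cov_sq[0] = 2 * sigma2 ** 2
omega_iid, omega_md = whittle_avar_scalar(beta, cov_sq, sigma2)
```

With these inputs the correction term is strictly positive, so confidence intervals built from `omega_iid` would be too narrow.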

### Remark 2.1

Obviously (2.12) reduces to (2.11) if the $$\eta _t$$ are i.i.d. Whittle estimation is robust with respect to an i.i.d. noise $$\eta _t$$ in the sense that its variance (2.11) does not depend on $$\eta _t$$. In particular, Whittle estimation does not have pitfalls in the sense that (2.10) holds with (2.11) when the m.d. noise $$\eta _t$$ is Gaussian since such $$\eta _t$$s are i.i.d. Theorem 2.2 shows that this property is not valid anymore for m.d. noises $$\eta _t$$ which are not i.i.d. Although asymptotic normality may still hold, the standard errors of the Whittle estimates are affected by the presence of the hidden dependence in the noise.

### Example 2.1

Consider, for example, the m.d. noise
\begin{aligned} \eta _t=\varepsilon _{t}\varepsilon _{t-1}, \end{aligned}
(2.13)
where $$\{\varepsilon _t\}$$ are i.i.d. Gaussian random variables with zero mean and variance $$E\varepsilon _0^2=\sigma _\varepsilon ^2$$. Then $$\{\eta _t\}$$ is a stationary ergodic m.d. sequence with respect to $$\sigma$$-field $$\mathcal{F}_t$$ generated by variables $$\varepsilon _t, \varepsilon _{t-1}, \ldots$$ with
\begin{aligned} E\eta _t=0,\quad \sigma _0^2=E\eta _0^2=(E\varepsilon _0^2)^2 \quad \text{ and }\ \ E[\eta _0^2\eta _k\eta _s]=0 \end{aligned}
for $$s<k<0$$. Moreover,
\begin{aligned} \mathrm{cov}(\eta ^2_0, \eta _{-1}^2)=E[\eta ^2_0 \eta _{-1}^2]-E[\eta ^2_0]E[ \eta _{-1}^2]=2 \sigma _0^4, \end{aligned}
and
\begin{aligned} \mathrm{cov}(\eta ^2_0, \eta _{-k}^2)=0, \quad \text{ for }\ k \ge 2. \end{aligned}
Hence, (2.12) holds with $$\Delta _{\eta }:=8 \sigma _0^4 \,\beta _{1,\theta _0} \beta _{1,\theta _0}^\prime ,$$
\begin{aligned} \Omega _{\theta _0,\eta }=4\pi W_{\theta _0}^{-1}+\Omega _{\theta _0,\eta }^*, \quad \Omega _{\theta _0,\eta }^*:=32\pi ^2 W^{-1}_{\theta _0} \,\beta _{1,\theta _0} \beta _{1,\theta _0}^\prime \, W^{-1}_{\theta _0}. \end{aligned}
Here, the dependence in m.d. noise contributed an additional term $$\Omega _{\theta _0,\eta }^*$$ to the variance $$\Omega _{\theta _0,\eta }$$ compared with an i.i.d. noise.
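A small Monte Carlo experiment can make the size of this extra term concrete. The sketch assumes an AR(1) model $$X_t=\theta X_{t-1}+\eta _t$$ with scalar $$\theta$$ (not a model singled out in the paper), for which $$\beta _{1,\theta }=1$$ and $$W_\theta =4\pi /(1-\theta ^2)$$, so that the formulas above give $$4\pi W^{-1}_{\theta }=1-\theta ^2$$ and $$\Omega ^*_{\theta ,\eta }=2(1-\theta ^2)^2$$. The sample size, number of replications, seed and function name are arbitrary illustrative choices.

```python
import numpy as np

def scaled_variance_ar1(theta, n, reps, dependent, seed=0, burn=200):
    """Monte Carlo estimate of n * var(theta_hat) for an illustrative AR(1) model.

    dependent=True  uses eta_t = eps_t * eps_{t-1} as in (2.13);
    dependent=False uses i.i.d. Gaussian noise as a benchmark.
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal((reps, n + burn + 1))
    if dependent:
        eta = eps[:, 1:] * eps[:, :-1]
    else:
        eta = eps[:, 1:]
    x = np.zeros((reps, n + burn))
    for t in range(1, n + burn):                 # AR(1) recursion, all replications at once
        x[:, t] = theta * x[:, t - 1] + eta[:, t]
    x = x[:, burn:]                              # drop the burn-in period
    # For AR(1) the Whittle estimate is the lag-one sample autocorrelation.
    theta_hat = np.sum(x[:, 1:] * x[:, :-1], axis=1) / np.sum(x ** 2, axis=1)
    return n * np.var(theta_hat)

theta = 0.5
v_iid = scaled_variance_ar1(theta, n=1000, reps=2000, dependent=False, seed=1)
v_dep = scaled_variance_ar1(theta, n=1000, reps=2000, dependent=True, seed=1)
target_iid = 1 - theta ** 2                       # 4 * pi / W
target_dep = target_iid + 2 * (1 - theta ** 2) ** 2
```

Comparing `v_iid` with `target_iid` and `v_dep` with `target_dep` gives a rough empirical check of the two limit variances; the agreement is only approximate at finite sample sizes.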

### Example 2.2

The class of stationary ergodic m.d. processes $$\{\eta _t\}$$ is very rich. It covers conditional heteroscedastic ARCH models, stochastic volatility models and others. Such processes can usually be written in the form
\begin{aligned} \eta _t= \varepsilon _t \sigma _t, \quad \sigma _t= f(\varepsilon _{t-1}, \varepsilon _{t-2}, \ldots ), \end{aligned}
where $$\{\varepsilon _t\}$$ is a sequence of i.i.d. random variables with $$E\varepsilon _t=0$$, $$E\varepsilon _t^2<\infty$$ and f is a measurable function of $$(\varepsilon _{t-1}, \varepsilon _{t-2}, \ldots )$$. Clearly, such a process $$\{\eta _t\}$$ is a stationary m.d. sequence. Since $$\{\varepsilon _t\}$$ is an ergodic process, by Theorem 3.5.8 in Stout (1974), $$\{\eta _t\}$$ is a stationary ergodic m.d. sequence. It satisfies (2.1) as long as $$E\varepsilon _t^4<\infty$$, $$E\sigma _t^4<\infty$$.
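The hidden dependence in such noises is easy to see in simulation. The sketch below generates an ARCH(1) noise, $$\eta _t=\varepsilon _t\sigma _t$$ with $$\sigma _t^2=\omega +\alpha \eta _{t-1}^2$$ and standard Gaussian $$\varepsilon _t$$. The ARCH(1) specification, the parameter values $$\omega =0.7$$, $$\alpha =0.3$$ (chosen so that $$3\alpha ^2<1$$ and the fourth moment is finite) and the helper names are assumptions for this illustration.

```python
import numpy as np

def simulate_arch1(n, omega=0.7, alpha=0.3, seed=0, burn=500):
    """Simulate an ARCH(1) martingale difference noise (illustrative parameters)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n + burn)
    eta = np.zeros(n + burn)
    for t in range(1, n + burn):
        sigma2_t = omega + alpha * eta[t - 1] ** 2   # depends on the past only
        eta[t] = eps[t] * np.sqrt(sigma2_t)
    return eta[burn:]                                # drop the burn-in period

def lag1_autocorr(z):
    """Sample autocorrelation at lag 1."""
    z = z - z.mean()
    return np.sum(z[1:] * z[:-1]) / np.sum(z ** 2)

eta = simulate_arch1(100_000, seed=2)
rho_level = lag1_autocorr(eta)         # levels: uncorrelated in theory
rho_square = lag1_autocorr(eta ** 2)   # squares: positively correlated
```

In this setting `rho_level` should be close to zero while `rho_square` is clearly positive, which is exactly the situation in which the i.i.d. variance formula (2.11) is not applicable.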

### Remark 2.2

Verification of the asymptotic normality for $$\hat{\theta }_n$$ is reduced in this paper to the asymptotic normality of a quadratic form $$\sum _{j,k=1,\, j\ne k}^n \beta _{j-k}\eta _j\eta _k$$ of an m.d. noise $$\eta _t$$ with a zero diagonal. Normal approximation for $$\hat{\sigma }^2_n,$$ however, would require establishing the asymptotic normality of a quadratic form with a non-zero diagonal. Since $$\{\eta _t^2\}$$ is not an m.d. noise, a proof of the asymptotic normality for $$\hat{\sigma }^2_n$$ and $$(\hat{\theta }_n, \, \hat{\sigma }^2_n)$$ would require additional assumptions on $$\{\eta _t\}$$.

Under additional assumptions, Hosoya and Taniguchi (1982) established asymptotic normality of Gaussian maximum likelihood estimate for linear processes with uncorrelated noise $$\{\eta _t\}$$ and Taniguchi (1982) suggested consistent estimates for its asymptotic variance.

### Proof of Theorem 2.2

By Theorem 2.1, $$\hat{\theta }_n\rightarrow \theta _0$$ a.s. Since $$\theta _0$$ is an interior point of $$\Theta$$, it follows that $$\nabla _{\theta } Q_n(\hat{\theta }_n)=0$$ with probability tending to 1. The function $$Q_n(\theta )$$ is continuously differentiable in $$\theta$$ under Assumptions (a0) and (a1)(ii), so by the mean-value theorem there exists $$\theta _n^*\in \Theta$$ such that $$||\theta _n^*-\theta _0|| \le ||\hat{\theta }_n -\theta _0||$$ and
\begin{aligned} 0= \nabla _{\theta } Q_n(\hat{\theta }_n) = \nabla _{\theta } Q_n(\theta _0) + \nabla _{\theta }\nabla _{\theta }^{\prime } Q_n(\theta _n^*)(\hat{\theta }_n-\theta _0). \end{aligned}
(2.14)
Since the components of $$\nabla _{\theta }s_\theta ^{-1}(u)$$ are continuous functions in $$\theta$$ and u, by Lemma 8.2.2. in Giraitis et al. (2012),
\begin{aligned} \nabla _{\theta }\nabla _{\theta }^{\prime } Q_n(\theta )\rightarrow \frac{\sigma _0^2}{2\pi }\int _{\Pi }s_{\theta _0}(u) \nabla _{\theta }\nabla _{\theta }^{\prime }s_\theta ^{-1}(u)\,{\hbox {d}}u \,\,\, a.s. \end{aligned}
uniformly in $$\theta \in \Theta$$. Together with the consistency of $$\hat{\theta }_n$$ this implies
\begin{aligned}&\nabla _{\theta }\nabla _{\theta }^{\prime } Q_n(\theta ^*_n) \rightarrow _p \frac{\sigma _0^2}{2\pi }\int _\Pi s_{\theta _0}(u) \nabla _{\theta }\nabla _{\theta }^{\prime } s_{\theta _0}^{-1}(u)\,{\hbox {d}} u = \frac{\sigma _0^2}{2\pi }W_{\theta _0}. \end{aligned}
(2.15)
By the Kolmogorov formula, the parameterization assumption $$a_0=1$$ is equivalent to
\begin{aligned} \int _\Pi \log s_\theta (u) {\hbox {d}}u=0, \quad \theta \in \Theta . \end{aligned}
The use of the latter yields the last equality in (2.15), see Hannan (1973) or, e.g., page 216 in Giraitis et al. (2012).
Because of (2.14) and (2.15), (2.10) will follow from the convergence
\begin{aligned} -n^{1/2} \nabla _{\theta } Q_n(\theta _0) \rightarrow _D \mathcal{N}(0, V_{\theta _0,\eta }). \end{aligned}
(2.16)
Let $$c\in {\mathbb R}^q$$. Denote
\begin{aligned} S_{n,c} :=- n c'\nabla _{\theta } Q_n(\theta _0). \end{aligned}
By the Cramér–Wold device, to prove (2.16) it suffices to verify that for any c,
\begin{aligned} n^{-1/2}S_{n,c}\rightarrow _D \mathcal{N}(0, v^2_c), \quad v^2_c=c^\prime V_{\theta _0,\eta }c. \end{aligned}
(2.17)
Express $$S_{n,c}$$ as a quadratic form
\begin{aligned} S_{n,c}=\sum _{t,s=1}^n g_{t-s} \,X_t X_s,\,\,\, g_t := (2\pi )^{-1}\int _\Pi {\hbox {e}}^{\i tu} \hat{g}(u){\hbox {d}}u, \,\, t\in {\mathbb Z},\,\,\, \end{aligned}
(2.18)
where
\begin{aligned} \hat{g}(u) := - c' \nabla _{\theta } s_{\theta _0}^{-1}(u). \end{aligned}
We prove (2.17) by showing that $$S_{n,c}$$ satisfies the assumptions of Theorem 3.1 below. By Assumptions (a0) and (a1)(ii), the partial derivative $$\nabla _{u }\nabla _{\theta } s_{\theta _0}^{-1}$$ is a bounded continuous function, which implies that $$\sum _{j\in {\mathbb Z}}|g_j| <\infty$$. Clearly, $$g_j=g_{-j}$$. By Assumption (a1)(i), $$\sum _{k=1}^\infty k a_k^2(\theta _0)<\infty$$. By (3.3) and (3.11) we obtain
\begin{aligned} \beta _{k}= & {} (2\pi )^{-1}\int _\Pi {\hbox {e}}^{ \i ku} \hat{g}(u) \, s_{\theta _0}(u){\hbox {d}}u =-(2\pi )^{-1}\int _\Pi {\hbox {e}}^{ \i ku} c' \nabla _{\theta } s_{\theta _0}^{-1}(u) \, s_{\theta _0}(u){\hbox {d}}u \nonumber \\= & {} (2\pi )^{-1}\int _\Pi {\hbox {e}}^{ \i ku} c' \nabla _{\theta }\log s_{\theta _0}(u) {\hbox {d}}u, \quad k\in {\mathbb Z}. \end{aligned}
(2.19)
Moreover, property $$\int _\Pi \log s_\theta (u) {\hbox {d}}u=0$$ implies that
\begin{aligned} \int _\Pi \nabla _{\theta }\log s_\theta (u) {\hbox {d}}u=0, \quad \beta _{0}=(2\pi )^{-1}\int _\Pi c' \nabla _{\theta }\log s_{\theta _0}(u) {\hbox {d}}u=0. \end{aligned}
(2.20)
Hence, Theorem 3.1 implies (2.17) with $$V_{\theta _0,\eta }$$ as in (2.8).
Proof of (2.11). Recall that $$\beta _{k,\theta _0}=\beta _{-k,\theta _0}$$ and by (2.20),
\begin{aligned} \beta _{0,\theta _0}=(2\pi )^{-1}\int _\Pi \nabla _{\theta }\log s_{\theta _{0}}(u){\hbox {d}}u= 0. \end{aligned}
Therefore, for i.i.d. random variables $$\{\eta _t\}$$, $$V_{\theta _0,\eta }$$ in (2.10) takes the form
\begin{aligned} V_{\theta _0,\eta }=4\sigma _0^4 \,\sum _{k=1}^\infty \beta _{k,\theta _0}\beta _{k,\theta _0}^\prime =2\sigma _0^4 \,\sum _{k=-\infty }^\infty \beta _{k,\theta _0}\beta _{k,\theta _0}^\prime . \end{aligned}
(2.21)
Hence, using definition (2.8) of $$\beta _{k,\theta _0}$$, by Parseval’s identity, we obtain (2.11):
\begin{aligned} V_{\theta _0,\eta }=\pi ^{-1}\sigma _0^4\int _\Pi \nabla _{\theta }\log s_{\theta _{0}}(u)\nabla _{\theta }^\prime \log s_{\theta _{0}}(u) {\hbox {d}}u =\pi ^{-1}\sigma _0^4\, W_{\theta _0}. \end{aligned}
(2.22)
Finally, under assumption $$E[\eta _0^2\eta _k\eta _s]=0$$ for any $$s<k<0$$, (2.12) follows straightforwardly from the definition of $$V_{\theta _0,\eta }$$ in (2.8) noting that $$E[\eta _{-k}^2\eta _0^2]=(E[\eta _{-k}^2])^2+\mathrm{cov}(\eta ^2_0, \eta _{-k}^2)$$ and using (2.21) and (2.22). This completes the proof of the theorem. $$\square$$

## 3 Quadratic forms of m.d. noise

We study here the quadratic form
\begin{aligned} S_n=\sum _{t,s=1}^n g_{t-s} X_tX_s \end{aligned}
(3.1)
with symmetric real weights $$g_k=g_{-k}$$, $$k \in {\mathbb Z}$$. We assume that
\begin{aligned} X_t=\sum _{j=0}^\infty a_j \eta _{t-j}, \qquad \sum _{j=0}^\infty |a_j|<\infty , \end{aligned}
(3.2)
is a linear process and $$\{\eta _t\}$$ is a stationary ergodic m.d. sequence satisfying (2.1). Such a quadratic form appeared in (2.18).

Quadratic forms appear in numerous statistical applications. Asymptotic normality for quadratic forms of linear processes was widely investigated in the statistical and probabilistic literature, see e.g., Hannan (1973), Fox and Taqqu (1987), Giraitis and Surgailis (1990), Robinson (1995), Bhansali et al. (2007) and others. Sufficient general conditions for asymptotic normality of quadratic forms in i.i.d. random variables were established in Rotar (1973), De Jong (1987) and Guttorp and Lockhart (1988). For quadratic forms in m.d. random variables such conditions were derived in Giraitis et al. (2016).

For simplicity set $$a_j=0$$ for $$j< 0$$. Denote
\begin{aligned} \gamma _k=\sum _{j\in {\mathbb Z}}a_ja_{j+k}, \quad \beta _k= \sum _{j\in {\mathbb Z}} g_j \gamma _{j+k}, \quad k \in {\mathbb Z}. \end{aligned}
(3.3)
The following theorem establishes the asymptotic normality of a quadratic form $$S_n$$. Its proof is based on the results of the paper by Giraitis et al. (2016), whose application requires additional technical effort.

### Theorem 3.1

Suppose that the $$g_k$$s and $$a_k$$s are such that
\begin{aligned} (a)\,\,\sum _{k\in {\mathbb Z}} |g_k|<\infty , \quad (b) \,\,\sum _{k=1}^\infty ka_k^2 <\infty , \quad (c)\,\,\beta _0 =0. \end{aligned}
(3.4)
Then the quadratic form $$S_n$$ in (3.1) satisfies
\begin{aligned}&n^{-1/2}(S_n-ES_n)\rightarrow _D\mathcal{N}(0, v^2), \quad v^2:= 4E\left[ \left( \sum _{k=1}^\infty \beta _k \eta _{-k}\right) ^2 \eta _0^2\right] ,\end{aligned}
(3.5)
\begin{aligned}&n^{-1/2}ES_n\rightarrow 0. \end{aligned}
(3.6)
Moreover, $$v^2$$ has the following properties.
(a) If, in addition, $$\{\eta _t\}$$ is a sequence of i.i.d. random variables, then
\begin{aligned} v^2= 4(E\eta _0^2)^2\sum _{k=1}^\infty \beta _k^2. \end{aligned}
(3.7)

(b) If, in addition, $$E[\eta _0^2\eta _{s}\eta _{k}]=0$$ for any $$s<k<0$$, then
\begin{aligned} v^2=4\sum _{k=1}^\infty \beta _k^2E[\eta _0^2 \eta _{-k}^2]=4(E\eta _0^2)^2\sum _{k=1}^\infty \beta _k^2 +4\sum _{k=1}^\infty \beta _k^2 \, \mathrm{cov}(\eta _0^2, \eta _{-k}^2). \end{aligned}
(3.8)

(c) If, in addition, there exists a constant $$c>0$$ such that
\begin{aligned} E[\eta _k^2|\mathcal{F}_{k-1}]\ge c>0, \,\,\,k\in {\mathbb Z},\,\,\, a.s., \end{aligned}
(3.9)
then there exists $$c_0>0$$ such that
\begin{aligned} v^2\ge c_0\sum _{k=1}^\infty \beta _k^2>0. \end{aligned}
(3.10)

### Remark 3.1

Suppose that $$g_k=(2\pi )^{-1}\int _\Pi {\hbox {e}}^{ \i kx} \hat{g}(x){\hbox {d}}x,$$ $$k\in {\mathbb Z},$$ where $$\hat{g}(x),\, x\in \Pi$$ is an even continuous bounded function. Notice that $$\gamma _k$$ in (3.3) can be expressed as $$\gamma _k=(2\pi )^{-1}\int _\Pi {\hbox {e}}^{ \i kx} s(x){\hbox {d}}x$$ with $$s(x)=|\sum _{j=0}^\infty {\hbox {e}}^{ \i jx}a_j|^2$$, $$x\in \Pi$$. Then, by Parseval’s identity,
\begin{aligned}&\beta _k= \sum _{j\in {\mathbb Z}} g_j \gamma _{j+k}=(2\pi )^{-1}\int _\Pi {\hbox {e}}^{ \i kx} \hat{g}(x) \, s(x){\hbox {d}}x,\quad k\in {\mathbb Z}\end{aligned}
(3.11)
and the condition $$\beta _0=0$$ is equivalent to
\begin{aligned} \int _\Pi \hat{g}(x) \, s(x){\hbox {d}}x=0. \end{aligned}
(3.12)
Furthermore, for an i.i.d. noise $$\eta _j$$, by Parseval’s identity, $$v^2$$ in (3.7) can be written as
\begin{aligned} v^2=2(E\eta _0^2)^2\sum _{k\in {\mathbb Z}} \beta _k^2=(E\eta _0^2)^2\pi ^{-1}\int _\Pi \hat{g}^2(x) \, s^2(x){\hbox {d}}x. \end{aligned}
(3.13)
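The identity (3.11) and the role of condition (3.12) can be illustrated numerically. The sketch assumes an MA(1) process with weights $$a_0=1$$, $$a_1=0.4$$ and a weight sequence with only $$g_0$$ and $$g_{\pm 1}$$ non-zero, where $$g_0$$ is chosen so that $$\beta _0=0$$. These values, the helper names and the Riemann-sum grid are illustrative assumptions.

```python
import numpy as np

def gamma_from_a(a):
    """gamma_k = sum_j a_j a_{j+k} of (3.3) for finitely many weights a_0..a_q."""
    a = np.asarray(a, dtype=float)
    q = len(a) - 1
    return {k: float(np.sum(a[: len(a) - abs(k)] * a[abs(k):]))
            for k in range(-q, q + 1)}

def beta_time(g, gamma, k):
    """beta_k = sum_j g_j gamma_{j+k}, the time-domain form in (3.3)."""
    return sum(gj * gamma.get(j + k, 0.0) for j, gj in g.items())

def beta_spectral(g, a, k, m=1024):
    """beta_k = (2 pi)^{-1} int e^{ikx} g_hat(x) s(x) dx, by a Riemann sum."""
    x = -np.pi + 2 * np.pi * np.arange(m) / m
    g_hat = sum(gj * np.cos(j * x) for j, gj in g.items())   # g is symmetric
    s = np.abs(sum(aj * np.exp(1j * j * x) for j, aj in enumerate(a))) ** 2
    return np.mean(np.cos(k * x) * g_hat * s)                # integrand is even

a = [1.0, 0.4]
gamma = gamma_from_a(a)
g1 = 0.5
g0 = -2 * g1 * gamma[1] / gamma[0]       # enforces beta_0 = 0, i.e. (3.12)
g = {-1: g1, 0: g0, 1: g1}
betas_time = [beta_time(g, gamma, k) for k in range(4)]
betas_spec = [beta_spectral(g, a, k) for k in range(4)]
```

Both lists agree up to rounding error, and their first entry is zero by construction, so condition (c) of Theorem 3.1 holds for this choice of weights.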

### Example 3.1

Consider the m.d. noise $$\eta _t=\varepsilon _{t}\varepsilon _{t-1},$$ where $$\{\varepsilon _t\}$$ are independent Gaussian random variables with zero mean and variance $$E\varepsilon _0^2=\sigma _\varepsilon ^2$$, see Example 2.1. Then $$\mathrm{cov}(\eta _0^2, \eta _{-1}^2)=2\sigma _0^4$$ and $$\mathrm{cov}(\eta _0^2, \eta _{-k}^2)=0$$ for $$k\ge 2.$$ Thus, (3.8) becomes
\begin{aligned} v^2=4\sigma _0^4\sum _{k=1}^\infty \beta _k^2+8\sigma _0^4 \beta _1^2>4\sigma _0^4\sum _{k=1}^\infty \beta _k^2. \end{aligned}
Hence, compared to an i.i.d. noise, dependence in the noise $$\eta _j$$ increased the variance $$v^2$$.
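The two covariance values used in this example can be checked by simulation. The sketch below draws a long path of $$\eta _t=\varepsilon _t\varepsilon _{t-1}$$ with standard Gaussian $$\varepsilon _t$$ (so that $$\sigma _0^2=1$$) and computes sample covariances of the squares. The sample size, seed and function names are arbitrary choices made for illustration.

```python
import numpy as np

def product_noise(n, sigma_eps=1.0, seed=0):
    """eta_t = eps_t * eps_{t-1} with i.i.d. N(0, sigma_eps^2) variables eps_t."""
    rng = np.random.default_rng(seed)
    eps = sigma_eps * rng.standard_normal(n + 1)
    return eps[1:] * eps[:-1]

def cov_of_squares(eta, k):
    """Sample version of cov(eta_0^2, eta_{-k}^2) for a lag k >= 1."""
    e2 = eta ** 2
    return np.mean(e2[k:] * e2[:-k]) - np.mean(e2) ** 2

eta = product_noise(1_000_000, seed=3)
cov1 = cov_of_squares(eta, 1)   # theory: 2 * sigma_0^4 = 2 when sigma_eps = 1
cov2 = cov_of_squares(eta, 2)   # theory: 0
```

Because the squares have heavy tails, the sample covariances converge slowly, so a large sample is used and only rough agreement with the theoretical values should be expected.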

### Proof of Theorem 3.1

We shall start with the proof of (3.5). Set $$S_{\eta , n}=\sum _{j,k=1}^n \beta _{j-k}\eta _j\eta _k$$. We shall show that
\begin{aligned}&S_n-ES_n=S_{\eta , n}+o_p(n^{1/2}),\end{aligned}
(3.14)
\begin{aligned}&n^{-1/2} S_{\eta , n}\rightarrow _D\mathcal{N}(0,v^2), \end{aligned}
(3.15)
which implies (3.5).
First we verify (3.14). Write $$S_n$$ as
\begin{aligned} S_n=\sum _{j,k=-\infty }^nc_{n,jk}\eta _j\eta _k \quad \text{ with }\quad c_{n,jk}=\sum _{t,s=1}^ng_{t-s}a_{t-j}a_{s-k}. \end{aligned}
(3.16)
Then we split $$S_n$$ into two sums
\begin{aligned} S_n=S_n^{\Delta }+S_n^{o},\quad S_n^{\Delta }:= \sum _{j=-\infty }^nc_{n,jj}\eta _j^2, \quad S_n^{o}:=\sum _{j,k=-\infty : \, j\ne k}^n c_{n,jk}\eta _j\eta _k, \end{aligned}
where $$S_n^{\Delta }$$ is the sum of the diagonal terms and $$S_n^{o}$$ is the quadratic form with zero diagonal.
To verify (3.14), it suffices to show that
\begin{aligned}&E|S_n^{\Delta }|=o\left( n^{1/2}\right) ,\end{aligned}
(3.17)
\begin{aligned}&E\left( S_n^{o}-S_{\eta ,n}\right) ^2=o(n). \end{aligned}
(3.18)
We have $$E|S_n^{\Delta }|\le E\eta _1^2\sum _{j=-\infty }^n|c_{n,jj}|=o(n^{1/2})$$ by (4.1) which proves (3.17) and (3.6). To show (3.18), write
\begin{aligned} S_n^{o}-S_{\eta ,n}=\sum _{j,k=-\infty : \, j\ne k}^n a_{n,jk}\eta _j\eta _k, \quad a_{n,jk}:=c_{n,jk}- \beta _{j-k}I(1\le j,k\le n). \end{aligned}
Then by Lemma 3.1(ii) below,
\begin{aligned}&E(S_n^{o}-S_{\eta ,n})^2\le C\sum _{j,k=-\infty : \, j\ne k}^n a_{n,jk}^2=C\sum _{j,k=-\infty : \, j\ne k}^n \big (c_{n,jk}- \beta _{j-k}I(1\le j,k\le n)\big )^2 \\&\le C\left\{ \sum _{j,k=1: \, j\ne k}^n \left( c_{n,jk}- \beta _{j-k}\right) ^2 +\sum _{j=-\infty }^0\sum _{k=-\infty }^n c_{n,jk}^2\right\} =o(n) \end{aligned}
by (4.1) and (4.2) of Lemma 4.1 below.
It remains to show (3.15). Observe that $$ES_{\eta , n}=0$$ and
\begin{aligned} \sum _{k\in {\mathbb Z}} |\beta _k|\le \sum _{j\in {\mathbb Z}} |g_j|\sum _{k\in {\mathbb Z}} |\gamma _k|\le \sum _{j\in {\mathbb Z}} |g_j|\left( \sum _{k=0}^\infty |a_k|\right) ^2<\infty . \end{aligned}
Therefore, under the additional Assumption (3.9) on the m.d. noise $$\eta _t$$, Corollary 1.1 (i) in Giraitis et al. (2016) implies that
\begin{aligned} \mathrm{var}(S_{\eta , n})^{-1/2} S_{\eta , n}\rightarrow _D\mathcal{N}(0,1). \end{aligned}
(3.19)
Assumption (3.9) is required only to show that $$\mathrm{var}(S_{\eta , n})$$ has property
\begin{aligned} \mathrm{var}(S_{\eta , n})\ge cB_n^2, \quad n \rightarrow \infty , \quad B_n^2:=\sum _{j,k=1}^n\beta _{j-k}^2 \end{aligned}
(3.20)
for some $$c>0,$$ where $$B_n$$ is the Euclidean norm of the (symmetric) matrix $$(\beta _{j-k},\,j,k=1, \ldots ,n)$$. Since $$B_n^2 \sim n \sum _{j\in {\mathbb Z}}\beta _j^2$$, condition (3.20) is equivalent to
\begin{aligned} \mathrm{var}(S_{\eta , n})\ge cn, \quad n \rightarrow \infty \quad (\exists c>0). \end{aligned}
(3.21)
Therefore, Assumption (3.9) can be replaced by (3.21).
To verify (3.21), we will show that under the assumptions of our theorem,
\begin{aligned} n^{-1}\mathrm{var}(S_{\eta , n})\rightarrow v^2, \quad v^2=4E\left[ \left( \sum _{k=1}^\infty \beta _k \eta _{-k}\right) ^2 \eta _0^2\right] . \end{aligned}
(3.22)
If $$v^2>0$$, then (3.21) is valid and (3.19) holds which in turn implies (3.15).

Finally, if $$v^2=0$$, then the normal approximation $$n^{-1/2}S_{\eta , n}\rightarrow 0=\mathcal{N}(0,0)$$ in (3.15) holds with a degenerate limit.

Proof of (3.22). Since $$\eta _t$$ is a stationary m.d. sequence, $$E\eta _t^4<\infty$$, and $$\beta _0=0$$, then
\begin{aligned}&ES_{\eta , n}^2 =E\left( 2\sum _{j=1}^n \eta _{j}\sum _{k=1}^{j-1} \beta _{j-k} \eta _{k}\right) ^2=4\sum _{j=1}^n E\left[ \eta ^2 _{j}\left( \sum _{k=1}^{j-1} \beta _{j-k}\eta _{k}\right) ^2\right] \\&=4\sum _{j=1}^n E\left[ \eta ^2 _{j}\left( \sum _{l=1}^{j-1} \beta _{l}\eta _{j-l}\right) ^2\right] =4\sum _{j=1}^n E\left[ \eta ^2 _{0}\left( \sum _{l=1}^{j-1} \beta _{l}\eta _{-l}\right) ^2\right] =4\sum _{j=1}^n \left( \frac{v^2}{4} +r_j \right) , \end{aligned}
where
\begin{aligned} r_j:=E\left[ \eta ^2 _{0}\left[ \left( \sum _{l=1}^{j-1} \beta _{l}\eta _{-l}\right) ^2-\left( \sum _{l=1}^\infty \beta _{l}\eta _{-l}\right) ^2\right] \right] . \end{aligned}
Hence, $$n^{-1}ES_{\eta , n}^2=v^2+4R_n$$, where $$R_n=n^{-1}\sum _{j=1}^n r_j$$. To verify (3.22), it suffices to show that
\begin{aligned} R_n=o(1). \end{aligned}
Indeed, using equality $$a^2-b^2=(a-b)^2+2(a-b)b$$ with $$a=\sum _{l=1}^{j-1} \beta _{l}\eta _{-l}$$ and $$b=\sum _{l=1}^\infty \beta _{l}\eta _{-l}$$, we obtain
\begin{aligned} |r_j|\le & {} E\left[ \eta ^2 _{0} \,\left\{ \left( \sum _{l=j}^\infty \beta _{l}\eta _{-l}\right) ^2 +2 \left|\sum _{l=j}^\infty \beta _{l}\eta _{-l} \right| \left|\sum _{l=1}^\infty \beta _{l}\eta _{-l} \right| \right\} \right] \\\le & {} \left( E\eta ^4 _{0}\right) ^{1/2} \left\{ \left( E\left( \sum _{l=j}^\infty \beta _{l}\eta _{-l}\right) ^4\right) ^{1/2} +2\left( E\left( \sum _{l=j}^\infty \beta _{l}\eta _{-l}\right) ^4\right) ^{1/4}\left( E\left( \sum _{l=1}^\infty \beta _{l}\eta _{-l}\right) ^4\right) ^{1/4}\right\} . \end{aligned}
Denote $$P_j=\sum _{l=j}^\infty \beta ^2_{l}$$, $$j\ge 1$$. By (3.23),
\begin{aligned} E\left( \sum _{l=j}^\infty \beta _{l}\eta _{-l}\right) ^4\le CP_j^2, \quad E\left( \sum _{l=1}^\infty \beta _{l}\eta _{-l}\right) ^4\le CP_1^2. \end{aligned}
Hence, $$|r_j|\le C\{P_j+P_{j}^{1/2}\}$$ for $$j\ge 1$$. Since $$P_j\le C\sum _{l=j}^\infty |\beta _{l}|\rightarrow 0$$ as $$j\rightarrow \infty$$, then
\begin{aligned} |R_n|\le n^{-1}\sum _{j=1}^n|r_j|\le Cn^{-1}\sum _{j=1}^n\left\{ P_j+P_{j}^{1/2}\right\} =o(1). \end{aligned}
This proves (3.22) and completes the proof of (3.5).

The claim (3.7) is obvious. Equality (3.8) follows straightforwardly noting that $$E[\eta _0^2 \eta _{-k}^2]=(E\eta _0^2)^2+ \mathrm{cov}(\eta _0^2, \eta _{-k}^2).$$

To verify (3.10), notice that under (3.9),
\begin{aligned} E\left[ \eta ^2 _{0} \,\left( \sum _{l=1}^\infty \beta _{l}\eta _{-l}\right) ^2\right] =E\left[ E\left[ \eta ^2 _{0}|\mathcal{F}_{-1}\right] \, \left( \sum _{l=1}^\infty \beta _{l}\eta _{-l}\right) ^2\right] \ge c E\left[ \left( \sum _{l=1}^\infty \beta _{l}\eta _{-l}\right) ^2\right] =cE\eta _1^2 \, \sum _{l=1}^\infty \beta ^2_{l} \end{aligned}
which implies (3.10). This completes the proof of the theorem. $$\square$$

The following lemma was used in the proof of Theorem 3.1 to show (3.18).

### Lemma 3.1

(i) If the m.d. sequence $$\eta _t$$ satisfies $$\max _t E|\eta _t|^p<\infty$$, for some $$p\ge 2$$, then
\begin{aligned} E\left| \sum _{j\in {\mathbb Z}}d_j\eta _j\right| ^p\le C\left( \sum _{j\in {\mathbb Z}}d_j^2\right) ^{p/2}, \end{aligned}
(3.23)
for any $$d_j$$s such that $$\sum _{j\in {\mathbb Z}}d_j^2<\infty$$, where $$C<\infty$$.

(ii) If in addition, $$p\ge 4$$, then
\begin{aligned}&E\left( \sum _{j,k=-\infty :\, j\ne k}^\infty a_{jk}\eta _{j}\eta _{k}\right) ^2\le C\sum _{j, k=-\infty }^\infty a_{jk}^2 \end{aligned}
(3.24)
for any $$a_{jk}$$s such that $$\sum _{j,k\in {\mathbb Z}: \, j\ne k} a_{jk}^2<\infty$$, where $$C<\infty$$.

### Proof of Lemma 3.1

(i) The bound (3.23) is known, see e.g., Lemma 2.5.2 in Giraitis et al. (2012). (ii) Since $$\eta _t$$ is a m.d. sequence, then by (3.23),
\begin{aligned}&E\left( \sum _{j,k=-\infty :\, j\ne k}^\infty a_{jk}\eta _{j}\eta _{k}\right) ^2\le E \left( \sum _{j=-\infty }^\infty \eta _{j}\sum _{k=-\infty }^{j-1}(a_{jk}+a_{kj})\eta _{k}\right) ^2\\&\qquad \le \sum _{j=-\infty }^\infty E\left[ \eta _{j}^2\left( \sum _{k=-\infty }^{j-1}(a_{jk}+a_{kj})\eta _{k}\right) ^2\right] \le \sum _{j=-\infty }^\infty \left( E \eta ^4_{j}\right) ^{1/2}\left( E\left( \sum _{k=-\infty }^{j-1}(a_{jk}+a_{kj})\eta _{k}\right) ^4\right) ^{1/2}\\&\qquad \le C\sum _{j=-\infty }^\infty \sum _{k=-\infty }^{j-1}(a_{jk}+a_{kj})^2\le C\sum _{j, k=-\infty }^\infty a_{jk}^2. \end{aligned}
This completes the proof of the lemma. $$\square$$
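
As a sanity check of the bound (3.23) with $$p=4$$, the following minimal sketch evaluates the fourth moment exactly in one special case: i.i.d. random signs $$\eta _j=\pm 1$$, which form an m.d. sequence. The weights `d` and the constant $$C=3$$ are illustrative choices made here; they are not taken from the paper.

```python
from itertools import product

def fourth_moment_rademacher(d):
    """Exact E(sum_j d_j * eta_j)^4 for i.i.d. signs eta_j = +/-1 (probability 1/2 each)."""
    total = 0.0
    for signs in product((-1.0, 1.0), repeat=len(d)):
        s = sum(dj * ej for dj, ej in zip(d, signs))
        total += s ** 4
    return total / 2 ** len(d)

d = [0.9 ** j for j in range(10)]            # hypothetical square-summable weights
lhs = fourth_moment_rademacher(d)            # left-hand side of (3.23) with p = 4
sum_sq = sum(x * x for x in d)
closed_form = 3 * sum_sq ** 2 - 2 * sum(x ** 4 for x in d)   # exact value for random signs
rhs = 3 * sum_sq ** 2                        # right-hand side of (3.23) with C = 3
print(lhs, closed_form, rhs)
```

In this special case the enumeration agrees with the closed form $$3(\sum _j d_j^2)^2-2\sum _j d_j^4$$, so (3.23) holds with $$C=3$$.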

## 4 Properties of the weights

In this section, we derive auxiliary results on the properties of the weights $$c_{n,jk}$$ and $$\beta _{k}$$ given in (3.16) and (3.3), which were used in the proof of Theorem 3.1.

### Lemma 4.1

Let the $$g_k$$s and $$a_k$$s satisfy (3.4) a–c. Then,
\begin{aligned}&\sum _{j=-\infty }^n|c_{n,jj}|=o\left( n^{1/2}\right) ,\end{aligned}
(4.1)
\begin{aligned}&\sum _{j=-\infty }^0\sum _{k=-\infty }^n c^2_{n,jk}=o(n),\end{aligned}
(4.2)
\begin{aligned}&\sum _{j,k=1}^n(c_{n,jk}-\beta _{j-k})^2=o(n). \end{aligned}
(4.3)

### Proof

To prove (4.1), write
\begin{aligned} \sum _{j=-\infty }^n|c_{n,jj}|=\sum _{j=1}^n|c_{n,jj}|+\sum _{j=-\infty }^0|c_{n,jj}|=:s_{n,1}+s_{n,2}. \end{aligned}
It remains to show that as $$n\rightarrow \infty$$,
\begin{aligned} s_{n,k}=o\left( n^{1/2}\right) , \quad k=1,2. \end{aligned}
(4.4)
First we consider $$s_{n,1}$$. To evaluate $$c_{n,jj}$$ for $$1\le j\le n$$, recall that $$a_j=0$$ for $$j<0$$. After the change of summation variables $$u=t-j$$, $$v=s-j$$, we obtain
\begin{aligned} c_{n,jj} = \sum _{t,s=1}^ng_{t-s}a_{t-j}a_{s-j} = \sum _{u,v=0}^{n-j}g_{u-v}a_{u}a_{v}. \end{aligned}
Observe that
\begin{aligned} \sum _{u,v=0}^\infty g_{u-v}a_{u}a_{v}=\sum _{s \in {\mathbb Z}} g_s \gamma _s=\beta _0=0, \end{aligned}
where the last equality holds by Assumption (3.4)c. Hence,
\begin{aligned} |c_{n,jj}|= & {} |c_{n,jj}-\beta _0| \\= & {} |\left( \sum _{u,v=0}^{n-j}-\sum _{u,v=0}^\infty \right) g_{u-v}a_{u}a_{v}|\\\le & {} \left( \sum _{u=n-j+1}^\infty \sum _{v=0}^\infty + \sum _{u=0}^\infty \sum _{v=n-j+1}^\infty \right) |g_{u-v}a_{u}a_{v}|. \end{aligned}
This, together with the change of the summation order
\begin{aligned} \sum _{j=1}^n \sum _{u=n-j+1}^\infty = \sum _{u=1}^\infty \sum _{j=\max (n-u+1,1)}^n \end{aligned}
yields
\begin{aligned} s_{n,1}=\sum _{j=1}^n|c_{n,jj}|\le 2\sum _{j=1}^n\sum _{u=n-j+1}^\infty \sum _{v=0}^\infty |g_{u-v}a_{u}a_{v}| \le 2\sum _{u=1}^\infty \sum _{v=0}^\infty |g_{u-v}(u\wedge n) a_{u}a_{v}|, \end{aligned}
where $$u\wedge n=\min (u,n)$$. Recall the inequality
\begin{aligned} \sum _{u,v\in {\mathbb Z}} |f_{u-v}h_u\nu _v|\le \left( \sum _{u\in {\mathbb Z}} |f_{u}|\right) \left( \sum _{s\in {\mathbb Z}} h_s^2\sum _{v\in {\mathbb Z}} \nu ^2_v \right) ^{1/2}, \end{aligned}
(4.5)
which holds for any sequences $$(f_t)$$, $$(h_t)$$ and $$(\nu _t)$$ of real numbers such that the right hand side (r.h.s.) of (4.5) is finite. Applying (4.5) with $$f_{u-v}=g_{u-v}$$, $$h_u= (u\wedge n) |a_{u}|$$ and $$\nu _v= |a_{v}|$$, we obtain
\begin{aligned} s_{n,1}\le \left( \sum _{u\in {\mathbb Z}} |g_{u}|\right) \left( \sum _{s=0}^\infty (s\wedge n)^2 a^2_{s}\right) ^{1/2} \left( \sum _{v=0}^\infty a_{v}^2 \right) ^{1/2}\le C\left( \sum _{s=1}^\infty (s\wedge n)^2 a^2_{s}\right) ^{1/2} \end{aligned}
since by (3.4) a, b, $$\sum _{u\in {\mathbb Z}} |g_{u}|<\infty$$ and $$\sum _{v=0}^\infty a_{v}^2<\infty$$. Set $$L=\log n$$. We shall bound $$(s \wedge n)^2 \le ns$$ for $$s\ge L$$; $$(s \wedge n)^2 \le Ls$$ for $$1\le s< L$$. Then,
\begin{aligned} s_{n,1}\le C \left( L\sum _{s=1}^{L-1} s a^2_{s} \right) ^{1/2}+C \left( n\sum _{s=L}^\infty s a^2_{s} \right) ^{1/2}=o\left( n^{1/2}\right) \end{aligned}
since $$\sum _{s=L}^\infty s a^2_{s}=o(1)$$ as $$L\rightarrow \infty$$ by (3.4) b.
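
The change of summation order used above replaces the inner range $$u\ge n-j+1$$ by the multiplicity $$u\wedge n$$. A small numerical check, sketched below, makes the counting explicit. The sequence `f` and the truncation point `U` are hypothetical stand-ins for the summable terms in the proof.

```python
def double_sum(f, n, U):
    """sum_{j=1}^n sum_{u=n-j+1}^U f(u): the original order of summation."""
    return sum(f(u) for j in range(1, n + 1) for u in range(n - j + 1, U + 1))

def weighted_sum(f, n, U):
    """sum_{u=1}^U min(u, n) * f(u): the same sum after interchanging the order."""
    return sum(min(u, n) * f(u) for u in range(1, U + 1))

def f(u):
    """Hypothetical summable sequence standing in for the terms of the proof."""
    return 0.7 ** u

n, U = 12, 60                      # U >= n truncates the infinite sum for the check
left = double_sum(f, n, U)
right = weighted_sum(f, n, U)
print(left, right)
```

Each index $$u$$ is counted once for every $$j$$ with $$\max (n-u+1,1)\le j\le n$$, that is, $$\min (u,n)$$ times, which is why the two sums agree.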
Next, we consider $$s_{n,2}$$. To evaluate $$c_{n,jj}$$ for $$j \le 0$$, we apply inequality (4.5) with $$f_{u}=g_{u}$$, $$h_t=a_{t-j}I(1\le t\le n)$$ and $$\nu _s=a_{s-j}I(1\le s\le n)$$ which yields
\begin{aligned} |c_{n,jj}|= \left|\sum _{t,s=1}^ng_{t-s}a_{t-j}a_{s-j} \right|\le \left( \sum _{u\in {\mathbb Z}}|g_u|\right) \left( \sum _{t=1}^na^2_{t-j}\right) ^{1/2} \left( \sum _{s=1}^na^2_{s-j}\right) ^{1/2} \le C \left( \sum _{t=1}^na^2_{t-j}\right) . \end{aligned}
Observe that $$a^2_{t-j}\le a^2_{t-j}(t-j)/(1-j)$$ for $$j\le 0$$. Hence,
\begin{aligned}&s_{n,2}=\sum _{j=-\infty }^0|c_{n,jj}|\le C \sum _{j=-\infty }^0\sum _{t=1}^na^2_{t-j}\le C \sum _{j=-\infty }^0\frac{1}{1-j}\sum _{t=1}^na^2_{t-j}(t-j)\\&\quad \le \sum _{j=-\infty }^{-n}n^{-1}\sum _{t=1}^na^2_{t-j}(t-j)+ \left( \sum _{j=-n}^0\frac{1}{1-j}\right) \left( \sum _{t\in {\mathbb Z}}a^2_{t}|t|\right) \\&\quad \le Cn^{-1}\sum _{t=1}^n \left( \sum _{j\in {\mathbb Z}}a^2_{j}|j|\right) +C\log n\le C(1+\log n)=o\left( n^{1/2}\right) . \end{aligned}
This completes the proof of (4.4) and (4.1).
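
Inequality (4.5) is used repeatedly in this section. The sketch below checks it numerically for finitely supported sequences. The random sequences are arbitrary test inputs, not quantities from the paper.

```python
import math
import random

def lhs_45(f, h, nu):
    """sum_{u,v} |f_{u-v} h_u nu_v| for finitely supported sequences (dict: index -> value)."""
    return sum(abs(f.get(u - v, 0.0) * hu * nv)
               for u, hu in h.items() for v, nv in nu.items())

def rhs_45(f, h, nu):
    """(sum_u |f_u|) * (sum_s h_s^2 * sum_v nu_v^2)^{1/2}."""
    l1 = sum(abs(x) for x in f.values())
    l2 = math.sqrt(sum(x * x for x in h.values()) * sum(x * x for x in nu.values()))
    return l1 * l2

rng = random.Random(0)
f_seq = {k: rng.uniform(-1.0, 1.0) for k in range(-5, 6)}
h_seq = {k: rng.uniform(-1.0, 1.0) for k in range(20)}
nu_seq = {k: rng.uniform(-1.0, 1.0) for k in range(20)}
left_45 = lhs_45(f_seq, h_seq, nu_seq)
right_45 = rhs_45(f_seq, h_seq, nu_seq)
print(left_45, right_45)
```

The bound follows by applying the Cauchy–Schwarz inequality to the sum over $$v$$ for each fixed lag $$u-v$$ and then summing over the lags.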
Proof of (4.2). Recall notation $$\gamma _k= \sum _{j\in {\mathbb Z}} a_{j} a_{j+k}$$, and set $$\theta _{t, u}= \sum _{j=-\infty }^0 a_{t-j} a_{u-j}$$. Notice that $$\sum _{j\in {\mathbb Z}} a_{t-j} a_{u-j}=\gamma _{t-u}$$. Then,
\begin{aligned} i_n:=\sum _{j=-\infty }^0\sum _{k=-\infty }^n c_{n,jk}^2=\sum _{j=-\infty }^0\sum _{k=-\infty }^n \left( \sum _{t,s=1}^ng_{t-s}a_{t-j}a_{s-k}\right) ^2 \le \sum _{t,s, u,v=1}^ng_{t-s}g_{u-v} \theta _{t,u}\gamma _{s-v}. \end{aligned}
Applying inequality (4.5), we get, for any $$1\le t,u\le n$$,
\begin{aligned} \sum _{s,v=1}^n |\gamma _{s-v}g_{t-s}g_{u-v}|\le \left( \sum _{u\in {\mathbb Z}} |\gamma _{u}| \right) \left( \sum _{s\in {\mathbb Z}} g_s^2 \right) ^{1/2} \left( \sum _{v\in {\mathbb Z}} g^2_v \right) ^{1/2}, \end{aligned}
where the r.h.s. does not depend on $$t, \, u$$ and n. Observe that $$\sum _{u\in {\mathbb Z}} g^2_{u}<\infty$$ by (3.4) a, while $$A:=\sum _{j\in {\mathbb Z}} |a_j|<\infty$$ holds by (3.4) b and implies $$\sum _{v\in {\mathbb Z}} |\gamma _{v}|<\infty$$. Hence,
\begin{aligned}&i_n \le C\sum _{t, u=1}^n|\theta _{t,u}|= C\sum _{t, u=1}^n| \sum _{j=-\infty }^0 a_{t-j} a_{u-j}|. \end{aligned}
(4.6)
Set $$L=\log n$$. Then,
\begin{aligned} i_n\le & {} \sum _{t, u=1}^n \left\{ \sum _{j=-L+1}^{0} |a_{t-j} a_{u-j}|+\sum _{j=-\infty }^{-L} |a_{t-j} a_{u-j}| \right\} \\\le & {} \sum _{j=-L+1}^{0} \left( \sum _{t\in {\mathbb Z}}|a_{t-j}| \right) \left( \sum _{u\in {\mathbb Z}} |a_{u-j}|\right) +\sum _{t=1}^n\left[ \sum _{j=-\infty }^{-L} |a_{t-j}| \left( \sum _{u\in {\mathbb Z}}|a_{u-j}| \right) \right] \\\le & {} LA^2+\sum _{t=1}^n \left( \sum _{s=L}^\infty |a_{s}|\right) A =o(n) \end{aligned}
since $$\sum _{s=L}^\infty |a_{s}|\rightarrow 0$$ as $$n\rightarrow \infty$$. This proves (4.2).
Proof of (4.3). Observe that
\begin{aligned} \sum _{t,s\in {\mathbb Z}}g_{t-s}a_{t-j}a_{s-k}=\sum _{u\in {\mathbb Z}}g_u\gamma _{u+k-j}=\sum _{u\in {\mathbb Z}}g_u\gamma _{u+j-k}=\beta _{j-k}. \end{aligned}
Since $$a_l=0$$ for $$l<0$$, $$\beta _{j-k}=\sum _{t,s=1}^\infty g_{t-s}a_{t-j}a_{s-k}$$ for $$1\le j,k\le n$$. Hence,
\begin{aligned}&i_n^\prime :=\sum _{j,k=1}^n(c_{n,jk}-\beta _{j-k})^2=\sum _{j,k=1}^n\left( \left( \sum _{t,s=1}^n-\sum _{t,s=1}^\infty \right) g_{t-s}a_{t-j}a_{s-k}\right) ^2\\&\quad \le \sum _{j=1}^n\sum _{k\in {\mathbb Z}}\left( 2\sum _{t=n+1}^\infty \sum _{s=1}^\infty |g_{t-s}a_{t-j}a_{s-k}|\right) ^2. \end{aligned}
Denote $$\gamma _k^*= \sum _{j\in {\mathbb Z}} |a_{j} a_{j+k}|$$ and $$\theta _{n,t, u}^*= \sum _{j=1}^n |a_{t-j} a_{u-j}|$$. Then,
\begin{aligned} i_n^\prime \le 4\sum _{t,u=n+1}^\infty \sum _{s,v=1}^\infty |g_{t-s}g_{u-v}| \theta _{n,t,u}^*\gamma _{s-v}^*. \end{aligned}
As in the estimation of $$i_n$$ above, applying inequality (4.5), we obtain
\begin{aligned} \sum _{s,v=1}^\infty |\gamma ^*_{s-v}g_{t-s}g_{u-v}|\le \left( \sum _{u\in {\mathbb Z}} \gamma ^*_{u} \right) \left( \sum _{s\in {\mathbb Z}} g_s^2 \right) \le C<\infty \end{aligned}
uniformly in $$t, \, u$$ and n. Hence,
\begin{aligned}&i_n^\prime \le C\sum _{t, u=n+1}^\infty \theta ^*_{n,t,u}= C\sum _{t, u=n+1}^\infty \sum _{j=1}^n| a_{t-j} a_{u-j}|. \end{aligned}
(4.7)
As in the estimation of $$i_n$$ above, setting $$L=\log n,$$ we obtain
\begin{aligned} i_n^\prime\le & {} \sum _{t=n+1}^{n+L} \sum _{j\in {\mathbb Z}} |a_{t-j}| \left( \sum _{u\in {\mathbb Z}} |a_{u-j}| \right) +\sum _{j=1}^n\left[ \sum _{t=n+L+1}^\infty |a_{t-j}| \left( \sum _{u\in {\mathbb Z}}| a_{u-j}| \right) \right] \\\le & {} LA^2+\sum _{j=1}^n \left( \sum _{s=L}^\infty |a_{s}| \right) A =o(n) \end{aligned}
which proves (4.3). This completes the proof of the lemma. $$\square$$
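
To illustrate (4.3), the sketch below computes $$n^{-1}\sum _{j,k=1}^n(c_{n,jk}-\beta _{j-k})^2$$ for one concrete choice of weights. The choice is an assumption made here for illustration: $$a_m=\phi ^m$$ for $$m\ge 0$$ and $$g_0=-2\phi$$, $$g_{\pm 1}=1$$, which gives $$\beta _0=0$$ and $$\beta _m=\phi ^{|m|-1}$$ for $$m\ne 0$$. The normalization of $$g_k$$ in (3.16) may differ from the one used here.

```python
def gap_per_n(n, phi=0.5):
    """n^{-1} * sum_{j,k=1}^n (c_{n,jk} - beta_{j-k})^2 for the illustrative weights."""
    g = {-1: 1.0, 0: -2.0 * phi, 1: 1.0}     # assumed g_k, zero for |k| > 1

    def a(m):
        return phi ** m if m >= 0 else 0.0   # a_m = phi^m for m >= 0, zero otherwise

    def beta(m):
        return 0.0 if m == 0 else phi ** (abs(m) - 1)

    total = 0.0
    for j in range(1, n + 1):
        for k in range(1, n + 1):
            c = 0.0                          # c_{n,jk} = sum_{t,s=1}^n g_{t-s} a_{t-j} a_{s-k}
            for t in range(j, n + 1):        # a_{t-j} vanishes for t < j
                at = a(t - j)
                for lag, g_val in g.items(): # lag = t - s
                    s = t - lag
                    if 1 <= s <= n:
                        c += g_val * at * a(s - k)
            total += (c - beta(j - k)) ** 2
    return total / n

ratios = [gap_per_n(n) for n in (20, 40, 80)]
print(ratios)
```

For these weights the unnormalized sum stays bounded as $$n$$ grows, so the normalized quantity decreases roughly like $$1/n$$, in line with the $$o(n)$$ claim.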

## 5 Applications

We shall demonstrate the impact of the m.d. noise $$\eta _t$$ on Whittle estimation using examples of AR(1) and MA(1) processes.

First we consider an AR(1) process
\begin{aligned} X_t=\phi X_{t-1}+\eta _t, \end{aligned}
(5.1)
where $$|\phi |<1$$ and $$\eta _t$$ is a stationary ergodic m.d. noise satisfying (2.1). $$\{X_t\}$$ can be written as a stationary linear process
\begin{aligned} X_t=\sum _{j=0}^{\infty }\phi ^j\eta _{t-j}. \end{aligned}
(5.2)
It has the spectral density
\begin{aligned} f_{ \sigma _0^2, \phi }(u)=\frac{\sigma _0^2}{2\pi }s_{\phi }(u), \quad s_{\phi }(u)= \left| \sum _{j=0}^\infty \phi ^j {\hbox {e}}^{\i j u} \right|^2 =(1-2\phi \cos u +\phi ^2)^{-1} \end{aligned}
(5.3)
parametrized by the parameter $$\phi$$. Since
\begin{aligned} ({\hbox {d}}/{\hbox {d}}\phi )s_{\phi }^{-1}(u)=-2\cos u +2\phi , \quad ({\hbox {d}}^2/{\hbox {d}}\phi ^2)s_{\phi }^{-1}(u)=2, \end{aligned}
solving the equation $$({\hbox {d}}/{\hbox {d}}\phi )Q_n(\phi )=0$$ shows that the Whittle estimate
\begin{aligned} \hat{\phi }=\frac{\int _\Pi \cos (u)I_n(u){\hbox {d}}u}{\int _\Pi I_n(u){\hbox {d}}u}=\frac{\sum _{k=2}^nX_kX_{k-1}}{\sum _{k=1}^nX_k^2} \end{aligned}
(5.4)
is the sample correlation of $$\{X_t\}$$ at lag 1.
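
The two forms of the estimate in (5.4) can be compared numerically. The sketch below assumes the periodogram normalization $$I_n(u)=(2\pi n)^{-1}|\sum _{t=1}^nX_t{\hbox {e}}^{\i tu}|^2$$, which is not restated in this section, and uses i.i.d. Gaussian noise only to generate a test path. All function names are hypothetical.

```python
import cmath
import math
import random

def simulate_ar1(n, phi, rng, burn=200):
    """AR(1) path X_t = phi * X_{t-1} + eta_t with i.i.d. N(0,1) noise (illustrative)."""
    x, path = 0.0, []
    for t in range(n + burn):
        x = phi * x + rng.gauss(0.0, 1.0)
        if t >= burn:
            path.append(x)
    return path

def periodogram(x, u):
    """I_n(u) = |sum_t X_t e^{itu}|^2 / (2 pi n), the assumed normalization."""
    s = sum(xt * cmath.exp(1j * (t + 1) * u) for t, xt in enumerate(x))
    return abs(s) ** 2 / (2.0 * math.pi * len(x))

def whittle_ar1_frequency(x):
    """Ratio of integrals in (5.4), via a Riemann sum on 4n equispaced frequencies."""
    m = 4 * len(x)
    grid = [-math.pi + 2.0 * math.pi * i / m for i in range(m)]
    vals = [periodogram(x, u) for u in grid]
    return sum(math.cos(u) * v for u, v in zip(grid, vals)) / sum(vals)

def whittle_ar1_time(x):
    """Time-domain form in (5.4): sum_{k=2}^n X_k X_{k-1} / sum_{k=1}^n X_k^2."""
    return sum(x[k] * x[k - 1] for k in range(1, len(x))) / sum(v * v for v in x)

x_path = simulate_ar1(64, 0.5, random.Random(1))
phi_freq = whittle_ar1_frequency(x_path)
phi_time = whittle_ar1_time(x_path)
print(phi_freq, phi_time)
```

Both integrands are trigonometric polynomials of degree at most $$n$$, so the Riemann sum on $$4n$$ equispaced points reproduces the integrals up to rounding, and the two values coincide.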

Let $$\Theta =[-a,a],$$ where $$0<a<1$$. Clearly, the family of functions $$s_\phi (u)$$, $$\phi \in \Theta$$, $$u\in \Pi$$ satisfies Assumptions (a0) and (a1).

Theorem 2.2 implies the following result.

### Corollary 5.1

The Whittle estimator $$\hat{\phi }$$ given by (5.4) has the following properties:
\begin{aligned}&n^{1/2} \left( \hat{\phi }-\phi _0 \right) \rightarrow \mathcal{N}\left( 0, v^2_{\phi _0 , \eta }\right) , \nonumber \\&\qquad v^2_{\phi _0, \eta }:=\frac{E \left[ X_{-1}^2\, \eta _0^2 \right] }{\mathrm{var}^2(X_0)}= \left( 1-\phi _0^2 \right) + \frac{\mathrm{cov} \left( X_{-1}^2,\, \eta _0^2 \right) }{\mathrm{var}^2(X_0)}. \end{aligned}
(5.5)
(i) If the m.d. noise $$\{\eta _t\}$$ is an i.i.d. sequence, then
\begin{aligned} v^2_{\phi _0, \eta }=1-\phi _0^2. \end{aligned}
(5.6)

(ii) If the m.d. noise $$\{\eta _t=\varepsilon _t\varepsilon _{t-1}\}$$ is as in (2.13), then
\begin{aligned} v^2_{\phi _0, \eta }= \left( 1-\phi _0^2 \right) +2 \left( 1-\phi _0^2 \right) ^2. \end{aligned}
(5.7)

### Remark 5.1

Relations (5.5) and (5.7) show that the m.d. noise $$\{\eta _t\}$$ may have a strong impact on the variance $$v^2_{\phi _0, \eta }$$ of the estimate $$\hat{\phi }$$ and thus on the confidence intervals for $$\phi _0$$.
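
To give a sense of the size of this effect, the sketch below evaluates (5.6) and (5.7) for a few values of $$\phi _0$$ and reports by how much a nominal 95% confidence interval based on the i.i.d. formula would be too narrow. The function names and the normal quantile 1.96 are illustrative choices made here.

```python
import math

def v2_iid(phi):
    """Limit variance (5.6) under i.i.d. noise."""
    return 1.0 - phi ** 2

def v2_product_noise(phi):
    """Limit variance (5.7) for the noise eta_t = eps_t * eps_{t-1}."""
    return (1.0 - phi ** 2) + 2.0 * (1.0 - phi ** 2) ** 2

def ci_half_width(v2, n, z=1.96):
    """Half-width of the nominal 95% interval phi_hat +/- z * sqrt(v2 / n)."""
    return z * math.sqrt(v2 / n)

for phi in (0.0, 0.5, 0.9):
    ratio = ci_half_width(v2_product_noise(phi), 1000) / ci_half_width(v2_iid(phi), 1000)
    print(f"phi={phi}: iid v2={v2_iid(phi):.4f}, "
          f"m.d. v2={v2_product_noise(phi):.4f}, width ratio={ratio:.3f}")
```

The width ratio equals $$\sqrt{1+2(1-\phi _0^2)}$$ and does not depend on $$n$$, so the understatement of uncertainty does not vanish in large samples.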

The unknown variance $$v^2_{\phi _0, \eta }$$ in (5.5) can be estimated as follows. Recall that $$EX_t=0$$. Since $$\{\eta _t^2X_{t-1}^2\}$$ and $$\{X_{t-1}^2\}$$ are stationary ergodic sequences, then
\begin{aligned} \frac{n^{-1}\sum _ {k=2}^n \eta _k^2X_{k-1}^2}{\left( n^{-1}\sum _ {k=1}^nX_{k}^2\right) ^2}\rightarrow _p \frac{E\left[ \eta _0^2X_{-1}^2 \right] }{\left( E\left[ X_{0}^2\right] \right) ^2} = v^2_{\phi _0, \eta }. \end{aligned}
Hence, $$v^2_{\phi _0, \eta }$$ can be estimated by
\begin{aligned} \hat{v}^2_{\phi _0, \eta }:=\frac{n^{-1}\sum _ {k=2}^n\hat{\eta }_k^2X_{k-1}^2}{\left( n^{-1}\sum _ {k=1}^nX_{k}^2\right) ^2}\rightarrow _p v^2_{\phi _0, \eta }, \quad \hat{\eta }_k:=X_k-\hat{\phi }X_{k-1} \end{aligned}
(5.8)
which implies
\begin{aligned} \frac{n^{1/2}}{\sqrt{\hat{v}^2_{\phi _0, \eta }}} \left( \hat{\phi }-\phi _0 \right) \rightarrow \mathcal{N}(0, 1). \end{aligned}
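
The estimate (5.8) and the resulting studentized interval can be sketched as follows. The simulation assumes that $$\varepsilon _t$$ in $$\eta _t=\varepsilon _t\varepsilon _{t-1}$$ is i.i.d. standard normal, which is consistent with (5.7) but is an assumption about (2.13), which is not restated in this section. All function names are hypothetical.

```python
import math
import random

def simulate_ar1_product_noise(n, phi, rng, burn=500):
    """AR(1) driven by eta_t = eps_t * eps_{t-1}, eps_t i.i.d. N(0,1) (assumed)."""
    x, prev_eps, path = 0.0, rng.gauss(0.0, 1.0), []
    for t in range(n + burn):
        eps = rng.gauss(0.0, 1.0)
        x = phi * x + eps * prev_eps
        prev_eps = eps
        if t >= burn:
            path.append(x)
    return path

def whittle_ar1(x):
    """Estimate (5.4): lag-1 sample correlation."""
    return sum(x[k] * x[k - 1] for k in range(1, len(x))) / sum(v * v for v in x)

def v2_hat(x, phi_hat):
    """Estimate (5.8), built from residuals eta_hat_k = X_k - phi_hat * X_{k-1}."""
    n = len(x)
    num = sum((x[k] - phi_hat * x[k - 1]) ** 2 * x[k - 1] ** 2 for k in range(1, n)) / n
    den = (sum(v * v for v in x) / n) ** 2
    return num / den

rng = random.Random(12345)
x_md = simulate_ar1_product_noise(200_000, 0.5, rng)
phi_hat = whittle_ar1(x_md)
v2_est = v2_hat(x_md, phi_hat)
half = 1.96 * math.sqrt(v2_est / len(x_md))   # corrected half-width of the 95% interval
print(f"phi_hat={phi_hat:.4f}, v2_hat={v2_est:.3f}, "
      f"CI=({phi_hat - half:.4f}, {phi_hat + half:.4f})")
```

For $$\phi _0=0.5$$ the target value from (5.7) is 1.875, compared with 0.75 under i.i.d. noise. The estimate is noisy in moderate samples because it involves fourth-order products.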

### Proof of Corollary 5.1

Recall that
\begin{aligned} \gamma _{k, \phi }:= & {} (2\pi )^{-1}\int _\Pi {\hbox {e}}^{\i k u} s_{\phi }(u){\hbox {d}}u=\phi ^k \gamma _{0,\phi }, \quad k\ge 1; \,\,\, \gamma _{0,\phi }= \left( 1-\phi ^2 \right) ^{-1}. \end{aligned}
Thus, $$\beta _{k,\phi }$$'s in (2.8) can be written as
\begin{aligned} \beta _{k,\phi }= & {} -(2\pi )^{-1}\int _\Pi {\hbox {e}}^{\i k u}\nabla _{\phi } s^{-1}_{\phi }(u)s_{\phi }(u){\hbox {d}}u\nonumber \\= & {} (2\pi )^{-1}\int _\Pi {\hbox {e}}^{\i k u}(2\cos u-2\phi )s_{\phi }(u){\hbox {d}}u \nonumber \\= & {} (2\pi )^{-1}\int _\Pi {\hbox {e}}^{\i k u}({\hbox {e}}^{\i u}+{\hbox {e}}^{-\i u}-2\phi ) s_{\phi }(u){\hbox {d}}u\nonumber \\= & {} \gamma _0(\phi ^{k+1}+\phi ^{k-1}-2\phi ^{k+1})\nonumber \\= & {} \phi ^{k-1}, \quad k\ge 1. \end{aligned}
(5.9)
So, $$\sum _{j=1}^\infty \beta _{j,\phi _0}^2=(1-\phi _0^2)^{-1}=:B_{\phi _0}$$, and by (2.8) and (2.9), we obtain
\begin{aligned}&W_{\phi _0}=4\pi B_{\phi _0},\\&V_{\phi _0, \eta }=4E\left[ \left( \sum _{k=1}^\infty \beta _{k,\phi _0}\eta _{-k}\right) ^2 \eta _0^2\right] =4E\left[ \left( \sum _{k=0}^\infty \phi ^k_0\eta _{-k-1}\right) ^2 \eta _0^2\right] =4E\left[ X_{-1}^2 \eta _0^2\right] ,\\&\Omega _{\phi _0,\eta }=\sigma _0^{-4}B^{-2}_{\phi _0}E\left[ X_{-1}^2 \eta _0^2\right] = B^{-1}_{\phi _0}+ \sigma _0^{-4}B^{-2}_{\phi _0}\mathrm{cov}(X_{-1}^2,\, \eta _0^2). \end{aligned}
Notice that $$\sigma _0^{2}B_{\phi _0}=\mathrm{var}(X_0).$$ This together with (2.10) of Theorem 2.2 proves (5.5). Clearly, (5.5) implies (i) and (ii). $$\square$$
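
The closed form $$\beta _{k,\phi }=\phi ^{k-1}$$ in (5.9) can be checked by numerical integration. The sketch below evaluates the second line of (5.9) with an equispaced Riemann sum over $$\Pi =[-\pi ,\pi ]$$. The grid size `m` and the value of `phi` are arbitrary choices made here.

```python
import cmath
import math

def beta_ar1(k, phi, m=4096):
    """(2 pi)^{-1} * integral over [-pi, pi] of e^{iku} (2 cos u - 2 phi) s_phi(u) du."""
    total = 0.0 + 0.0j
    for i in range(m):
        u = -math.pi + 2.0 * math.pi * i / m
        s = 1.0 / (1.0 - 2.0 * phi * math.cos(u) + phi ** 2)   # s_phi(u) from (5.3)
        total += cmath.exp(1j * k * u) * (2.0 * math.cos(u) - 2.0 * phi) * s
    # (1 / 2 pi) * (2 pi / m) * sum; the imaginary part cancels by symmetry
    return (total / m).real

phi = 0.6
betas = [beta_ar1(k, phi) for k in range(1, 6)]
expected = [phi ** (k - 1) for k in range(1, 6)]
print(betas)
print(expected)
```

Because the integrand is smooth and periodic, the equispaced rule converges very quickly, and the two printed lists agree up to rounding.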
Next, we consider an MA(1) process
\begin{aligned} X_t=\eta _t-\theta \eta _{t-1}, \end{aligned}
(5.10)
where $$|\theta |<1$$ and $$\eta _t$$ is as in (5.1). This process has the spectral density
\begin{aligned} f_{ \sigma _0^2, \theta }(u)=\frac{\sigma _0^2}{2\pi }s_{\theta }(u), \quad s_{\theta }(u)=|1-\theta {\hbox {e}}^{ \i u}|^2 =1-2\theta \cos u +\theta ^2 \end{aligned}
(5.11)
parametrized by the parameter $$\theta$$.
Let $$\Theta =[-a,a],$$ where $$0<a<1$$. Since $$(d/d\theta )s_{\theta }(u)=-2\cos u +2\theta$$ and $$(d^2/d\theta ^2)s_{\theta }(u)=2,$$ the functions $$s_\theta (u)$$, $$\theta \in \Theta$$, $$u\in \Pi$$ satisfy Assumptions (a0) and (a1). Moreover, (5.9) implies that the weights $$\beta _{k,\theta _0}$$ in (2.8) satisfy
\begin{aligned} \beta _{k,\theta _0}:=(2\pi )^{-1}\int _\Pi {\hbox {e}}^{\i k u}\nabla _{\theta }\log s_{\theta _{0}}(u){\hbox {d}}u =-(2\pi )^{-1}\int _\Pi {\hbox {e}}^{\i k u}\nabla _{\theta }\log s^{-1}_{\theta _{0}}(u){\hbox {d}}u=-\theta ^{k-1}_0, \quad k \ge 1. \end{aligned}
Hence, by (2.8) and (2.9), setting $$Z_{t}=\sum _{k=0}^\infty \theta ^k_0\eta _{t-k}$$, we obtain
\begin{aligned}&W_{\theta _0}=4\pi \sum _{j=1}^\infty \beta _{j,\theta _0}^2=4\pi (1-\theta _0^2)^{-1}=:4\pi B_{\theta _0},\\&V_{\theta _0, \eta }=4E\left[ \left( \sum _{k=1}^\infty \beta _{k, \theta _0}\eta _{-k}\right) ^2 \eta _0^2\right] =4E\left[ \left( \sum _{k=0}^\infty \theta ^k_0\eta _{-k-1}\right) ^2 \eta _0^2\right] =4E\left[ Z_{-1}^2 \eta _0^2\right] ,\\&\Omega _{\theta _0,\eta }=\sigma _0^{-4}B^{-2}_{\theta _0}E\left[ Z_{-1}^2 \eta _0^2\right] = B^{-1}_{\theta _0}+ \sigma _0^{-4}B^{-2}_{\theta _0}\mathrm{cov}\left( Z_{-1}^2,\, \eta _0^2\right) . \end{aligned}
Since $$\sigma _0^{2}B_{\theta _0}=\mathrm{var}(Z_{0}),$$ (2.10) of Theorem 2.2 implies the following result.

### Corollary 5.2

The Whittle estimator $$\hat{\theta }$$ for the MA(1) process (5.10) has the following properties:
\begin{aligned}&n^{1/2}\left( \hat{\theta }-\theta _0\right) \rightarrow \mathcal{N} \left( 0, v^2_{\theta _0, \eta }\right) , \nonumber \\&\qquad v^2_{\theta _0, \eta }:=\frac{E \left[ Z_{-1}^2\, \eta _0^2 \right] }{\mathrm{var}^2(Z_0)}=(1-\theta _0^2)+ \frac{\mathrm{cov} \left( Z_{-1}^2,\, \eta _0^2 \right) }{\mathrm{var}^2(Z_0)}. \end{aligned}
(5.12)
(i) If the m.d. noise $$\{\eta _t\}$$ is an i.i.d. sequence, then $$v^2_{\theta _0, \eta }=1-\theta _0^2.$$

(ii) If the m.d. noise $$\{\eta _t=\varepsilon _t\varepsilon _{t-1}\}$$ is as in (2.13), then $$v^2_{\theta _0, \eta }=(1-\theta _0^2)+2(1-\theta _0^2)^2.$$

### Remark 5.2

The expressions for the asymptotic variances $$v^2_{\phi _0, \eta }$$ in (5.5) and $$v^2_{\theta _0, \eta }$$ in (5.12) for the parametric Whittle estimators of the AR(1) and MA(1) models are remarkably similar. For the AR(1) model, $$v^2_{\phi _0, \eta }$$ can be estimated by (5.8).

For the MA(1) model, a consistent estimate of $$v^2_{\theta _0, \eta }$$ can be constructed as follows. Inverting the MA(1) process (5.10), we obtain $$\eta _t=(1-\theta L)^{-1} X_t=\sum _{j=0}^\infty \theta ^j X_{t-j}$$, where $$L$$ is the backshift operator. Similarly,
\begin{aligned} Z_t=(1-\theta L)^{-1}\eta _t=(1-\theta L)^{-2}X_t= \left( \sum _{s=0}^\infty (s+1)\theta ^s L^s \right) X_t =\sum _{s=0}^\infty (s+1)\theta ^s X_{t-s}. \end{aligned}
Since $$\{Z_{t-1}^2\eta _t^2\}$$ and $$\{Z_{t}^2\}$$ are stationary ergodic processes, then
\begin{aligned} \frac{n^{-1}\sum _{j=2}^n Z_{j-1}^2\eta _j^2}{\left( n^{-1}\sum _{j=1}^n Z_{j}^2\right) ^2} \rightarrow _p \frac{E \left[ Z_{-1}^2\eta _0^2 \right] }{ \left( E\left[ Z_{0}^2\right] \right) ^2}=v^2_{\theta _0, \eta }. \end{aligned}
Setting $$\hat{Z}_t=\sum _{s=0}^t (s+1)\hat{\theta }^s X_{t-s}$$, $$\hat{\eta }_t=\sum _{s=0}^t \hat{\theta }^s X_{t-s},$$ we obtain the required estimate:
\begin{aligned} \hat{v}^2_{\theta _0,\eta }=\frac{n^{-1}\sum _{j=2}^n \hat{Z}_{j-1}^2\hat{\eta }_j^2}{\big (n^{-1}\sum _{j=1}^n \hat{Z}_{j}^2\big )^2} \rightarrow _p v^2_{\theta _0, \eta }. \end{aligned}
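
The truncated sums defining $$\hat{\eta }_t$$ and $$\hat{Z}_t$$ can be computed recursively, since $$\hat{\eta }_t=X_t+\hat{\theta }\hat{\eta }_{t-1}$$ and $$\hat{Z}_t=\hat{\eta }_t+\hat{\theta }\hat{Z}_{t-1}$$. The sketch below makes two assumptions of its own: unobserved pre-sample values are set to zero, and the true $$\theta _0$$ stands in for the Whittle estimate $$\hat{\theta }$$, which has no closed form for the MA(1) model. The i.i.d. Gaussian noise is used only to generate a test path.

```python
import random

def ma1_variance_estimate(x, theta_hat):
    """v2_hat for the MA(1) model from the recursions
    eta_hat_t = X_t + theta_hat * eta_hat_{t-1}, Z_hat_t = eta_hat_t + theta_hat * Z_hat_{t-1},
    started from zero pre-sample values (an assumption of this sketch)."""
    eta, z = [], []
    e_prev = z_prev = 0.0
    for xt in x:
        e_prev = xt + theta_hat * e_prev       # truncated sum_s theta_hat^s X_{t-s}
        z_prev = e_prev + theta_hat * z_prev   # truncated sum_s (s+1) theta_hat^s X_{t-s}
        eta.append(e_prev)
        z.append(z_prev)
    n = len(x)
    num = sum(z[j - 1] ** 2 * eta[j] ** 2 for j in range(1, n)) / n
    den = (sum(v * v for v in z) / n) ** 2
    return num / den

rng = random.Random(7)
theta0, n_obs = 0.5, 100_000
noise = [rng.gauss(0.0, 1.0) for _ in range(n_obs + 1)]
x_ma = [noise[t] - theta0 * noise[t - 1] for t in range(1, n_obs + 1)]
v2_ma = ma1_variance_estimate(x_ma, theta0)    # theta0 stands in for the Whittle estimate
print(v2_ma, 1.0 - theta0 ** 2)
```

With i.i.d. noise the target is $$1-\theta _0^2$$ by Corollary 5.2 (i). In applied work the same function would be called with the Whittle estimate in place of `theta0`.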

## Notes

### Acknowledgements

The authors would like to thank the two anonymous referees for valuable comments and suggestions. Liudas Giraitis and Murad S. Taqqu would like to thank Masanobu Taniguchi for his hospitality in Japan. M. Taniguchi was supported by JSPS Kiban grant A-15H02061 at Waseda University. The corresponding author states that there is no conflict of interest.

## References

1. Bhansali, R. J., Giraitis, L., & Kokoszka, P. (2007). Approximations and limit theory for quadratic forms of linear processes. Stoch. Process. Appl., 117, 71–95.
2. De Jong, P. (1987). A central limit theorem for generalized quadratic forms. Probab. Theory Relat. Fields, 75, 261–277.
3. Fox, R., & Taqqu, M. S. (1986). Large sample properties of parameter estimates for strongly dependent stationary Gaussian time series. Ann. Stat., 14, 517–532.
4. Fox, R., & Taqqu, M. S. (1987). Central limit theorems for quadratic forms in random variables having long-range dependence. Probab. Theory Relat. Fields, 74, 213–240.
5. Giraitis, L., & Surgailis, D. (1990). A central limit theorem for quadratic forms in strongly dependent linear variables and its application to asymptotic normality of Whittle’s estimate. Probab. Theory Relat. Fields, 86, 87–104.
6. Giraitis, L., Hidalgo, J., & Robinson, P. M. (2001). Gaussian estimation of parametric spectral density with unknown pole. Ann. Stat., 29, 987–1023.
7. Giraitis, L., Leipus, R., & Surgailis, D. (2007). Recent Advances in ARCH Modelling. In G. Teyssiere & A. P. Kirman (Eds.), Long Memory in Economics (pp. 3–38). Berlin: Springer.
8. Giraitis, L., Koul, H. L., & Surgailis, D. (2012). Large Sample Inference for Long Memory Processes. London: Imperial College Press.
9. Giraitis, L., Taniguchi, M., & Taqqu, M. S. (2016). Asymptotic normality of quadratic forms of martingale differences. Stat. Inference Stoch. Process., 20, 315–327.
10. Guttorp, P., & Lockhart, R. A. (1988). On the asymptotic distribution of quadratic forms in uniform order statistics. Ann. Stat., 16, 433–449.
11. Hannan, E. J. (1973). The asymptotic theory of linear time series models. J. Appl. Probab., 10, 130–145.
12. Hosoya, Y., & Taniguchi, M. (1982). A central limit theorem for stationary processes and the parameter estimation of linear processes. Ann. Stat., 10, 132–153.
13. Robinson, P. M. (1995). Gaussian semiparametric estimation of long range dependence. Ann. Stat., 23, 1630–1661.
14. Rotar, V. I. (1973). Certain limit theorems for polynomials of degree two. Teoria Verojatnosti i Primenenia, 18, 527–534. (in Russian).
15. Stout, W. (1974). Almost Sure Convergence. New York: Academic Press.
16. Taniguchi, M. (1982). On estimation of the integrals of the fourth order cumulant spectral density. Biometrika, 69, 117–122.
17. Wu, P. T., & Shieh, S. J. (2007). Value-at-Risk analysis for long-term interest rate futures: fat-tail and long memory in return innovations. J. Empir. Finance, 14, 248–259.
18. Whittle, P. (1953). Estimation and information in stationary time series. Ark. Mat., 2, 423–443.

© Japanese Federation of Statistical Science Associations 2018
