Abstract
Subspace methods have been shown to be remarkably robust procedures providing consistent estimates of linear dynamical state-space systems for (multivariate) time series in different situations including stationary and integrated processes without the need for specifying the degree of persistence. Fractionally integrated processes bridge the gap between short-memory processes corresponding to stable rational transfer functions and integrated processes such as unit root processes. Therefore, it is of interest to investigate the robustness of subspace procedures for this class of processes. In this paper, it is shown that a particular subspace method called canonical variate analysis (CVA) that is closely related to long vector autoregressions (VAR) provides consistent estimators of the transfer function corresponding to the data generating process also for fractionally integrated processes of the VARFIMA or FIVARMA type, if integer parameters such as the system order tend to infinity as a suitable function of the sample size. The results are based on analogous statements for the consistency of long VAR modelling. In a simulation study, it is demonstrated that the model reduction implicit in CVA leads to accuracy gains for the subspace methods in comparison to long VAR modelling.
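As a purely numerical illustration of this setting (not part of the paper), the following Python sketch simulates a scalar ARFIMA(0,d,0) process from a truncated MA(\(\infty\)) representation with \(k_j = \Gamma (j+d)/(\Gamma (d)\Gamma (j+1)) = O(j^{d-1})\) and fits a long autoregression by OLS; the function names and the truncation lag are illustrative choices.

```python
import numpy as np
from math import gamma

def ma_coeffs(d, n):
    """MA(infinity) coefficients of (1-L)^{-d}: k_j = Gamma(j+d)/(Gamma(d)Gamma(j+1))."""
    k = np.empty(n)
    k[0] = 1.0
    for j in range(1, n):
        k[j] = k[j - 1] * (j - 1 + d) / j   # ratio Gamma recursion
    return k

def simulate_arfima(d, T, m=2000, seed=0):
    """Truncated MA representation y_t = sum_{j<m} k_j eps_{t-j}, iid N(0,1) innovations."""
    rng = np.random.default_rng(seed)
    k = ma_coeffs(d, m)
    eps = rng.standard_normal(T + m)
    return np.convolve(eps, k, mode="full")[m - 1:m - 1 + T]

def fit_long_ar(y, p):
    """OLS estimate of the coefficients of a long AR(p) approximation."""
    T = len(y)
    X = np.column_stack([y[p - j:T - j] for j in range(1, p + 1)])
    return np.linalg.lstsq(X, y[p:], rcond=None)[0]

k = ma_coeffs(0.3, 1001)                                # k_j ~ j^{d-1}/Gamma(d)
phi_hat = fit_long_ar(simulate_arfima(0.3, 5000), 20)   # long AR fit
```

For an ARFIMA(0,d,0) process the leading AR(\(\infty\)) coefficient equals \(d\), so the first estimated coefficient should be close to 0.3 here.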
Notes
1. This algorithm has also been called canonical correlation analysis (CCA) in the literature.
2. In the unlikely case of identical estimated singular values \(\hat{\sigma }_{n}= \hat{\sigma }_{n+1}\), the basis of the corresponding spaces is chosen randomly.
References
Bauer, D.: Estimating linear dynamical systems using subspace methods. Econ. Theory 21, 181–211 (2005)
Bauer, D., Wagner, M.: A canonical form for unit root processes in the state space framework. Econ. Theory 6, 1313–1349 (2012)
Chan, N.H., Palma, W.: State space modeling of long-memory processes. Ann. Stat. 26, 719–740 (1998)
Dahlen, A., Scherrer, W.: The relation of the CCA subspace method to a balanced reduction of an autoregressive model. J. Econ. 118, 293–312 (2004)
Galbraith, J.W., Zinde-Walsh, V.: Autoregression-based estimators for ARFIMA models. In: CIRANO Working Papers 2001–11 (2001)
Hannan, E.J., Deistler, M.: The Statistical Theory of Linear Systems. Wiley, New York (1988)
Hosking, J.R.M.: Fractional differencing. Biometrika 68(1), 165–176 (1981)
Hosking, J.R.M.: Asymptotic distributions of the sample mean, autocovariances, and autocorrelations of long-memory time series. J. Econ. 73(1), 261–284 (1996)
Larimore, W.E.: System identification, reduced order filters and modeling via canonical variate analysis. In: Proceedings of the 1983 American Control Conference, Piscataway, NJ, pp. 445–451 (1983)
Lewis, R., Reinsel, G.: Prediction of multivariate time series by autoregressive model fitting. J. Multivar. Anal. 16, 393–411 (1985)
Palma, W., Bondon, P.: On the eigenstructure of generalized fractional processes. Stat. Probab. Lett. 65, 93–101 (2003)
Poskitt, D.S.: Autoregressive approximation in nonstandard situations: the fractionally integrated and non-invertible case. Ann. Inst. Stat. Math. 59, 697–725 (2006)
Sela, R.J., Hurvich, C.M.: Computationally efficient methods for two multivariate fractionally integrated models. J. Time Ser. Anal. 30(6), 631–651 (2009)
Acknowledgements
The author is grateful to Philipp Sibbertsen and Christian Leschinski for discussions and comments on the contents of the paper.
Appendix
Auxiliary Lemmas
Lemma A.1
Let \(y_t = \sum _{j=0}^\infty k_j \varepsilon _{t-j}\) where \((\varepsilon _t)_{t \in {\mathbb Z}}\) is an iid sequence of random variables having zero mean and finite fourth moments. Let \( \hat{\gamma }_j = \frac{1}{T}\sum _{t=1+j}^T (y_t - \bar{y})(y_{t-j}-\bar{y})' \) where \(\bar{y} = \frac{1}{T}\sum _{t=1}^T y_t, {\tilde{\gamma }}_j = \frac{1}{T}\sum _{t=1}^T y_{t+j}y_{t}'- \bar{y} \bar{y}'\) and \(\gamma _j = {\mathbb E}y_ty_{t-j}'\). Assume that \(k_j = O(j^{d-1})\) where \(-0.5<d<0.5\). Then,
(i) \(\Vert \gamma _j \Vert \le \kappa j^{2d-1}\) for \(j>0\).
(ii) \({\mathbb E}\bar{y}\bar{y}' = O(T^{2d-1})\) and \({\mathbb E}\Vert \bar{y}\Vert ^4 = O(T^{\max (4d-2,-1)})\).
(iii) \({\mathbb E}\Vert \hat{\gamma }_j- {\tilde{\gamma }}_j\Vert \le \tau j/T\) where \(0<\tau <\infty \) does not depend on j.
(iv) \(\hat{\gamma }_j - \gamma _j = O_P(j/T) + ({\tilde{\gamma }}_j-\gamma _j)\) where \( {\mathbb E}\mathrm{vec}({\tilde{\gamma }}_j-\gamma _j)\mathrm{vec}({\tilde{\gamma }}_k-\gamma _k)' = O(Q_T(d)^2)\) with
(v) Let \(R_T(d) := (T/\log T)^{d-1/2}\). If \(H_T R_T(\tilde{d})\rightarrow 0 \), where \(\tilde{d} = \max (0,d)\), we have
Proof
The proof uses the results of Theorems 1, 3 and 5 of [8] and Theorems 1 and 2 of [12].
(i) Using \(\Omega = {\mathbb E}\varepsilon _t \varepsilon _t'\), we have for some constant \(0< \kappa <\infty \) not depending on j
since \(\Vert \Omega \Vert < \mu \) and \(\Vert k_i \Vert \le \bar{\mu } i^{d-1}\) for some \(\bar{\mu }<\infty \), using \(i<i+j\) and the techniques in the proof of Lemma 3.2 of [3].
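For the scalar ARFIMA(0,d,0) special case, the polynomial decay in (i) can be checked against the exact autocovariances, which obey the recursion \(\gamma _j/\gamma _{j-1} = (j-1+d)/(j-d)\). The following numeric sketch is illustrative only and not part of the proof; all names are assumed.

```python
import numpy as np
from math import gamma as G

def acov_arfima(d, n):
    """Exact autocovariances gamma_0,...,gamma_{n-1} of scalar ARFIMA(0,d,0), unit innovation variance."""
    g = np.empty(n)
    g[0] = G(1 - 2 * d) / G(1 - d) ** 2
    for j in range(1, n):
        g[j] = g[j - 1] * (j - 1 + d) / (j - d)
    return g

d = 0.3
g = acov_arfima(d, 4001)
# log-log slope of gamma_j over j in [1000, 4000]; should approach 2d - 1 = -0.4
j = np.arange(1000, 4001)
slope = np.polyfit(np.log(j), np.log(g[j]), 1)[0]
```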
(ii) The first part follows directly from (i) in combination with [8, Theorem 1] dealing with each coordinate separately.
Below, we deal without restriction of generality with scalar processes. The vector case is merely notationally more complex. With respect to the second part, note that \({\mathbb E}y_ty_sy_ry_0 = \gamma _{t-s}\gamma _r + \gamma _{t-r}\gamma _{s} + \gamma _{t}\gamma _{s-r} + \kappa _4(t,s,r)\) for \(\kappa _4(t,s,r):= \sum _{a=-\infty }^{\infty } k_{a+t}k_{a+s}k_{a+r}k_a ({\mathbb E}\varepsilon _t^4 - ({\mathbb E}\varepsilon _t^2)^2)\) where for notational simplicity \(k_a=0, a<0\) is used. Next,
is the sum of four terms where the first three are identical:
The last term equals the fourth term of (A.2) in [8]. Using Lemma 3.2 (i) of [3] in the fourth row of the equation on p. 277 of [8], we obtain
for \(0.25 \le d<0.5\). By dominated convergence, this term is \(O(T^{-1})\) for \(-0.5<d<0.25\). Hence, we obtain the bound \({\mathbb E}\bar{y}^4 = O(T^{4d-2})\) for \(0.25 \le d < 0.5\) and \(O(T^{-1})\) otherwise.
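The \(O(T^{2d-1})\) rate for the sample mean can also be verified exactly in the scalar ARFIMA(0,d,0) case, since \(\mathrm{Var}(\bar{y}) = T^{-1}\sum _{|j|<T}(1-|j|/T)\gamma _j\) is computable from the autocovariances. An illustrative sketch with assumed function names:

```python
import numpy as np
from math import gamma as G

def acov_arfima(d, n):
    """Exact autocovariances gamma_0,...,gamma_{n-1} of scalar ARFIMA(0,d,0), unit innovation variance."""
    g = np.empty(n)
    g[0] = G(1 - 2 * d) / G(1 - d) ** 2
    for j in range(1, n):
        g[j] = g[j - 1] * (j - 1 + d) / (j - d)
    return g

def var_sample_mean(d, T):
    """Exact Var(ybar) = T^{-1} sum_{|j|<T} (1 - |j|/T) gamma_j."""
    g = acov_arfima(d, T)
    w = 1.0 - np.arange(1, T) / T
    return (g[0] + 2.0 * np.sum(w * g[1:])) / T

# Var(ybar) * T^{1-2d} should stabilise as T grows (here d = 0.3)
r1 = var_sample_mean(0.3, 2000) / 2000 ** (2 * 0.3 - 1)
r2 = var_sample_mean(0.3, 4000) / 4000 ** (2 * 0.3 - 1)
```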
(iii) For \(j>0\), we obtain
(iv) From (iii), it follows that \(\hat{\gamma }_j = {\tilde{\gamma }}_j + O_P(j/T)\). The rest follows from [8].
(v) Noting that Theorems 1 and 2 of [12] use only the variance bounds derived above, the uniformity of the almost sure convergence follows.
Lemma A.2
Let \((y_t)_{t \in {\mathbb Z}}\) be as in Lemma A.1. Define \(k(z) := \sum _{j=0}^\infty k_jz^j\) and the spectrum \(f(\omega )=\frac{1}{2\pi }k(e^{i\omega })\Omega k(e^{i\omega })^*\) where \(\Omega := {\mathbb E}\varepsilon _t \varepsilon _t'\).
(i) Assume that there exist constants \(0<a,b<\infty \) such that \(aI_s \le f(\omega ) \le b\,\overline{f}(\omega ) I_s\) where \(\overline{f}(\omega )=|1-e^{i\omega }|^{-2d}\) for \(d \ge 0\). Then there exist constants \(0< C_1< C_2 < \infty \) such that \(C_1 I_{ps} \le {\mathbb E}Y_{t,p}^- (Y_{t,p}^-)' \le C_2 p^{2d} I_{ps}\).
(ii) If there exist constants \(0<a,b<\infty \) such that \(a\,\underline{f}(\omega ) I_s \le f(\omega ) \le bI_s\) where \(\underline{f}(\omega )=|1-e^{i\omega }|^{-2d}\) for \(d < 0\), then there exist constants \(0< C_1< C_2 < \infty \) such that \(C_1 p^{2d} I_{ps} \le {\mathbb E}Y_{t,p}^- (Y_{t,p}^-)' \le C_2\).
Proof
The proof is a straightforward generalization of the univariate result in Theorem 2 of [11]; compare also Lemma 2 in [12]. Only (i) is proved; the remaining statements follow analogously. Let \(\varGamma _p^-= {\mathbb E}Y_{t,p}^- (Y_{t,p}^-)'\). Note that the definitions of \(\overline{f}\) and \(\underline{f}\) coincide with those given in (9), (10), p. 96 of [11].
The smallest eigenvalue of \(\varGamma _p^-\) equals the minimum of \(x' \varGamma _p^-x\) for \(x'x=1\) and the largest corresponds to the maximum. Since \((y_t)_{t \in {\mathbb Z}}\) is assumed to be stationary with spectral density \(f(z) := k(z)\Omega k(z)^*/(2\pi )\), it follows that
where \(x' = [x_1',\dots ,x_p'], x_j \in {\mathbb R}^s\). If \(d>0\) then
Also,
It follows that the function \(h(\omega ) := \Vert \sum _{j=1}^p x_je^{ij\omega } \Vert ^2\) is in the set \(P_p\) (p. 97 of [11]). Hence the result holds. \(\square \)
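In the scalar ARFIMA(0,d,0) case with \(d>0\), the eigenvalue bounds of Lemma A.2 (i) can be inspected directly on the Toeplitz matrix \(\varGamma _p^-\): the smallest eigenvalue stays above \(2\pi \inf _\omega f(\omega ) = 2^{-2d}\), while the largest grows roughly like \(p^{2d}\). An illustrative sketch (assumed helper names):

```python
import numpy as np
from math import gamma as G

def acov_arfima(d, n):
    """Exact autocovariances gamma_0,...,gamma_{n-1} of scalar ARFIMA(0,d,0), unit innovation variance."""
    g = np.empty(n)
    g[0] = G(1 - 2 * d) / G(1 - d) ** 2
    for j in range(1, n):
        g[j] = g[j - 1] * (j - 1 + d) / (j - d)
    return g

def gamma_p(d, p):
    """Toeplitz covariance matrix of the stacked past Y_{t,p}^- (scalar case)."""
    g = acov_arfima(d, p)
    idx = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    return g[idx]

d = 0.3
ev100 = np.linalg.eigvalsh(gamma_p(d, 100))   # sorted ascending
ev200 = np.linalg.eigvalsh(gamma_p(d, 200))
```

Doubling \(p\) should multiply the largest eigenvalue by roughly \(2^{2d}\), while the smallest stays bounded away from zero.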
Lemma A.3
Let \((y_t)_{t \in {\mathbb Z}}\) be as in Theorem 3.2. Then,
- For \(0.25< d_+ < 0.5\), we have \(\Vert {\mathbb E}Y_{t,f}^+ (Y_{t,p}^-)' \Vert _{Fr} = O((f+p)^{2d_+})\), \(\Vert {\mathbb E}Y_{t,p}^- (Y_{t,p}^-)' \Vert _{Fr} =O(p^{2d_+})\).
- For \(d_+=0.25\), we have \(\Vert {\mathbb E}Y_{t,f}^+ (Y_{t,p}^-)' \Vert _{Fr} = O((f+p)^{1/2})\), \(\Vert {\mathbb E}Y_{t,p}^- (Y_{t,p}^-)' \Vert _{Fr} =O(\sqrt{p \log {p}})\).
- For \(0 \le d_+<0.25\), we have \(\Vert {\mathbb E}Y_{t,f}^+ (Y_{t,p}^-)' \Vert _{Fr} = O((f+p)^{2d_+})\), \(\Vert {\mathbb E}Y_{t,p}^- (Y_{t,p}^-)' \Vert _{Fr} =O(\sqrt{p})\).
- For \(d_+<0\), we have \(\Vert {\mathbb E}Y_{t,f}^+ (Y_{t,p}^-)' \Vert _{Fr} = O(1)\), \(\Vert {\mathbb E}Y_{t,p}^- (Y_{t,p}^-)' \Vert _{Fr} =O(\sqrt{p})\).
The lemma is an easy consequence of \(\Vert \gamma _l \Vert \le \mu l^{2d_+-1}\) as shown in Lemma A.1 in combination with \(\sum _{j=1}^m j^{\beta -1} = O(m^\beta )\) for \(\beta >0\).
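In the scalar case, the stated Frobenius norm growth of \({\mathbb E}Y_{t,p}^- (Y_{t,p}^-)'\) can be computed exactly from \(\Vert \varGamma _p^- \Vert _{Fr}^2 = p\gamma _0^2 + 2\sum _{j=1}^{p-1}(p-j)\gamma _j^2\): the norm grows like \(p^{2d_+}\) for \(d_+>0.25\) and like \(\sqrt{p}\) for \(0 \le d_+<0.25\). An illustrative sketch with assumed names:

```python
import numpy as np
from math import gamma as G

def acov_arfima(d, n):
    """Exact autocovariances gamma_0,...,gamma_{n-1} of scalar ARFIMA(0,d,0), unit innovation variance."""
    g = np.empty(n)
    g[0] = G(1 - 2 * d) / G(1 - d) ** 2
    for j in range(1, n):
        g[j] = g[j - 1] * (j - 1 + d) / (j - d)
    return g

def fro_norm_gamma_p(d, p):
    """Frobenius norm of the p x p Toeplitz matrix (gamma_{i-j})."""
    g = acov_arfima(d, p)
    j = np.arange(1, p)
    return np.sqrt(p * g[0] ** 2 + 2.0 * np.sum((p - j) * g[j] ** 2))

r_strong = fro_norm_gamma_p(0.4, 2000) / fro_norm_gamma_p(0.4, 1000)  # ~ 2^{2 d} = 2^{0.8}
r_weak = fro_norm_gamma_p(0.1, 2000) / fro_norm_gamma_p(0.1, 1000)    # ~ sqrt(2)
```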
Lemma A.4
Let the assumptions of Theorem 3.2 hold. Let \(\tilde{d}_- = \min (d_-,0), \tilde{d}_+ = \max (d_+,0)\) and assume that \(p^{2-4\tilde{d}_-}R_T(\tilde{d}_+)^2 \rightarrow 0\) and \(p R_T(\tilde{d}_+) \rightarrow 0\). Then
Proof
All three parts follow almost immediately from Lemma A.1 (v). With respect to the third statement, using [10], p. 397, line 11, we obtain
where \(F := \Vert (\varGamma _p^-)^{-1}\Vert _2, Z_{p,T} := \Vert \langle Y_{t,p}^- , Y_{t,p}^- \rangle ^{-1} - (\varGamma _p^-)^{-1} \Vert _2/F(\Vert \langle Y_{t,p}^- , Y_{t,p}^- \rangle ^{-1} - (\varGamma _p^-)^{-1} \Vert _2+F) \le \Vert \langle Y_{t,p}^- , Y_{t,p}^- \rangle - \varGamma _p^-\Vert _2\). Then the upper bound on p implies that \(Z_{p,T} =o(1)\). For \(d_->0\), since then \(\tilde{d}_- = 0\), we have \(F< \infty \) and the result follows. For \(d_-<0\), we have \(F = O(p^{-2d_-})\), and therefore the upper bound on p has to ensure that \(F Z_{p,T}\rightarrow 0\), where \(Z_{p,T} = O(p R_T(\tilde{d}_+))\). \(\square \)
Lemma A.5
Let the assumptions of Theorem 3.2 hold and let \(\phi _j(p)\) denote the coefficients of the long VAR approximation in (4). Then \(\sum _{j=1}^p \Vert \phi _j - \phi _j(p) \Vert ^2 = O(p^{4d_+-2d_--1})\) if \(d_+\ge d_-\ge 0\) which tends to zero if \(d_- > 2d_+-0.5\). If \(d_+=d_-=d\), this always holds and the order equals \(O(p^{2d-1})\).
If \(d_- \le d_+ \le 0\) then \(\sum _{j=1}^p \Vert \phi _j - \phi _j(p) \Vert ^2 = O(p^{4d_+-6d_--1})\) which tends to zero if \(2d_+ < 0.5+3d_-\). For \(0 \ge d_+=d_->-0.5\), this always holds and the rate equals \(O(p^{-2d-1})\) in this case.
Proof
This result has already been obtained in the scalar case (where automatically \(d_+=d_-\)) by [5]. Note that using \(f=1\), we obtain
where \(\varPhi _{p}^-\) denotes the matrix \(\varPhi \) where the first p block columns are omitted. Further, \({\tilde{\varGamma }}_{2,p}^- = {\mathbb E}Y_{t-p,\infty }^- (Y_{t,p}^-)'\). Therefore it is sufficient to compute the Frobenius norm of \(\varPhi _{p}^- {\tilde{\varGamma }}_{2,p}^-\) which contains as a typical element \(\sum _{i=1}^{\infty } \phi _{i+p} {\mathbb E}y_{t-p-i} y_{t-j}', j=1,\dots ,p\). Using \(\Vert \phi _l \Vert \le M_l l^{-1-d_-}, \Vert \gamma _{l} \Vert \le M_g l^{2d_+-1}\), we can bound the norm of this entry by \(M_lM_g \sum _{i=1}^\infty (i+p)^{-1-d_-} (p+i-j)^{2d_+-1}\). For \(d_->0\), this is of order \(O(p^{-d_-}(p-j)^{2d_+-1})\). Summing the squares over \(j=1,\dots ,p\) shows that the squared Frobenius norm of \(\varPhi _{p}^- {\tilde{\varGamma }}_{2,p}^-\) in this case is of order \(O(p^{4d_+-2d_--1})\). Since the smallest eigenvalue of \(\varGamma _p^-\) is bounded away from zero for \(d_->0\), the result follows. The result for \(d_+=d_-\) is obvious.
For \(d_+<0\), the norm of the typical entry is of order \(O(p^{-1-d_-}(p-j)^{2d_+})\) and therefore the squared Frobenius norm of \(\varPhi _{p}^- {\tilde{\varGamma }}_{2,p}^-\) also in this case is of order \(O(p^{4d_+-2d_--1})\). Here the smallest eigenvalue of \(\varGamma _p^-\) tends to zero as \(p^{2d_-}\) and hence the inverse adds a factor \(p^{-2d_-}\) to the Frobenius norm adding up to order \(O(p^{-1+4d_+-6d_-})=o(1)\) if \(-0.5+2d_+-3d_-<0\). \(\square \)
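The coefficient decay \(\Vert \phi _l \Vert \le M_l l^{-1-d_-}\) used above can be illustrated for scalar fractional differencing, where \((1-z)^d = \sum _j \pi _j z^j\) with \(\pi _j = \pi _{j-1}(j-1-d)/j\) and \(\phi _j = -\pi _j\). The squared tail \(\sum _{j>p} \phi _j^2\) then decays like \(p^{-1-2d}\); note this is faster than the lemma's rate, which also reflects the covariance coupling. An illustrative sketch:

```python
import numpy as np

d = 0.3
n = 200000
j = np.arange(1, n + 1)
# pi_j = pi_{j-1} (j-1-d)/j with pi_0 = 1; AR coefficients phi_j = -pi_j
pi = np.cumprod((j - 1 - d) / j)
phi = -pi                              # phi[0] = phi_1 = d, phi_j ~ j^{-1-d}

# log-log slope of |phi_j| over j in [1000, 5000]; should approach -(1+d)
jj = np.arange(1000, 5001)
slope = np.polyfit(np.log(jj), np.log(np.abs(phi[jj - 1])), 1)[0]

# tail sum ratio: sum_{j>2p} phi_j^2 / sum_{j>p} phi_j^2 ~ 2^{-(1+2d)}
tail_ratio = np.sum(phi[2000:] ** 2) / np.sum(phi[1000:] ** 2)
```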
Proof of Theorem 3.2
The main insight into the algorithm lies in the fact that for \(n=ps\) the CVA estimate \((\hat{A},\hat{C}, \hat{K})\) equals the long VAR approximation of \((y_t)_{t \in {\mathbb Z}}\) using lag order p [4], the properties of which follow along the lines of [12]:
Lemma A.6
Let \((y_t)_{t \in {\mathbb Z}}\) be as in Theorem 3.2 denoting
Further, let the assumptions on \(d_-, d_+\) be as in Theorem 3.2, and let \(H_T\) be such that \( H_T^{2-4\tilde{d}_-}R_T(\tilde{d}_+)^2 \rightarrow 0\) and \(R_T(\tilde{d}_+)P_T(d_-,H_T) \rightarrow 0 \), where \(P_T(d,p) = p\) for \(d \ge 0\) and \(P_T(d,p) = p^{1-4d}\) for \(d<0\). Then the OLS estimates of the coefficients in this long VAR approximation fulfil, uniformly in \(1 \le p \le H_T\),
Proof
The proof follows the arguments of [12]. The long AR approximation can be written as \(y_t = \beta _{1,p} Y_{t,p}^- + \varepsilon _{t}(p)\). It follows that \( \hat{\beta }_{1,p} = \langle y_t, Y_{t,p}^- \rangle \langle Y_{t,p}^- , Y_{t,p}^- \rangle ^{-1} = \left[ \begin{array}{cccc} \hat{\phi }_1(p)&\hat{\phi }_2(p)&\dots&\hat{\phi }_p(p) \end{array} \right] . \) Let
Then Lemma A.5 implies that \(\Vert \beta _{1,p} - [\varPhi ]_p \Vert _{Fr} = o(1)\) where \(\sup _p \Vert [\varPhi ]_p \Vert _{Fr} < \infty \) due to the square summability of the AR coefficients. This implies \(\sup _p \Vert \beta _{1,p} \Vert _{Fr} < \infty \). Since \(\Vert \gamma _l \Vert _{Fr} \le \kappa l^{2d_+-1}\), it follows that \(\sup _p \Vert {\mathcal H}_{1,p} \Vert _{Fr} < \infty \) for \(d_+<0.25\).
Further, note that \( \Vert ({\hat{\varGamma }}_p^-)^{-1} \Vert _2 \le \Vert ({\hat{\varGamma }}_p^-)^{-1} - (\varGamma _p^-)^{-1} \Vert _2 + \Vert (\varGamma _p^-)^{-1} \Vert _2 = o(1) + \Vert (\varGamma _p^-)^{-1} \Vert _2. \) Then \(\hat{\beta }_{1,p} - \beta _{1,p} =\)
From these equations (for \(d_+>0\) the third equation can be used, otherwise the second) in combination with Lemma A.4 (where the norm bounds hold uniformly in the lag length), the result follows. \(\square \)
Consequently, letting \(p \rightarrow \infty \) at the rate given in the theorem implies that for \(\hat{\phi }(z) = \sum _{j=0}^p \hat{\phi }_j z^j\) and \(\phi _p(z) = \sum _{j=0}^p \phi _j(p)z^j\) we have \(\Vert \hat{\phi }(z) - \phi _p(z) \Vert _2 \rightarrow 0\).
Furthermore, the norm bound implies that
and hence the Fourier series \(\sum _{j=0}^p \phi _j z^j\) converges in \(L_2\) to \(\phi (z)\), that is, \(\Vert \phi _p(z) - \phi (z) \Vert _2 \rightarrow 0\) and thus \(\Vert \hat{\phi }(z) - \phi (z) \Vert _2 \rightarrow 0\).
These two lemmas show that subspace methods with the maximal choice of the order \(n = ps\) deliver consistent estimates of the inverse transfer function \(\phi (z)\) and thus also of the transfer function.
Finally, some facts on the approximation error for \(n< ps\) are provided.
Lemma A.7
Let \((\hat{A},\hat{C},\hat{K})\) denote a system with corresponding state \(\hat{x}_t\) such that \(\langle \hat{x}_t, \hat{x}_t \rangle = I_N\) and innovation noise variance \(\hat{\Omega }\). Then consider the partitioning of the system as (where \(\hat{A}_{11} \in {\mathbb R}^{n \times n}, \hat{K}_1 \in {\mathbb R}^{n \times s},\hat{C}_1 \in {\mathbb R}^{s \times n}\))
(i) The system \((\hat{A}_n,\hat{C}_n,\hat{K}_n)\) obtained from using \(\hat{x}_{t,n} = [I_n,0]\hat{x}_t\) in the regressions in the last step of the CVA algorithm fulfils
(ii) \( \Vert \hat{C}_{:,j} \Vert _{2} = O\left( p^{\tilde{d}_+} \hat{\sigma }_j \right) \), where \( \hat{C}_{:,j}\) denotes the jth column of \(\hat{C}\).
(iii) Furthermore, let \(\Vert f(z) \Vert _{\infty } = \sup _{\omega \in [0,2\pi ]} \Vert f(e^{i\omega }) \Vert _2\) and use the notation \(\hat{k}(z) = I_s+ z\hat{C}(I - z\hat{A})^{-1} \hat{K}, \hat{k}_{n1}(z) = I_s+ z\hat{C}_n(I - z\hat{A}_{n})^{-1} \hat{K}_1\). Then \( \Vert \hat{k}(z) - \hat{k}_{n1}(z) \Vert _{\infty }\)
(iv) Consequently, using \(\hat{k}_n(z) = I_s+ z\hat{C}_n(I - z\hat{A}_{n})^{-1} \hat{K}_n\), we obtain
Proof
(i) Reference [4] shows that \(C_n=C_1, \Omega _n = \Omega + C_2C_2', A_n=A_{11}, K_n = (M_1-A_{11}C_1')\Omega _n^{-1}\) where \(M_1 = K_1\Omega + [I_n,0]AC'\). Thus \(\Vert \Omega _n - \Omega \Vert = \Vert C_2 \Vert ^2\). Therefore
Since \(\Vert \Omega ^{-1} \Vert \le M, \Vert K_1 \Vert \le \sqrt{n}, \Vert A_{1,2} \Vert \le \sqrt{n}\) and \(\Omega _n \ge \Omega \), it thus follows that
It is straightforward to show that this also holds for the estimates as only orthogonality relations are used here.
(ii) Consider the estimation of the state as \(\hat{x}_t = \hat{V}' ({\hat{\varGamma }}_p^-)^{-1/2} Y_{t,p}^-\), which implies that \(\langle \hat{x}_t , \hat{x}_t \rangle = I_{ps}\). According to Lemma A.4, \(\Vert {\hat{\varGamma }}_f^+ - \varGamma _f^+ \Vert _{Fr} \rightarrow 0\) if \(p R_T(\tilde{d}_+) \rightarrow 0\). Then \(\Vert {\hat{\varGamma }}_f^+ \Vert _2 \le \Vert {\hat{\varGamma }}_f^+ - \varGamma _f^+ \Vert _2+ \Vert \varGamma _f^+\Vert _2 = O(p^{2\tilde{d}_+})\). Furthermore,
showing that the 2-norm of the jth column of \(\hat{C}_2\) is of order \(O(p^{\tilde{d}_+} \hat{\sigma }_{n+j})\).
(iii) and (iv) follow from straightforward computations. \(\square \)
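For concreteness, the CVA steps used throughout the appendix (stacking past and future, weighted SVD, state construction, OLS regressions for \((A,C,K)\)) can be sketched in a stripped-down scalar form. This is an illustrative simplification under assumed notation, not the paper's full algorithm; it omits, for instance, the order selection via the estimated singular values.

```python
import numpy as np

def cva(y, f, p, n):
    """Simplified scalar CVA sketch: canonical correlations between stacked future
    and past, state from the first n canonical directions, then OLS regressions."""
    T = len(y)
    t0, t1 = p, T - f + 1
    # rows indexed by t: past (y_{t-1},...,y_{t-p}), future (y_t,...,y_{t+f-1})
    Ym = np.column_stack([y[t0 - j:t1 - j] for j in range(1, p + 1)])
    Yp = np.column_stack([y[t0 + i:t1 + i] for i in range(f)])
    N = Ym.shape[0]
    Gm, Gp, H = Ym.T @ Ym / N, Yp.T @ Yp / N, Yp.T @ Ym / N
    Wm, Wp = np.linalg.cholesky(Gm), np.linalg.cholesky(Gp)
    # SVD of the whitened cross-covariance Wp^{-1} H Wm^{-T}
    M = np.linalg.solve(Wp, H) @ np.linalg.inv(Wm).T
    U, s, Vt = np.linalg.svd(M)
    X = Ym @ np.linalg.inv(Wm).T @ Vt[:n].T       # state estimates, <x, x> = I_n
    yt = y[t0:t1]
    C = np.linalg.lstsq(X, yt, rcond=None)[0]     # y_t = C x_t + e_t
    e = yt - X @ C
    Z = np.column_stack([X[:-1], e[:-1]])
    AK = np.linalg.lstsq(Z, X[1:], rcond=None)[0]  # x_{t+1} = A x_t + K e_t
    return AK[:n].T, C, AK[n:].T

# usage sketch: ARMA(1,1) data y_t - 0.8 y_{t-1} = e_t - 0.3 e_{t-1}
rng = np.random.default_rng(1)
T, xs = 20000, 0.0
eps = rng.standard_normal(T)
y = np.empty(T)
for t in range(T):
    y[t] = xs + eps[t]
    xs = 0.8 * xs + 0.5 * eps[t]
A, C, K = cva(y, f=10, p=10, n=1)
```

On these data, the estimate of \(A\) with \(n=1\) should recover the pole 0.8 and \(CK\) the first impulse response 0.5 (both invariant to the sign of the state basis).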
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Bauer, D. (2019). Using Subspace Methods to Model Long-Memory Processes. In: Valenzuela, O., Rojas, F., Pomares, H., Rojas, I. (eds) Theory and Applications of Time Series Analysis. ITISE 2018. Contributions to Statistics. Springer, Cham. https://doi.org/10.1007/978-3-030-26036-1_12
Print ISBN: 978-3-030-26035-4
Online ISBN: 978-3-030-26036-1