# Dimension Reduction in High Dimensional Multivariate Time Series Analysis

## Abstract

The vector autoregressive (VAR) and vector autoregressive moving average (VARMA) models have been widely used to model multivariate time series because they can represent the dynamic relationships among the variables in a system and are useful for forecasting unknown future values. However, when the dimension is very high, the number of parameters often exceeds the number of available observations, and the parameters cannot be estimated. A suitable solution is clearly needed. After introducing some existing methods, we suggest the use of contemporal aggregation as a dimension reduction method, which is natural and simple to use. We compare the proposed method with existing methods in terms of forecast accuracy through both simulations and empirical examples. The presentation is based on the invited talk at the 2017 ICSA Applied Statistics Symposium in Chicago.
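As a rough illustration of why the parameter count becomes a problem and how aggregation reduces it, the following minimal sketch counts the autoregressive coefficients of a VAR(p) model and aggregates component series within groups. It assumes that aggregation means summing the series in each group at every time point; the function names, the grouping, and the example dimensions are hypothetical and are not taken from the presentation.

```python
def var_param_count(m: int, p: int) -> int:
    """Number of autoregressive coefficients in a VAR(p) model for an
    m-dimensional series: p coefficient matrices, each of size m x m.
    The intercept and the noise covariance matrix are not counted."""
    return p * m * m


def aggregate(series, groups):
    """Aggregate component series by summing within each group at every
    time point (an assumed form of aggregation for this sketch).

    series: list of rows; each row holds the m values observed at one time.
    groups: list of lists of column indices, one list per aggregate.
    """
    return [[sum(row[j] for j in group) for group in groups] for row in series]


# Hypothetical example: 100 component series, VAR(1).
m, p = 100, 1
print(var_param_count(m, p))  # coefficients before aggregation

# Aggregating the 100 series into 5 groups of 20 gives a 5-dimensional series.
groups = [list(range(k * 20, (k + 1) * 20)) for k in range(5)]
print(var_param_count(len(groups), p))  # coefficients after aggregation
```

In this hypothetical setting the coefficient count drops from 100 × 100 to 5 × 5, which is the kind of reduction that makes estimation feasible with a moderate number of time points.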

## Keywords

VARMA model, Regularization, STARMA model, Clustering, High dimension, Aggregation

## Notes

### Acknowledgments

The author wants to thank his PhD student, Zeda Li, who helped him develop software code for the analyses of many data sets in the presentation.
