Analysis of High-Dimensional Regression Models Using Orthogonal Greedy Algorithms

  • Hsiang-Ling Hsu
  • Ching-Kang Ing
  • Tze Leung Lai
Part of the Springer Handbooks of Computational Statistics book series (SHCS)


We begin by reviewing recent results of Ing and Lai (Stat Sin 21:1473–1513, 2011) on the statistical properties of the orthogonal greedy algorithm (OGA) in high-dimensional sparse regression models with independent observations. In particular, we introduce the conditional mean squared prediction error and the empirical norm of OGA that they derived for the case in which the regression coefficients are absolutely summable. We then explore the performance of OGA under more general sparsity conditions. Finally, we obtain the convergence rate of OGA in high-dimensional time series models and illustrate the advantage of our results over those established for Lasso by Basu and Michailidis (Ann Stat 43:1535–1567, 2015) and Wu and Wu (Electron J Stat 10:352–379, 2016).
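For readers unfamiliar with OGA, the following is a minimal sketch of the generic procedure, not the chapter's own implementation. It assumes the usual formulation: at each step, pick the predictor whose normalized inner product with the current residual is largest in absolute value, then replace the residual by that of the least squares projection of the response onto all predictors selected so far. The function name `oga`, the helper `dot`, the fixed number of steps `n_steps`, and the `1e-12` tolerance are all hypothetical choices made for illustration. The projection is updated by Gram-Schmidt orthogonalization, so only the standard library is needed.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def oga(columns, y, n_steps):
    """Illustrative orthogonal greedy algorithm (assumed generic form).

    columns : list of p predictor vectors, each of length n
    y       : response vector of length n
    n_steps : maximum number of predictors to select

    Returns (selected indices in order of entry, final residual).
    """
    residual = list(y)
    basis = []      # orthonormal basis for the span of the selected columns
    selected = []
    for _ in range(n_steps):
        # Selection: largest |<residual, x_j>| / ||x_j|| among unused columns.
        best_j, best_score = None, 0.0
        for j, x in enumerate(columns):
            if j in selected:
                continue
            norm = math.sqrt(dot(x, x))
            if norm == 0.0:
                continue
            score = abs(dot(residual, x)) / norm
            if score > best_score:
                best_j, best_score = j, score
        if best_j is None:
            break  # residual is orthogonal to every remaining column

        # Orthogonalize the chosen column against the current basis.
        q = list(columns[best_j])
        for b in basis:
            c = dot(q, b)
            q = [qi - c * bi for qi, bi in zip(q, b)]
        q_norm = math.sqrt(dot(q, q))
        if q_norm < 1e-12:
            break  # chosen column is numerically in the current span
        q = [qi / q_norm for qi in q]

        basis.append(q)
        selected.append(best_j)

        # Removing the component along q yields the residual of the
        # projection of y onto the span of all selected columns.
        c = dot(residual, q)
        residual = [ri - c * qi for ri, qi in zip(residual, q)]
    return selected, residual
```

As a small check, with three orthogonal unit columns and a response equal to three times the first column plus the third, two steps select indices 0 and 2 in that order and leave a zero residual. How many steps to run, and how to prune the selected set afterwards, are exactly the questions the reviewed theory addresses; this sketch takes the step count as given.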


Keywords: Conditional mean squared prediction errors; Empirical norms; High-dimensional models; Lasso; Orthogonal greedy algorithms; Sparsity; Time series


References

  1. Basu S, Michailidis G (2015) Regularized estimation in sparse high-dimensional time series models. Ann Stat 43:1535–1567
  2. Bickel PJ, Levina E (2008) Regularized estimation of large covariance matrices. Ann Stat 36:199–227
  3. Bickel PJ, Ritov Y, Tsybakov AB (2009) Simultaneous analysis of Lasso and Dantzig selector. Ann Stat 37:1705–1732
  4. Bühlmann P (2006) Boosting for high-dimensional linear models. Ann Stat 34:559–583
  5. Bunea F, Tsybakov AB, Wegkamp MH (2007) Sparsity oracle inequalities for the Lasso. Electron J Stat 1:169–194
  6. Candès EJ, Plan Y (2009) Near-ideal model selection by ℓ1 minimization. Ann Stat 37:2145–2177
  7. Candès EJ, Tao T (2007) The Dantzig selector: statistical estimation when p is much larger than n. Ann Stat 35:2313–2351
  8. Cai T, Zhang C-H, Zhou HH (2010) Optimal rates of convergence for covariance matrix estimation. Ann Stat 38:2118–2144
  9. Donoho DL, Elad M, Temlyakov VN (2006) Stable recovery of sparse overcomplete representations in the presence of noise. IEEE Trans Inform Theory 52:6–18
  10. Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360
  11. Findley DF, Wei C-Z (1993) Moment bounds for deriving time series CLT’s and model selection procedures. Stat Sin 3:453–470
  12. Foster DP, George EI (1994) The risk inflation criterion for multiple regression. Ann Stat 22:1947–1975
  13. Friedman J, Hastie T, Tibshirani R (2010) glmnet: Lasso and elastic-net regularized generalized linear models. R package version 1.1-5. Accessed 10 Dec 2012
  14. Gao F, Ing C-K, Yang Y (2013) Metric entropy and sparse linear approximation of ℓq-hulls for 0 < q ≤ 1. J Approx Theory 166:42–55
  15. Ing C-K, Lai TL (2011) A stepwise regression method and consistent model selection for high-dimensional sparse linear models. Stat Sin 21:1473–1513
  16. Ing C-K, Lai TL (2015) An efficient pathwise variable selection criterion in weakly sparse regression models. Technical Report, Academia Sinica
  17. Ing C-K, Lai TL (2016) Model selection for high-dimensional time series. Technical Report, Academia Sinica
  18. Ing C-K, Wei C-Z (2003) On same-realization prediction in an infinite-order autoregressive process. J Multivar Anal 85:130–155
  19. Negahban SN, Ravikumar P, Wainwright MJ, Yu B (2012) A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers. Stat Sci 27:538–557
  20. Raskutti G, Wainwright MJ, Yu B (2011) Minimax rates of estimation for high-dimensional linear regression over ℓq-balls. IEEE Trans Inform Theory 57:6976–6994
  21. Temlyakov VN (2000) Weak greedy algorithms. Adv Comput Math 12:213–227
  22. Tibshirani R (1996) Regression shrinkage and selection via the Lasso. J R Stat Soc Ser B 58:267–288
  23. Tropp JA (2004) Greed is good: algorithmic results for sparse approximation. IEEE Trans Inform Theory 50:2231–2242
  24. Wang Z, Paterlini S, Gao F, Yang Y (2014) Adaptive minimax regression estimation over sparse hulls. J Mach Learn Res 15:1675–1711
  25. Wei C-Z (1987) Adaptive prediction by least squares predictors in stochastic regression models with applications to time series. Ann Stat 15:1667–1682
  26. Wu WB, Wu YN (2016) Performance bounds for parameter estimates of high-dimensional linear models with correlated errors. Electron J Stat 10:352–379
  27. Zhang C-H, Huang J (2008) The sparsity and bias of the Lasso selection in high-dimensional linear regression. Ann Stat 36:1567–1594
  28. Zhao P, Yu B (2006) On model selection consistency of Lasso. J Mach Learn Res 7:2541–2563
  29. Zou H (2006) The adaptive Lasso and its oracle properties. J Am Stat Assoc 101:1418–1429

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Hsiang-Ling Hsu (1)
  • Ching-Kang Ing (2), corresponding author
  • Tze Leung Lai (3)

  1. National University of Kaohsiung, Kaohsiung, Taiwan
  2. National Tsing Hua University, Hsinchu, Taiwan
  3. Stanford University, Stanford, USA
