Computational Statistics, Volume 34, Issue 1, pp 415–432

Improving the prediction performance of the LASSO by subtracting the additive structural noises

  • Morteza Amini
  • Mahdi Roozbeh
Original Paper


It is shown that the prediction performance of the LASSO method on high-dimensional data sets improves when structural noise is subtracted through a sparse additive partially linear model. To estimate the parameters, a combination of the partial residual estimation method and the back-fitting algorithm is proposed, with the LASSO applied to the predictors of the linear part. The method is applied to the riboflavin production data set, and a simulation study is conducted to examine its performance.
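The estimation idea can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes a model of the form y = Xβ + f_1(t_1) + ... + f_q(t_q) + ε, a Gaussian Nadaraya-Watson smoother for each additive component, a plain coordinate-descent LASSO for the linear part, and fixed values of the penalty `lam` and the `bandwidth`. All function names (`kernel_smooth`, `lasso_cd`, `fit_sparse_aplm`, `demo`) and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np


def kernel_smooth(t, r, bandwidth):
    """Nadaraya-Watson estimate of E[r | t] at the observed points (Gaussian kernel)."""
    d = (t[:, None] - t[None, :]) / bandwidth
    w = np.exp(-0.5 * d ** 2)
    return (w @ r) / w.sum(axis=1)


def lasso_cd(X, y, lam, n_iter=100, tol=1e-6):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float)  # residual for beta = 0 (astype returns a copy)
    for _ in range(n_iter):
        max_step = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            step = new - beta[j]
            if step != 0.0:
                resid -= X[:, j] * step
                beta[j] = new
                max_step = max(max_step, abs(step))
        if max_step < tol:
            break
    return beta


def fit_sparse_aplm(X, T, y, lam, bandwidth, n_outer=10):
    """Alternate between smoothing partial residuals and a LASSO fit (illustrative)."""
    n, p = X.shape
    q = T.shape[1]
    Xc = X - X.mean(axis=0)  # centre columns so no LASSO intercept is needed
    intercept = y.mean()
    beta = np.zeros(p)
    F = np.zeros((n, q))  # fitted additive components at the observed points
    for _ in range(n_outer):
        # Back-fitting pass: smooth the partial residual of each component in turn.
        for k in range(q):
            partial = y - intercept - Xc @ beta - F.sum(axis=1) + F[:, k]
            fk = kernel_smooth(T[:, k], partial, bandwidth)
            F[:, k] = fk - fk.mean()  # centre each component for identifiability
        # LASSO on the response with the additive part subtracted.
        beta = lasso_cd(Xc, y - intercept - F.sum(axis=1), lam)
    return intercept, beta, F


def demo(seed=0):
    """Simulated example with more predictors than observations (assumed setup)."""
    rng = np.random.default_rng(seed)
    n, p = 80, 120
    X = rng.standard_normal((n, p))
    T = rng.uniform(0.0, 1.0, size=(n, 1))
    beta_true = np.zeros(p)
    beta_true[:2] = [3.0, -2.0]
    y = X @ beta_true + np.sin(2 * np.pi * T[:, 0]) + 0.3 * rng.standard_normal(n)
    return fit_sparse_aplm(X, T, y, lam=0.2, bandwidth=0.1), beta_true
```

Calling `demo()` returns the fitted intercept, the sparse coefficient vector, and the fitted additive components alongside the true coefficients, so the estimated support can be compared with the simulated one. In practice the penalty and bandwidth would be tuned, for example by cross-validation, rather than fixed as they are here.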


Additive partially linear model, High dimensional, LASSO, Kernel smoothing, Sparsity



The authors would like to thank two anonymous reviewers for their valuable comments on an earlier version of this paper. The research of the first author was partially supported by the University of Tehran under Grant Number 28773/1/02. The second author's research was supported by Grant No. 266/97/17577 from the Research Councils of Semnan University, Iran.

Supplementary material

180_2018_849_MOESM1_ESM.pdf (399 kb)
Supplementary material 1 (pdf 398 KB)



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Statistics, School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
  2. Department of Statistics, Faculty of Mathematics, Statistics and Computer Sciences, Semnan University, Semnan, Iran
