A robust self-weighted SELO regression model

  • Meihong Su
  • Yaqing Guo
  • Changqian Men
  • Wenjian Wang
Original Article


The linear regression model is a useful tool in machine learning and has been applied in diverse fields, including compressed sensing, computer vision and matrix analysis. For linear regression, variable selection and parameter estimation are the most fundamental tasks. The seamless-\(L_{0}\) penalty estimator (SELO), which performs variable selection and parameter estimation simultaneously, is attractive for its good theoretical properties and ease of computation. However, SELO is sensitive to outliers. Moreover, the seamless-\(L_{0}\) penalty is non-convex, so most existing computational methods can easily converge to poor local minima. To address these problems, a novel robust self-weighted SELO model (RSWSELO) is proposed for linear regression. Like SELO, RSWSELO is proved to be consistent in terms of parameter estimation and variable selection, which means that it converges to the oracle estimator (it is asymptotically equivalent to the least squares estimator constrained to the true nonzero coefficients). An adaptive regularizer is introduced into the proposed model; it assigns weights to the selected samples based on their losses during the iteration process. The weights are thus determined automatically by the model, which makes the proposed model much more robust than SELO. Furthermore, experimental results on simulation studies and UCI datasets demonstrate that the proposed model is effective and outperforms SELO in generalization performance.
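To make the two ingredients named in the abstract concrete, the following Python sketch shows the seamless-\(L_{0}\) penalty in the form given by Dicker et al. (2013), together with a simple loss-based sample weighting. The abstract does not state the exact adaptive regularizer used by RSWSELO, so the hard-threshold weighting below (in the style of self-paced learning) is only an assumption for illustration. The function names, the parameters `lam`, `tau` and `age`, and the alternating loop are likewise hypothetical. For brevity, the fitting step is plain weighted least squares and omits the non-convex penalty.

```python
import numpy as np

def selo_penalty(beta, lam=1.0, tau=0.01):
    """Seamless-L0 penalty, elementwise: (lam / log 2) * log(|b| / (|b| + tau) + 1).

    It is 0 at b = 0 and approaches lam as |b| grows, mimicking an L0 penalty.
    """
    a = np.abs(np.asarray(beta, dtype=float))
    return (lam / np.log(2.0)) * np.log(a / (a + tau) + 1.0)

def self_paced_weights(residuals, age):
    """ASSUMED weighting scheme: weight 1 if the squared loss is below `age`, else 0."""
    loss = np.asarray(residuals, dtype=float) ** 2
    return (loss < age).astype(float)

def weighted_least_squares(X, y, w):
    """Solve min_b sum_i w_i * (y_i - x_i^T b)^2 by rescaling rows with sqrt(w_i)."""
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef

def self_weighted_fit(X, y, age, n_iter=10):
    """Alternate between fitting coefficients and re-weighting samples by their loss."""
    w = np.ones(len(y))
    for _ in range(n_iter):
        coef = weighted_least_squares(X, y, w)
        w = self_paced_weights(y - X @ coef, age)
    return coef, w

if __name__ == "__main__":
    # Toy data: y = 2 + 3x with one gross outlier at index 5.
    x = np.arange(10.0)
    X = np.column_stack([np.ones_like(x), x])
    y = 2.0 + 3.0 * x
    y[5] += 50.0
    coef, w = self_weighted_fit(X, y, age=100.0)
    print("coefficients:", coef)
    print("sample weights:", w)
    print("SELO penalty of coefficients:", selo_penalty(coef))
```

In this toy setting the contaminated sample has a much larger loss than the others after the first fit, so it receives weight zero and no longer influences later fits. A full implementation would add the penalty term to the coefficient update and would typically use a schedule for `age` rather than a fixed value.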


Self-weighted, Robust, Non-convex, Linear regression



This work was supported by the National Natural Science Foundation of China (Nos. 61673249, U1805263) and the Research Project Supported by Shanxi Scholarship Council of China (No. 2016-004).



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  • Meihong Su (2)
  • Yaqing Guo (2)
  • Changqian Men (2)
  • Wenjian Wang (1), corresponding author
  1. Key Laboratory of Computational Intelligence and Chinese Information Processing (Shanxi University), Ministry of Education, Taiyuan, China
  2. School of Computer and Information Technology, Shanxi University, Taiyuan, China
