Abstract
The linear regression model is a useful tool in machine learning and has been applied in diverse fields, including compressed sensing, computer vision, and matrix analysis. For the linear regression model, variable selection and parameter estimation are the most fundamental tasks. The seamless-\(L_{0}\) penalty estimator (SELO), which performs variable selection and parameter estimation simultaneously, is attractive for its good theoretical properties and easy computation. However, SELO is sensitive to outliers. Moreover, the seamless-\(L_{0}\) penalty is non-convex, so most existing computing methods may easily converge to (bad) local minima. To address these problems, a novel robust self-weighted SELO model (RSWSELO) is proposed for linear regression. Like the SELO model, the RSWSELO is proved to be consistent in terms of parameter estimation and variable selection, which means that it converges to the oracle estimator (that is, it is asymptotically equivalent to the least squares estimator constrained to the true nonzero coefficients). An adaptive regularizer is introduced into the proposed model; it assigns weights to the selected samples based on their losses during the iteration process. The weights are thus determined automatically by the model, which makes the proposed model much more robust than the SELO model. Furthermore, experimental results on simulation studies and UCI datasets demonstrate that the proposed model is effective and outperforms SELO in generalization performance.
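To make the two ingredients named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. The SELO penalty follows the form given by Dicker et al. (2013). The abstract does not specify the adaptive regularizer, so the weighting rule here is an assumption: a hard-threshold rule borrowed from self-paced learning (Kumar et al. 2010). The function names and the parameters `lam`, `tau`, and `age` are hypothetical.

```python
import numpy as np


def selo_penalty(beta, lam=1.0, tau=0.01):
    """SELO penalty (Dicker et al. 2013), summed over coefficients.

    Each term is (lam / log 2) * log(|b| / (|b| + tau) + 1), which is 0 at
    b = 0 and approaches lam as |b| grows, mimicking an L0 penalty.
    """
    abs_beta = np.abs(np.asarray(beta, dtype=float))
    return (lam / np.log(2.0)) * np.sum(np.log(abs_beta / (abs_beta + tau) + 1.0))


def hard_sample_weights(residuals, age):
    """Assumed weighting rule (not taken from the paper): a sample gets
    weight 1 if its squared loss is below the threshold `age`, else 0."""
    losses = np.asarray(residuals, dtype=float) ** 2
    return (losses < age).astype(float)


def weighted_selo_objective(X, y, beta, weights, lam=1.0, tau=0.01):
    """Weighted least squares loss plus the SELO penalty."""
    residuals = y - X @ beta
    n = len(y)
    return np.sum(weights * residuals ** 2) / (2.0 * n) + selo_penalty(beta, lam, tau)


# Tiny illustration: the last response is a gross outlier.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
beta = np.array([1.0, 0.0])
y = X @ beta + np.array([0.1, -0.1, 0.05, 10.0])

w = hard_sample_weights(y - X @ beta, age=1.0)  # outlier receives weight 0
obj = weighted_selo_objective(X, y, beta, w, lam=0.5, tau=0.01)
```

In an iterative scheme of this kind, the weights and the coefficients would be updated alternately, so samples with large losses are down-weighted automatically instead of being flagged by hand.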
Change history
28 February 2020
In the print published article, the reference 19 was published incorrectly and the correct reference is given below.
References
Akaike H (1992) Information theory and an extension of the maximum likelihood principle. Wiley, New York
Antoniadis A, Fan J (2001) Regularization of wavelet approximations. J Am Stat Assoc 96(455):939–967
Arslan O (2012) Weighted lad-lasso method for robust parameter estimation and variable selection in regression. Comput Stat Data Anal 56(6):1952–1965
Breiman L (1996) Heuristics of instability and stabilization in model selection. Ann Stat 24(6):2350–2383
Dicker L, Huang B, Lin X (2013) Variable selection and estimation with the seamless-L0 penalty. Stat Sinica 23(2):929–962
Fan J, Fan Y, Barut E (2014) Adaptive robust variable selection. Ann Stat 42:324–351
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96(456):1348–1360
Gu N, Fan M, Meng D (2016) Robust semi-supervised classification for noisy labels based on self-paced learning. IEEE Signal Process Lett 23(12):1806–1810
Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference, and prediction. Springer, New York
Kaul A, De Leeuw J (2015) Weighted l1-penalized corrected quantile regression for high dimensional measurement error models. J Multivar Anal 140:72–91
Kim Y, Choi H (2008) Smoothly clipped absolute deviation on high dimensions. J Am Stat Assoc 103(484):1665–1673
Kumar MP, Packer B, Koller D (2010) Self-paced learning for latent variable models. In: Advances in neural information processing systems 23: 24th annual conference on neural information processing systems 2010, Curran Associates Inc., 6–9 Dec 2010, Vancouver, British Columbia, Canada. https://doi.org/10.1080/00401706.1993.10485385
Lai Z, Kong H (2018) Robust jointly sparse embedding for dimensionality reduction. Neurocomputing 103(484):1665–1673
Li C, Wei F, Yan J, Zhang X, Liu Q, Zha H (2018) A self-paced regularization framework for multilabel learning. IEEE Trans Neural Netw Learn Syst 29(6):2660–2666
Li Y, Zhu J (2008) L1-norm quantile regression. J Comput Graph Stat 17(1):163–185
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc B 58:267–288
Wang L (2013) The l1 penalized lad estimator for high dimensional linear regression. J Multivar Anal 120(9):135–151
Wang L (2017) Weighted robust lasso and adaptive elastic net method for regularization and variable selection in robust regression with optimal scaling transformations. Am J Math Stat 7(2):71–77
Wu Y, Liu Y (2009) Variable selection in quantile regression. Stat Sinica 19(19):801–817
Xu Z, Zhang H (2010) L1/2 regularization. Sci China (Inform Sci) 53(6):1159–1169
Zhang C (2010) Nearly unbiased variable selection under minimax concave penalty. Ann Stat 38(2):894–942
Zhao Q, Meng D, Jiang L, Xie Q, Xu Z, Hauptmann AG (2015) Self-paced learning for matrix factorization. In: Twenty-ninth AAAI conference on artificial intelligence, pp 3196–3202
Zou H, Yuan M (2008) Composite quantile regression and the oracle model selection theory. Ann Stat 36(3):1108–1126
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 61673249, U1805263) and the Research Project Supported by Shanxi Scholarship Council of China (No. 2016-004).
Cite this article
Su, M., Guo, Y., Men, C. et al. A robust self-weighted SELO regression model. Int. J. Mach. Learn. & Cyber. 10, 3189–3199 (2019). https://doi.org/10.1007/s13042-019-01009-1
DOI: https://doi.org/10.1007/s13042-019-01009-1