Abstract
The growing need for dealing with big data has made it necessary to find computationally efficient methods for identifying important factors to be considered in statistical modeling. In the linear model, the Lasso is an effective way of selecting variables using penalized regression. It has spawned substantial research in the area of variable selection for models that depend on a linear combination of predictors. However, work addressing the lack of optimality of variable selection when the model errors are not Gaussian and/or when the data contain gross outliers is scarce. We propose the weighted signed-rank Lasso as a robust and efficient alternative to least absolute deviations and least squares Lasso. The approach is appealing for use with big data since one can use data augmentation to perform the estimation as a single weighted \(L_1\) optimization problem. Selection and estimation consistency are theoretically established and evaluated via simulation studies. The results confirm the optimality of the rank-based approach for data with heavy-tailed and contaminated errors or data containing high-leverage points.
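The data-augmentation idea mentioned in the abstract can be sketched concretely. With Wilcoxon scores, the signed-rank dispersion of the residuals is proportional to the L1 norm of the Walsh-average residuals \((e_i + e_j)/2\), \(i \leq j\), so the penalized fit reduces to a single weighted L1 regression after stacking the pairwise averages with one LAD-Lasso-style pseudo-row per coefficient carrying the penalty. The sketch below is ours, not the authors' implementation: the helper names `signed_rank_lasso` and `l1_fit` are hypothetical, uniform observation weights are assumed (in the weighted method these would downweight high-leverage points), and the LP solver is an arbitrary choice.

```python
import numpy as np
from scipy.optimize import linprog


def l1_fit(A, y, w):
    """Weighted L1 regression: minimize sum_k w_k |y_k - A_k beta| via an LP.

    Variables are [beta (free), u >= 0, v >= 0] with residual = u - v."""
    m, p = A.shape
    c = np.concatenate([np.zeros(p), w, w])
    A_eq = np.hstack([A, np.eye(m), -np.eye(m)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * m)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]


def signed_rank_lasso(X, y, lam, weights=None):
    """Illustrative signed-rank Lasso via data augmentation (hypothetical helper).

    The Wilcoxon signed-rank dispersion equals the L1 norm of the Walsh-average
    residuals, so the penalized criterion becomes one weighted L1 problem after
    augmenting with pairwise row averages plus one penalty pseudo-row per
    coefficient, as in the LAD-Lasso of Wang, Li, and Jiang (2007)."""
    n, p = X.shape
    i, j = np.triu_indices(n)            # all pairs i <= j, diagonal included
    Xa = (X[i] + X[j]) / 2.0             # Walsh averages of predictor rows
    ya = (y[i] + y[j]) / 2.0             # Walsh averages of responses
    if weights is None:
        weights = np.ones(len(ya))       # uniform weights for this sketch
    # Penalty pseudo-rows: each contributes lam * |beta_k| to the L1 objective.
    # In practice lam would be tuned and scaled with n.
    A = np.vstack([Xa, lam * np.eye(p)])
    b = np.concatenate([ya, np.zeros(p)])
    w = np.concatenate([weights, np.ones(p)])
    return l1_fit(A, b, w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 3))
    y = X @ np.array([2.0, -1.0, 0.0]) + 0.1 * rng.normal(size=40)
    print(signed_rank_lasso(X, y, lam=1.0))
```

Because the entire criterion, penalty included, is one weighted L1 problem, any off-the-shelf quantile-regression or LP solver can be reused unchanged, which is what makes the approach attractive at scale.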
References
Abebe, A., McKean, J. W., & Bindele, H. F. (2012). On the consistency of a class of nonlinear regression estimators. Pakistan Journal of Statistics and Operation Research, 8(3), 543–555.
Arslan, O. (2012). Weighted LAD-LASSO method for robust parameter estimation and variable selection in regression. Computational Statistics & Data Analysis, 56(6), 1952–1965.
Bindele, H. F., & Abebe, A. (2012). Bounded influence nonlinear signed-rank regression. Canadian Journal of Statistics, 40(1), 172–189.
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360.
Hettmansperger, T. P., & McKean, J. W. (2011). Robust nonparametric statistical methods (2nd ed.). Monographs on Statistics and Applied Probability, Vol. 119. Boca Raton, FL: CRC Press.
Hössjer, O. (1994). Rank-based estimates in the linear model with high breakdown point. Journal of the American Statistical Association, 89(425), 149–158.
Johnson, B. A. (2009). Rank-based estimation in the ℓ1-regularized partly linear model for censored outcomes with application to integrated analyses of clinical predictors and gene expression data. Biostatistics, 10(4), 659–666.
Johnson, B. A., Lin, D., & Zeng, D. (2008). Penalized estimating functions and variable selection in semiparametric regression models. Journal of the American Statistical Association, 103(482), 672–680.
Johnson, B. A., & Peng, L. (2008). Rank-based variable selection. Journal of Nonparametric Statistics, 20(3), 241–252.
Leng, C. (2010). Variable selection and coefficient estimation via regularized rank regression. Statistica Sinica, 20(1), 167.
Rousseeuw, P. J. (1984). Least median of squares regression. Journal of the American Statistical Association, 79(388), 871–880.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58, 267–288.
Wang, H., & Leng, C. (2008). A note on adaptive group lasso. Computational Statistics & Data Analysis, 52(12), 5277–5286.
Wang, H., Li, G., & Jiang, G. (2007). Robust regression shrinkage and consistent variable selection through the LAD-lasso. Journal of Business & Economic Statistics, 25(3), 347–355.
Wang, L., & Li, R. (2009). Weighted Wilcoxon-type smoothly clipped absolute deviation method. Biometrics, 65(2), 564–571.
Wu, C. F. (1981). Asymptotic theory of nonlinear least squares estimation. Annals of Statistics, 9(3), 501–513.
Xu, J., Leng, C., & Ying, Z. (2010). Rank-based variable selection with censored data. Statistics and Computing, 20(2), 165–176.
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429.
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301–320.
Acknowledgements
We dedicate this work to Joseph W. McKean on the occasion of his 70th birthday. We are thankful for his mentorship and guidance over the years. We also thank the anonymous referee for suggestions that improved the presentation.
Appendix
This Appendix provides some lemmas and the proofs of the main results (Theorems 2.1 and 2.2). In the proofs we have taken W = I to simplify notation. The general case follows by taking \(W^{1/2}\boldsymbol{x}\) in place of \(\boldsymbol{x}\) in the proofs.
2.1.1 Proofs
The following three lemmas, whose proofs follow from slight modifications of those given in Hössjer (1994) and Hettmansperger and McKean (2011), are key to the proofs of the main results.
Lemma 2.1.
Under assumptions \((I_{1})\) and \((I_{2})\), we have \(\tilde{\boldsymbol{\beta }}_{n} \rightarrow \boldsymbol{\beta }_{0}\;a.s.\)
The proof of this lemma is given in Hössjer (1994) for w ≡ 1, and in Abebe et al. (2012) for any positive weight function w and a more general regression model. Also, as in Wu (1981), the proof of this lemma is obtained by showing that
where B is an open subset of \(\mathcal{B}\) and \(\boldsymbol{\beta }_{0} \in Int(B)\).
Lemma 2.2.
Putting \(U_{n}(\boldsymbol{\gamma },\boldsymbol{\beta }) = \frac{\|S_{n}(\boldsymbol{\gamma }) - S_{n}(\boldsymbol{\beta }) -\boldsymbol{\xi }(\boldsymbol{\gamma }) + \boldsymbol{\xi }(\boldsymbol{\beta })\|_{1}} {n^{-1/2} +\| \boldsymbol{\xi }(\boldsymbol{\gamma })\|_{1}}\) , we have for small enough δ > 0 that
This lemma ensures that \(n^{-1/2}S_{n}(\boldsymbol{\beta }_{0})\) converges in distribution to a multivariate normal distribution with mean zero and covariance matrix \(\gamma _{\varphi ^{+}}\varSigma\). It also yields the following asymptotic linearity, established in Hettmansperger and McKean (2011).
Lemma 2.3.
Under the assumption that the errors have finite Fisher information, we have, for all ε > 0 and C > 0,
From this asymptotic linearity, it follows that for all \(\boldsymbol{\beta }\) such that \(\|\boldsymbol{\beta }-\boldsymbol{\beta }_{0}\|_{1} \leq C/\sqrt{n}\), we have
Proof of Theorem 2.1.
Set \(B =\{\boldsymbol{\beta } _{0} + n^{-1/2}\mathbf{u}:\;\;\| \mathbf{u}\|_{1} < C\}\). Clearly B is an open neighborhood of \(\boldsymbol{\beta }_{0}\), and therefore \(B^{c}\) is a closed subset of \(\mathcal{B}\) not containing \(\boldsymbol{\beta }_{0}\). To complete the proof, it is then sufficient to show that
which from Lemma 1 of Wu (1981) will result in the \(\sqrt{n}\)-consistency of \(\hat{\boldsymbol{\beta }}_{n}\). Indeed,
Now, by the mean value theorem, assuming without loss of generality that \(\vert \beta _{0j}\vert < \vert \beta _{j}\vert\), there exists \(\alpha _{j} \in (\vert \beta _{0j}\vert,\vert \beta _{j}\vert )\) such that
and therefore
This, together with Eq. (2.13), implies that
as \(\boldsymbol{\beta }\in B^{c}\) implies that \(\boldsymbol{\beta }\) can be written as \(\boldsymbol{\beta }=\boldsymbol{\beta } _{0} + n^{-1/2}\mathbf{u}\) with \(\|\mathbf{u}\|_{1} \geq C\). Being a closed subset of a compact space, \(B^{c}\) is compact, and hence closed and bounded. Thus, there exists a constant M such that \(C \leq \|\mathbf{u}\|_{1} \leq M\). From the last term of Eq. (2.14), note that \(\sum _{j=1}^{p_{0} }\vert u_{j}\vert \leq \|\mathbf{u}\|_{1} \leq M\), from which we have \(-\sqrt{n}a_{n}\sum _{j=1}^{p_{0} }\vert u_{j}\vert \geq -\sqrt{n}a_{n}M\). Thus,
and so,
By assumption \((I_{3})\), \(\lim _{n\rightarrow \infty }\Big[\sqrt{n}a_{n}M\Big] = 0\), and by Lemma 2.1, we have
Proof of Theorem 2.2.
From the proof of Theorem 2.1, to obtain the oracle property it is sufficient to show that for any \(\boldsymbol{\beta }^{{\ast}}\) satisfying \(\|\boldsymbol{\beta }_{a}^{{\ast}}-\boldsymbol{\beta }_{0a}\|_{1} = O_{p}(n^{-1/2})\) and \(\vert \beta _{j}^{{\ast}}\vert < Cn^{-1/2}\) for \(j = p_{0} + 1,\ldots,d\), the quantities \(\frac{\partial Q(\boldsymbol{\beta })} {\partial \beta _{j}} \Big\vert _{\boldsymbol{\beta }=\boldsymbol{\beta }^{{\ast}}}\) and \(\beta _{j}^{{\ast}}\) have the same sign. Indeed,
where \(S_{n}^{j}(\boldsymbol{\beta }_{0})\) is the \(j\)th component of \(S_{n}(\boldsymbol{\beta }_{0})\). Note that by assumption \((I_{3})\), \(\sqrt{n}H_{ \lambda _{j}}(\vert \beta _{j}^{{\ast}}\vert ) \geq \sqrt{n}b_{ n} \rightarrow \infty \) as n → ∞, and thus the sign of \(\frac{\partial Q(\boldsymbol{\beta })} {\partial \beta _{j}} \Big\vert _{\boldsymbol{\beta }=\boldsymbol{\beta }^{{\ast}}}\) is fully determined by that of \(\beta _{j}^{{\ast}}\) for n large enough. This, together with Theorem 2.1, implies that \(\lim _{n\rightarrow \infty }P(\hat{\boldsymbol{\beta }}_{nb} = \mathbf{0}) = 1\).
Moreover, by definition of \(\hat{\boldsymbol{\beta }}_{n}\), it is obtained in a straightforward manner that \(\frac{\partial Q(\boldsymbol{\beta })} {\partial \boldsymbol{\beta }_{a}} \Big\vert _{\boldsymbol{\beta }=(\hat{\boldsymbol{\beta }}_{a},0)} = o_{P}(1)\). From this, partitioning \(S_{n}(\boldsymbol{\beta }_{0})\) as \((S_{n,a}(\boldsymbol{\beta }_{0}),S_{n,b}(\boldsymbol{\beta }_{0}))\), it follows from Eq. (2.12) that
and \(\vert \sqrt{n}\sum _{j=1}^{p_{0}}H_{\lambda _{ j}}(\vert \hat{\beta }_{na,j}\vert )\mbox{ sgn}(\hat{\beta }_{na,j})\vert \leq p_{0}\sqrt{n}a_{n} \rightarrow 0\) as n → ∞ by assumption \((I_{3})\). Hence,
As \(n^{-1/2}S_{n,a}(\boldsymbol{\beta }_{0})\mathop{\longrightarrow}\limits_{}^{\mathcal{D}}N\big(0,\ \gamma _{\varphi ^{+}}\varSigma _{a}\big)\), we have
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this paper
Abebe, A., Bindele, H.F. (2016). Robust Signed-Rank Variable Selection in Linear Regression. In: Liu, R., McKean, J. (eds) Robust Rank-Based and Nonparametric Methods. Springer Proceedings in Mathematics & Statistics, vol 168. Springer, Cham. https://doi.org/10.1007/978-3-319-39065-9_2
Print ISBN: 978-3-319-39063-5
Online ISBN: 978-3-319-39065-9