
Linear Regression via Elastic Net: Non-enumerative Leave-One-Out Verification of Feature Selection

Chapter in Clusters, Orders, and Trees: Methods and Applications

Part of the book series: Springer Optimization and Its Applications (SOIA, volume 92)


Abstract

The feature-selective non-quadratic Elastic Net criterion of regression estimation is completely determined by two numerical regularization parameters which penalize, respectively, the squared and absolute values of the regression coefficients under estimation. It is an inherent property of the minimum of the Elastic Net that the values of the regularization parameters completely determine a partition of the variable set into three subsets of negative, positive, and strictly zero values, so that the former two subsets and the latter subset are, respectively, associated with “informative” and “redundant” features. We propose in this paper to treat this partition as a secondary structural parameter to be verified by leave-one-out cross-validation. Once the partition is fixed, we show that there exists a non-enumerative method for computing the leave-one-out error, which makes it possible to evaluate the generalization ability of the model and to tune the structural parameters without repeated training.
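To make the role of the partition concrete, the following minimal sketch (not taken from the chapter; the coefficient vector is illustrative) shows how the three index subsets are read off a fitted Elastic Net coefficient vector. The chapter's point is that, once this partition is held fixed, the leave-one-out residuals can be computed in closed form without refitting.

import numpy as np

# Hypothetical fitted elastic-net coefficient vector (illustrative data only).
a_hat = np.array([0.8, 0.0, -0.3, 0.0, 1.2])

I_plus  = np.flatnonzero(a_hat > 0)    # "informative" features with positive coefficients
I_minus = np.flatnonzero(a_hat < 0)    # "informative" features with negative coefficients
I_zero  = np.flatnonzero(a_hat == 0)   # "redundant" features suppressed by the L1 penalty

print(I_minus, I_zero, I_plus)         # -> [2] [1 3] [0 4]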


Notes

  1.

    In [1], the denominators in (5) have the form \(1 +\lambda _{2}\) instead of \(1 +\lambda _{2}/N\). This is a consequence of the specific normalization of the training set \(\sum \nolimits _{j=1}^{N}\!x_{ij}^{2} = 1\) used there, in contrast to the commonly adopted normalization \((1/N)\sum \nolimits _{j=1}^{N}\!x_{ij}^{2} = 1\) accepted in this paper (2).

  2.

    http://en.wikipedia.org/wiki/Woodbury_matrix_identity.

References

  1. Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. Roy. Stat. Soc. Ser. B 67(2), 301–320 (2005)

  2. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. Roy. Stat. Soc. Ser. B 58(1), 267–288 (1996)

  3. Ye, G., Chen, Y., Xie, X.: Efficient variable selection in support vector machines via the alternating direction method of multipliers. J. Mach. Learn. Res. Proc. Track, 832–840 (2011)

  4. Wang, L., Zhu, J., Zou, H.: The doubly regularized support vector machine. Stat. Sinica 16, 589–615 (2006)

  5. Grosswindhager, S.: Using penalized logistic regression models for predicting the effects of advertising material (2009). http://publik.tuwien.ac.at/files/PubDat_179921.pdf

  6. Friedman, J., Hastie, T., Tibshirani, R.: Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33(1), 1–22 (2010)

  7. Christensen, R.: Plane Answers to Complex Questions: The Theory of Linear Models, 3rd edn. Springer, New York (2010)

  8. Efron, B., Hastie, T., Johnstone, I., Tibshirani, R.: Least angle regression. Ann. Stat. 32(2), 407–499 (2004)


Author information


Corresponding author

Correspondence to Elena Chernousova.


Appendix

1.1 Proof of Theorem 1

Let us expand the brackets in (5):

$$\displaystyle\begin{array}{rcl} J_{\mathrm{EN}}(\mathbf{a}\vert \lambda _{1},\lambda _{2})& =& \lambda _{1}\|\mathbf{a}\|_{1} +\lambda _{2}\mathbf{a}^{T}\mathbf{a} - 2 \frac{\lambda _{2}} {N}\mathbf{a}^{T}\mathbf{X}^{T}\mathbf{y} {}\\ & & +\mathop{\underbrace{ \frac{\lambda _{2}} {N^{2}} \mathbf{y}^{T}\mathbf{X}\mathbf{X}^{T}\mathbf{y} + \mathbf{y}^{T}\mathbf{y}}}\limits _{\mathrm{con\!st}} - 2\mathbf{a}^{T}\mathbf{X}^{T}\mathbf{y} + \mathbf{a}^{T}\mathbf{X}^{T}\mathbf{X}\mathbf{a} \rightarrow \min (\mathbf{a}). {}\\ \end{array}$$

Summands not depending on a may be omitted from the optimization. Collecting the remaining summands gives:

$$\displaystyle{J_{\mathrm{EN}}(\mathbf{a}\vert \lambda _{1},\lambda _{2}) =\lambda _{1}\|\mathbf{a}\|_{1} + \mathbf{a}^{T}{\bigl (\mathbf{X}^{T}\mathbf{X} +\lambda _{2}\mathbf{I}\bigr )}\mathbf{a} - 2{\Bigl (1 + \frac{\lambda _{2}} {N}\Bigr )}\mathbf{a}^{T}\mathbf{X}^{T}\mathbf{y} \rightarrow \min (\mathbf{a}).}$$

Division of the last equality by the constant \((1 +\lambda _{2}/N)\) yields (8). The theorem is proven.
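The expansion above is easy to check numerically. The sketch below is my own code, with the form of criterion (5) reconstructed from the expansion as \(\lambda _{1}\|\mathbf{a}\|_{1} +\lambda _{2}\|\mathbf{a} -\mathbf{a}^{{\ast}}\|^{2} +\|\mathbf{y} -\mathbf{X}\mathbf{a}\|^{2}\) with \(\mathbf{a}^{{\ast}} = (1/N)\mathbf{X}^{T}\mathbf{y}\) (an assumption on my part); it verifies that the full and the collected criteria differ only by a summand that does not depend on a, so they share the same minimizer.

import numpy as np

rng = np.random.default_rng(0)
N, n = 20, 5
X = rng.standard_normal((N, n))       # rows are training entities
y = rng.standard_normal(N)
lam1, lam2 = 0.7, 1.3
a_star = X.T @ y / N                  # preliminary ridge-type estimate, EN case

def J_full(a):
    # reconstructed criterion (5): L1 term + proximity to a_star + residual sum of squares
    return lam1 * np.abs(a).sum() + lam2 * np.sum((a - a_star) ** 2) + np.sum((y - X @ a) ** 2)

def J_collected(a):
    # the collected form obtained after dropping the constant summands
    return (lam1 * np.abs(a).sum()
            + a @ (X.T @ X + lam2 * np.eye(n)) @ a
            - 2.0 * (1.0 + lam2 / N) * (a @ X.T @ y))

a1, a2 = rng.standard_normal(n), rng.standard_normal(n)
# The gap J_full - J_collected is the omitted constant, identical for any a.
print(np.isclose(J_full(a1) - J_collected(a1), J_full(a2) - J_collected(a2)))  # True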

1.2 Proof of Theorem 2

Differentiation of (11) with respect to the active regression coefficients \(a_{i}\), \(i\notin \hat{I}_{\lambda _{1},\lambda _{2}}^{0}\), leads to the equalities

$$\displaystyle\begin{array}{rcl} & & \frac{\partial } {\partial a_{i}}J_{\mathrm{EN}}{\bigl (a_{l},l\notin \hat{I}_{\lambda _{1},\lambda _{2}}^{0}\vert \lambda _{1},\lambda _{2}\bigr )} {}\\ & & \quad = 2\lambda _{2}(a_{i} - a_{i}^{{\ast}}) + \left (\begin{array}{l} \;\;\,\lambda _{1},\;i \in \hat{I}_{\lambda _{1},\lambda _{2}}^{+} \\ -\lambda _{1},\;i \in \hat{I}_{\lambda _{1},\lambda _{2}}^{-} \end{array} \right ) - 2\sum _{j=1}^{N}x_{ij}{\Bigl (y_{j} -\sum _{l\notin \hat{I}_{\lambda _{1},\lambda _{2}}^{0}}a_{l}x_{lj}\Bigr )} = 0,{}\\ \end{array}$$

which form a system of linear equations over \(i\notin \hat{I}_{\lambda _{1},\lambda _{2}}^{0}\):

$$\displaystyle{\lambda _{2}a_{i}+\sum _{l\notin \hat{I}_{\lambda _{1},\lambda _{2}}^{0}}{\Biggl (\sum _{j=1}^{N}x_{ij}x_{lj}\Biggr )}a_{l} =\sum _{j=1}^{N}x_{ij}y_{j}-\frac{\lambda _{1}} {2}\left (\begin{array}{l} \;\;\,1,\;i \in \hat{I}_{\lambda _{1},\lambda _{2}}^{+} \\ -1,\;i \in \hat{I}_{\lambda _{1},\lambda _{2}}^{-} \end{array} \right )+\lambda _{2}a_{i}^{{\ast}}.}$$

In accordance with notations (12)–(14), the matrix form of this system is exactly (16), and (13) is its solution. The theorem is proven.
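In computational terms, Theorem 2 says that once the sign partition is fixed, the active coefficients are obtained from a single ridge-like linear system, with no iterative L1 solver needed at this stage. A minimal sketch under this reading (function and argument names are mine; for EN the preliminary estimate is taken as \((1/N)\tilde{\mathbf{X}}^{T}\mathbf{y}\), for NEN it is zero):

import numpy as np

def active_coefficients(X_act, y, signs, lam1, lam2, naive=False):
    """Solve the linear system of Theorem 2 for the active (non-zero) coefficients.

    X_act : (N, n_act) matrix of the active features; signs : +1/-1 per active feature.
    """
    N, n_act = X_act.shape
    a_star = np.zeros(n_act) if naive else X_act.T @ y / N   # preliminary estimate: NEN vs EN
    A = X_act.T @ X_act + lam2 * np.eye(n_act)
    b = X_act.T @ y - 0.5 * lam1 * signs + lam2 * a_star
    return np.linalg.solve(A, b)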

1.3 Proof of Theorem 3

Let the feature set partitioning \({\bigl \{\hat{I}_{\lambda _{1},\lambda _{2}}^{-},\hat{I}_{\lambda _{1},\lambda _{2}}^{0},\hat{I}_{\lambda _{1},\lambda _{2}}^{+}\bigr \}}\) (9) at the minimum point of (5) be treated as fixed, and let the kth entity \((\mathbf{x}_{k},y_{k})\) be omitted from the training set (1). In terms of notations (4) and (2), this implies deletion of the kth element from the vector \(\mathbf{y} \in \mathbb{R}^{N}\) and of the kth row from the matrix \(\tilde{\mathbf{X}}_{\lambda _{1},\lambda _{2}}\) \((N \times \hat{ n}_{\lambda _{1},\lambda _{2}})\):

$$\displaystyle{\mathbf{y}^{(k)} \in \mathbb{R}^{N-1},\;\tilde{\mathbf{X}}_{\lambda _{ 1},\lambda _{2}}^{(k)}{\bigl ((N - 1) \times \hat{ n}_{\lambda _{ 1},\lambda _{2}}\bigr )}.}$$

The vector of preliminary estimates of the regression coefficients \(\mathbf{a}^{{\ast}} \in \mathbb{R}^{n}\) (12) occurs only in the Elastic Net (EN) training criterion (5) and equals zero, \(\mathbf{a}^{{\ast}} = \mathbf{0} \in \mathbb{R}^{n}\), in the naive Elastic Net (NEN) (3). Its subvector over the active features, recomputed after deletion of the kth entity, is:

$$\displaystyle{\tilde{\mathbf{a}}_{\lambda _{1},\lambda _{2}}^{{\ast}(k)} = \left \{\!\begin{array}{ll} \frac{1} {N - 1}\sum _{j=1,j\neq k}^{N}y_{ j}\tilde{\mathbf{x}}_{j,\lambda _{1},\lambda _{2}} = \frac{1} {N - 1}(\tilde{\mathbf{X}}_{\lambda _{1},\lambda _{2}}^{(k)})^{T}\mathbf{y}^{(k)} \in \mathbb{R}^{\hat{n}_{\lambda _{1},\lambda _{2}} },&\mathit{EN} \\ \mathbf{0} \in \mathbb{R}^{\hat{n}_{\lambda _{1},\lambda _{2}}}, &\mathit{NEN} \end{array} \right.}$$

Correspondingly, the solution (13) of the optimization problem (11) takes the form (the subscripts \((\lambda _{1},\lambda _{2})\) are omitted below):

$$\displaystyle{ \hat{\tilde{\mathbf{a}}}^{(k)} ={\bigl ( (\tilde{\mathbf{X}}^{(k)})^{T}\tilde{\mathbf{X}}^{(k)}+\lambda _{2}\tilde{\mathbf{I}}_{\hat{n}}\bigr )}^{-1}\!\left \{\!(\tilde{\mathbf{X}}^{(k)})^{T}\mathbf{y}^{(k)} -\!\frac{\lambda _{1}} {2}\tilde{\mathbf{e}} +\! \left [\!\begin{array}{ll} \lambda _{2}\tilde{\mathbf{a}}^{{\ast}(k)},&\mathit{EN} \\ \mathbf{0}, &\mathit{NEN} \end{array} \right ]\!\right \}. }$$
(23)

Notice here that

$$\displaystyle{ \left \{\begin{array}{l} (\tilde{\mathbf{X}}^{(k)})^{T}\tilde{\mathbf{X}}^{(k)} =\tilde{ \mathbf{X}}^{T}\tilde{\mathbf{X}} -\tilde{\mathbf{x}}_{k}\tilde{\mathbf{x}}_{k}^{T}, \\ (\tilde{\mathbf{X}}^{(k)})^{T}\mathbf{y}^{(k)} =\tilde{ \mathbf{X}}^{T}\mathbf{y} - y_{k}\tilde{\mathbf{x}}_{k}, \\ \tilde{\mathbf{a}}^{{\ast}(k)} = \frac{1} {N - 1}{\bigl [\tilde{\mathbf{X}}^{T}\mathbf{y} - y_{k}\tilde{\mathbf{x}}_{k}\bigr ]} = \frac{N} {N - 1}\tilde{\mathbf{a}}^{{\ast}}- \frac{1} {N - 1}y_{k}\tilde{\mathbf{x}}_{k} =\tilde{ \mathbf{a}}^{{\ast}}- \frac{1} {N - 1}{\bigl (y_{k}\tilde{\mathbf{x}}_{k} -\tilde{\mathbf{a}}^{{\ast}}\bigr )}. \end{array} \right. }$$
(24)
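These deletion identities are elementary rank-one updates; a quick numerical check on synthetic data (my own code, with Xt standing for the active-feature matrix \(\tilde{\mathbf{X}}\)):

import numpy as np

rng = np.random.default_rng(1)
N, n_act = 12, 4
Xt = rng.standard_normal((N, n_act))      # active-feature matrix (rows are entities)
y = rng.standard_normal(N)
k = 3
Xk, yk = np.delete(Xt, k, axis=0), np.delete(y, k)   # training set with the kth entity removed
xk = Xt[k]

print(np.allclose(Xk.T @ Xk, Xt.T @ Xt - np.outer(xk, xk)))            # first identity in (24)
print(np.allclose(Xk.T @ yk, Xt.T @ y - y[k] * xk))                    # second identity in (24)
a_star, a_star_k = Xt.T @ y / N, Xk.T @ yk / (N - 1)
print(np.allclose(a_star_k, a_star - (y[k] * xk - a_star) / (N - 1)))  # third identity in (24)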

Application of the Woodbury formula (see Footnote 2)

$$\displaystyle{(\mathbf{A} + \mathbf{B}\mathbf{C})^{-1} = \mathbf{A}^{-1} -\mathbf{A}^{-1}\mathbf{B}{\bigl (\mathbf{I} + \mathbf{C}\mathbf{A}^{-1}\mathbf{B}\bigr )}^{-1}\mathbf{C}\mathbf{A}^{-1}}$$

and (24) to (23) yields:

$$\displaystyle\begin{array}{rcl} & & \!\!\!\!\hat{\tilde{\mathbf{a}}}^{(k)} ={\Bigl (\mathop{\underbrace{ \tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{ 2}\tilde{\mathbf{I}}}}\limits _{\mathbf{A}} +\mathop{\underbrace{ (-\tilde{\mathbf{x}}_{k})}}\limits _{\mathbf{B}}\mathop{\underbrace{ \tilde{\mathbf{x}}_{k}^{T}}}\limits _{ \mathbf{C}}\Bigr )}^{-1} {}\\ & & \qquad \quad \!\!\!\! \times \left \{\tilde{\mathbf{X}}^{T}\mathbf{y} - y_{ k}\tilde{\mathbf{x}}_{k} -\frac{\lambda _{1}} {2}\tilde{\mathbf{e}} +\lambda _{2}\left [\begin{array}{ll} \tilde{\mathbf{a}}^{{\ast}}- \frac{1} {N - 1}{\bigl (y_{k}\tilde{\mathbf{x}}_{k} -\tilde{\mathbf{a}}^{{\ast}}\bigr )},&EN \\ \mathbf{0}, &NEN \end{array} \right ]\right \} {}\\ & & \qquad \!\!\!\!\! =\hat{\tilde{ \mathbf{a}}} + \frac{{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{k}\tilde{\mathbf{x}}_{k}^{T}\hat{\tilde{\mathbf{a}}}} {1 -\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{k}} - \frac{y_{k}} {1 -\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{k}}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{ 2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{ k} {}\\ & & \!\!\!\!\!\quad \qquad - \frac{\lambda _{2}} {N - 1}\!\left [\!\begin{array}{ll} {\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}}\,+\,\lambda _{ 2}\tilde{\mathbf{I}}\bigr )}^{-1} + \frac{{\bigl (\tilde{\mathbf{X}}^{T}\!\tilde{\mathbf{X}} +\!\lambda _{ 2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{ k}\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\!\tilde{\mathbf{X}} +\!\lambda _{ 2}\tilde{\mathbf{I}}\bigr )}^{-1}} {1 -\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{k}} {\bigl (y_{k}\tilde{\mathbf{x}}_{k} -\tilde{\mathbf{a}}^{{\ast}}\bigr )},&EN \\ \mathbf{0}, &NEN \end{array} \!\right ]. {}\\ \end{array}$$

Algebraic transformation of this expression with respect to the notation \(\hat{y}_{k} =\tilde{ \mathbf{x}}_{k}^{T}\hat{\tilde{\mathbf{a}}}\) (16) and \(\hat{y}_{k}^{(k)} =\tilde{ \mathbf{x}}_{k}^{T}\hat{\tilde{\mathbf{a}}}^{(k)}\) (18) leads to the equality

$$\displaystyle\begin{array}{rcl} \tilde{\mathbf{x}}_{k}^{T}\hat{\tilde{\mathbf{a}}}^{(k)}& =& \frac{\hat{y}_{k}} {1 -\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{k}} - y_{k} \frac{\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{k}} {1 -\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{k}} {}\\ & & - \frac{\lambda _{2}} {N - 1}\left [\begin{array}{ll} \frac{\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}(y_{k}\tilde{\mathbf{x}}_{k} -\tilde{\mathbf{a}}^{{\ast}})} {1 -\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{k}},&EN \\ \mathbf{0}, &NEN \end{array} \right ].{}\\ \end{array}$$

Thus, the leave-one-out residuals \(\hat{\delta }_{k}^{(k)}\) in (17) and (18) permit the representation

$$\displaystyle\begin{array}{rcl} \hat{\delta }_{k}^{(k)}& =& y_{k} -\tilde{\mathbf{x}}_{k}^{T}\hat{\tilde{\mathbf{a}}}^{(k)} {}\\ & =& y_{k} - \frac{\hat{y}_{k}} {1 -\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{k}} + y_{k} \frac{\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{k}} {1 -\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{k}} {}\\ & & + \frac{\lambda _{2}} {N - 1}\left [\begin{array}{ll} \frac{\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}(y_{k}\tilde{\mathbf{x}}_{k} -\tilde{\mathbf{a}}^{{\ast}})} {1 -\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{k}},&EN \\ \mathbf{0}, &NEN \end{array} \right ] {}\\ & =& \frac{y_{k} -\hat{ y}_{k}} {1 -\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{k}} {}\\ & & + \frac{\lambda _{2}} {N - 1}\left [\begin{array}{ll} \frac{\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}(y_{k}\tilde{\mathbf{x}}_{k} -\tilde{\mathbf{a}}^{{\ast}})} {1 -\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{k}},&EN \\ \mathbf{0}, &NEN \end{array} \right ] {}\\ & =& \frac{\delta _{k} + \frac{\lambda _{2}} {N - 1}\left [\begin{array}{ll} \tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}(y_{k}\tilde{\mathbf{x}}_{k} -\tilde{\mathbf{a}}^{{\ast}}),&EN \\ \mathbf{0}, &NEN \end{array} \right ]} {1 -\tilde{\mathbf{x}}_{k}^{T}{\bigl (\tilde{\mathbf{X}}^{T}\tilde{\mathbf{X}} +\lambda _{2}\tilde{\mathbf{I}}\bigr )}^{-1}\tilde{\mathbf{x}}_{k}}.{}\\ \end{array}$$

Substitution of \(\hat{\delta }_{k}^{(k)}\) into (17), using notation (21), yields (19) and (20).

The theorem is proven.
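The practical content of Theorem 3 is that, with the partition held fixed, every leave-one-out residual can be obtained from the full-sample fit alone. The sketch below is my own code for the EN case on synthetic data (variable names are mine; the sign vector e plays the role of partition (9), and the fitting formulas are the reconstructed forms of (13) and (23) used above); it compares the closed-form residual from the last display with an explicit refit for every left-out entity.

import numpy as np

rng = np.random.default_rng(2)
N, n_act = 15, 3
Xt = rng.standard_normal((N, n_act))          # active-feature matrix
y = rng.standard_normal(N)
lam1, lam2 = 0.5, 2.0
e = np.array([1.0, -1.0, 1.0])                # fixed signs of the active coefficients
a_star = Xt.T @ y / N                         # preliminary estimate (12), EN case

A = Xt.T @ Xt + lam2 * np.eye(n_act)
a_hat = np.linalg.solve(A, Xt.T @ y - 0.5 * lam1 * e + lam2 * a_star)   # full-sample fit

for k in range(N):
    # explicit leave-one-out refit with the partition held fixed
    Xk, yk = np.delete(Xt, k, axis=0), np.delete(y, k)
    a_star_k = Xk.T @ yk / (N - 1)
    Ak = Xk.T @ Xk + lam2 * np.eye(n_act)
    a_hat_k = np.linalg.solve(Ak, Xk.T @ yk - 0.5 * lam1 * e + lam2 * a_star_k)
    delta_explicit = y[k] - Xt[k] @ a_hat_k

    # non-enumerative formula from the end of the proof, using only full-sample quantities
    xk = Xt[k]
    h = xk @ np.linalg.solve(A, xk)
    delta_k = y[k] - xk @ a_hat
    corr = lam2 / (N - 1) * (xk @ np.linalg.solve(A, y[k] * xk - a_star))
    delta_closed = (delta_k + corr) / (1.0 - h)

    assert np.isclose(delta_explicit, delta_closed)
print("closed-form LOO residuals match explicit refits")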


Copyright information

© 2014 Springer Science+Business Media New York

Cite this chapter

Chernousova, E., Razin, N., Krasotkina, O., Mottl, V., Windridge, D. (2014). Linear Regression via Elastic Net: Non-enumerative Leave-One-Out Verification of Feature Selection. In: Aleskerov, F., Goldengorin, B., Pardalos, P. (eds) Clusters, Orders, and Trees: Methods and Applications. Springer Optimization and Its Applications, vol 92. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-0742-7_22
