Abstract
We address the issue of variable preselection in high-dimensional penalized regression, such as the lasso, a commonly used approach to variable selection and prediction in genomics. Preselection, which starts the analysis from a manageable set of covariates, is becoming increasingly necessary to enable advanced analysis of the huge data sets created by high-throughput technologies. Preselecting the features to include in multivariate analyses on the basis of simple univariate ranking is a natural strategy that has often been implemented despite its potential bias. We demonstrate this bias and propose a way to correct it. Starting with a sequential implementation of the lasso with increasing lists of predictors, we exploit a property of the corresponding set of cross-validation curves, a pattern that we call "freezing". The ranking of the predictors to be included sequentially is based on simple measures of association with the outcome, which can be precomputed efficiently for ultra-high-dimensional data sets, externally to the penalized regression implementation. We demonstrate by simulation that, in the vast majority of cases, our sequential approach offers a safe and efficient way of focusing the lasso analysis on a smaller, manageable number of predictors. In situations where the lasso performs well, we typically need less than 20 % of the variables to recover the same solution as with the full set of variables. We illustrate the applicability of our strategy in the context of a genome-wide association study and on microarray genomic data, where we need just 2.5 % and 13 % of the variables respectively. Finally we include an example in which 260 million gene-gene interactions are ranked, and we are able to recover the lasso solution using only 1 % of these. Freezing offers great potential for extending the applicability of penalized regression to current and upcoming ultra-high-dimensional problems in bioinformatics. Its applicability is not limited to the standard lasso but is a generic property of many penalized approaches.
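The univariate ranking that drives the preselection can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the toy data and the function name `rank_by_marginal_correlation` are invented for the example. The point is that the ranking is a single pass of matrix arithmetic, so it can be run on ultra-high-dimensional data outside any penalized-regression code.

```python
import numpy as np

def rank_by_marginal_correlation(X, y):
    """Rank the columns of X by absolute marginal correlation with y."""
    Xc = X - X.mean(axis=0)                 # centre each covariate
    yc = y - y.mean()
    num = Xc.T @ yc                         # <x_j, y> for every j at once
    den = np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    return np.argsort(-np.abs(num / den))   # column indices, strongest first

# toy data: 3 true signals hidden among 2000 covariates
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2000))
beta = np.zeros(2000)
beta[:3] = 3.0
y = X @ beta + rng.standard_normal(100)

order = rank_by_marginal_correlation(X, y)
C_p = order[:100]   # first preselected set, to be grown sequentially
```

In a sequential analysis, the next preselected sets are simply longer prefixes of `order`, so the ranking is computed once and reused.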
Ingrid K. Glad and Sylvia Richardson contributed equally.
Acknowledgements
This research was supported by grant number 204664 from the Norwegian Research Council (NRC) and by Statistics for Innovation (sfi)², a centre for research-based innovation funded by NRC. SR and LCB spent a research period in Paris at Inserm UMRS937, and SR has an adjunct position at (sfi)². IA was funded by a grant from the Agence Nationale de la Recherche (ANR Maladies neurologiques et maladies psychiatriques) as part of a project on the relation between Parkinson's disease and genes involved in the metabolism and transport of xenobiotics (PI: Alexis Elbaz, Inserm), for which access to GWAS data was obtained through dbGaP; this work used in part data from the NINDS dbGaP database from the CIDR:NGRC PARKINSONS DISEASE STUDY (Accession: phs000196.v2.p1). Sjur Reppe at Ullevaal University Hospital provided the bone biopsy data.
Appendices
Appendix 1
Proof 1 (Proof of (3a) and (3b))
Fix \(\lambda\), and drop it from the notation for simplicity. Let
\[
f_{C_{p}}(\boldsymbol{\beta }_{C_{p}}) =\sum _{i=1}^{n}\Big(y_{i} -\sum _{j\in C_{p}}\beta _{j}x_{ij}\Big)^{2} +\lambda \sum _{j\in C_{p}}\vert \beta _{j}\vert,
\]
and similarly for \(f_{C_{F}}(\boldsymbol{\beta }_{C_{F}})\). Given \(\boldsymbol{\beta }_{C_{p}}\), we can form the vector in \(\mathbb{R}^{P}\) with \(\vert C_{F}\setminus C_{p}\vert \) zeros as \((\boldsymbol{\beta }_{C_{p}},\boldsymbol{\beta }_{C_{F}\setminus C_{p}} =\boldsymbol{ 0})\). For such a vector it holds that
\[
f_{C_{F}}\big((\boldsymbol{\beta }_{C_{p}},\boldsymbol{0})\big) = f_{C_{p}}(\boldsymbol{\beta }_{C_{p}}).\tag{6}
\]
Next we show that the nonzero components of
\[
\hat{\boldsymbol{\beta }}_{C_{p}} =\mathop{\mathrm{argmin}}_{\boldsymbol{\beta }_{C_{p}}}f_{C_{p}}(\boldsymbol{\beta }_{C_{p}})
\]
are the same as the nonzero components of
\[
\hat{\boldsymbol{\beta }}_{C_{F}} =\mathop{\mathrm{argmin}}_{\boldsymbol{\beta }_{C_{F}}}f_{C_{F}}(\boldsymbol{\beta }_{C_{F}})
\]
when \(S_{F} \subseteq C_{p}\). In fact, we first have
\[
f_{C_{p}}(\hat{\boldsymbol{\beta }}_{C_{p}}) =\min _{\boldsymbol{\beta }_{C_{p}}}f_{C_{p}}(\boldsymbol{\beta }_{C_{p}}).
\]
Now we add some zero coefficients, such that
\[
f_{C_{p}}(\hat{\boldsymbol{\beta }}_{C_{p}}) = f_{C_{F}}\big((\hat{\boldsymbol{\beta }}_{C_{p}},\boldsymbol{0})\big)
\]
by (6). Hence
\[
(\hat{\boldsymbol{\beta }}_{C_{p}},\boldsymbol{0}) =\mathop{\mathrm{argmin}}_{\boldsymbol{\beta }_{C_{F}}:\,\boldsymbol{\beta }_{C_{F}\setminus C_{p}}=\boldsymbol{0}}f_{C_{F}}(\boldsymbol{\beta }_{C_{F}}).\tag{7}
\]
When we minimize \(f_{C_{F}}(\boldsymbol{\beta }_{C_{F}})\) without constraints, we know that, because \(S_{F} \subseteq C_{p}\), the solution satisfies \(\hat{\boldsymbol{\beta }}_{C_{F}\setminus C_{p}} =\boldsymbol{ 0}\). Hence we can drop the constraint \(\boldsymbol{\beta }_{C_{F}\setminus C_{p}} =\boldsymbol{ 0}\) in (7) and minimize over \(\boldsymbol{\beta }_{C_{F}\setminus C_{p}}\) as well, without making any difference. We obtain that the nonzero components of \(\hat{\boldsymbol{\beta }}_{C_{p}}\) are the same as the nonzero components of \(\hat{\boldsymbol{\beta }}_{C_{F}}\). Let
\[
S_{F} =\{ j:\hat{\beta }_{j,C_{F}}\neq 0\}.
\]
Then for \(j \in S_{F}\), \(\hat{\beta }_{j,C_{F}}\neq 0\). Therefore, since the nonzero components of \(\hat{\boldsymbol{\beta }}_{C_{F}}\) are the same as those of \(\hat{\boldsymbol{\beta }}_{C_{p}}\) when \(S_{F} \subseteq C_{p}\), also \(\hat{\beta }_{j,C_{p}}\neq 0\). The opposite is also true: if \(\hat{\beta }_{j,C_{p}}\neq 0\), then \(j \in S_{F}\) and \(\hat{\beta }_{j,C_{F}}\neq 0\). Similarly for \(j\notin S_{F}\). This proves that
(a) \(S_{p}(\lambda ) = S_{F}(\lambda )\quad \forall p \geq p_{0}(\lambda );\)

(b) \(\hat{\beta }_{j,C_{p}}(\lambda ) =\hat{\beta }_{j,C_{F}}(\lambda )\quad \forall p \geq p_{0}(\lambda ),\ \forall j.\)
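Statements (3a) and (3b) can be checked numerically. Below is a small sketch using scikit-learn's coordinate-descent `Lasso` on invented toy data (an illustration, not the paper's code): solving the lasso on any preselected set \(C_p\) that contains the full active set \(S_F\) recovers the full solution.

```python
import numpy as np
from sklearn.linear_model import Lasso

# toy data with three true signals
rng = np.random.default_rng(1)
n, P = 80, 40
X = rng.standard_normal((n, P))
beta = np.zeros(P)
beta[[0, 3, 7]] = [2.0, -1.5, 1.0]
y = X @ beta + 0.5 * rng.standard_normal(n)

def lasso_coef(Z, lam):
    """Lasso coefficients at a fixed penalty (intercept-free for simplicity)."""
    return Lasso(alpha=lam, fit_intercept=False,
                 tol=1e-12, max_iter=500_000).fit(Z, y).coef_

lam = 0.1
coef_full = lasso_coef(X, lam)           # lasso on the full set C_F
S_F = np.flatnonzero(coef_full)          # active set of the full solution

# any preselected set C_p that contains S_F ...
C_p = np.union1d(S_F, np.arange(15))
coef_sub = lasso_coef(X[:, C_p], lam)    # lasso restricted to C_p

# ... yields the same solution: (3a) same active set, (3b) same coefficients
padded = np.zeros(P)
padded[C_p] = coef_sub

def support(c):
    return np.abs(c) > 1e-6

same_support = bool(np.array_equal(support(padded), support(coef_full)))
max_diff = float(np.abs(padded - coef_full).max())
```

Note that scikit-learn scales the squared-error term by \(1/(2n)\); the restricted-versus-full argument above is unaffected by this rescaling of the objective.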
Proof 2 (Proof of (4) and (5))
For fixed \(\lambda\) we have that if
\[
p \geq p_{0,k}(\lambda )\quad \text{for all folds }k,
\]
then
\[
\hat{y}_{i,C_{p}}^{-k}(\lambda ) =\hat{ y}_{i,C_{F}}^{-k}(\lambda )\quad \forall i,\ \forall k.
\]
By (3a) and (3b) it follows that for all \(p_{2} > p_{1} \geq p_{0,k}(\lambda )\) and \(\forall k\)
\[
S_{p_{1}}^{-k}(\lambda ) = S_{p_{2}}^{-k}(\lambda ) = S_{F}^{-k}(\lambda )\tag{8}
\]
and
\[
\hat{\beta }_{j,C_{p_{1}}}^{-k}(\lambda ) =\hat{\beta }_{j,C_{p_{2}}}^{-k}(\lambda ) =\hat{\beta }_{j,C_{F}}^{-k}(\lambda )\quad \forall j.\tag{9}
\]
Then
\[
\hat{y}_{i,C_{p_{1}}}^{-k}(\lambda ) =\sum _{j\in S_{p_{1}}^{-k}(\lambda )}\hat{\beta }_{j,C_{p_{1}}}^{-k}(\lambda )x_{ij} +\sum _{j\in C_{p_{1}}\setminus S_{p_{1}}^{-k}(\lambda )}\hat{\beta }_{j,C_{p_{1}}}^{-k}(\lambda )x_{ij}\tag{10}
\]
\[
=\sum _{j\in S_{F}^{-k}(\lambda )}\hat{\beta }_{j,C_{p_{1}}}^{-k}(\lambda )x_{ij}\tag{11}
\]
\[
=\sum _{j\in S_{F}^{-k}(\lambda )}\hat{\beta }_{j,C_{F}}^{-k}(\lambda )x_{ij} =\hat{ y}_{i,C_{F}}^{-k}(\lambda ),\tag{12}
\]
because the last term in (10) is zero and the two last equalities, in (11) and (12), follow from (8) and (9) respectively. Similarly we have
\[
\hat{y}_{i,C_{p_{2}}}^{-k}(\lambda ) =\hat{ y}_{i,C_{F}}^{-k}(\lambda ),
\]
so that \(\hat{y}_{i,C_{p_{ 1}}}^{-k}(\lambda ) =\hat{ y}_{i,C_{p_{ 2}}}^{-k}(\lambda )\) holds \(\forall i\). Finally this implies \(CV _{C_{p_{ 1}}}(\lambda ) = CV _{C_{p_{ 2}}}(\lambda ) = CV _{C_{F}}(\lambda )\) and hereby (5).
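The conclusion of Proof 2, that the cross-validation curves for a preselected set and for the full set coincide once the fold-wise active sets are contained in the preselected set, can be illustrated as follows. This is a simulation sketch with invented toy data; the \(\lambda\) grid is restricted to its large-\(\lambda\) end, where freezing sets in first, and the preselected set is an arbitrary prefix containing the signals.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold

# toy data: three signals sitting inside the first 30 covariates
rng = np.random.default_rng(2)
n, P = 100, 60
X = rng.standard_normal((n, P))
beta = np.zeros(P)
beta[:3] = [2.0, -1.5, 1.0]
y = X @ beta + 0.5 * rng.standard_normal(n)

def cv_curve(cols, lambdas, n_folds=5):
    """CV_{C_p}(lambda): cross-validated squared error of the lasso on X[:, cols]."""
    errs = np.zeros(len(lambdas))
    folds = KFold(n_folds, shuffle=True, random_state=0)  # same folds for every C_p
    for train, test in folds.split(X):
        for i, lam in enumerate(lambdas):
            m = Lasso(alpha=lam, fit_intercept=False,
                      tol=1e-10, max_iter=200_000)
            m.fit(X[np.ix_(train, cols)], y[train])
            pred = X[np.ix_(test, cols)] @ m.coef_
            errs[i] += ((y[test] - pred) ** 2).sum()
    return errs / n

lambdas = np.array([0.5, 0.3])                 # large-lambda end of the grid
curve_sub = cv_curve(np.arange(30), lambdas)   # preselected set containing the signals
curve_full = cv_curve(np.arange(P), lambdas)
frozen = bool(np.allclose(curve_sub, curve_full, atol=1e-5))
```

Using the same fold assignment for every candidate set is essential here; otherwise the two curves would differ for reasons unrelated to freezing.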
Appendix 2
We collect here some further arguments which lead to the reordering in Part 2 of our algorithm. Consider two consecutive cross-validation curves for \(C_{p_{m}}\) and \(C_{p_{m+1}}\), and assume that the two curves coincide in an interval \(\tilde{\varLambda }= [\tilde{\lambda },\lambda _{max}]\) which includes a minimum at \(\lambda _{p_{m}}^{{\ast}} >\tilde{\lambda }\). Part 1 of our algorithm would stop with \(p_{m}\) variables and return the solution \(S_{p_{m}}(\lambda _{p_{m}}^{{\ast}})\). By definition of freezing, if \(S_{p_{m+1}}^{-k}(\lambda ) \subseteq C_{p_{m}}\) for all folds \(k\) and for all \(\lambda \in \tilde{\varLambda }\), then the two curves for \(C_{p_{m}}\) and \(C_{p_{m+1}}\) are identical at \(\lambda _{p_{m}}^{{\ast}}\) and at all other values of \(\lambda \in \tilde{\varLambda }\). Nevertheless \(S_{F}(\lambda ^{{\ast}})\) might not be included in \(C_{p_{m}}\), and hence \(S_{p_{m}}(\lambda _{p_{m}}^{{\ast}})\) is not the correct solution for the full data set. If, on the contrary, some variables active in the cross-validation for \(C_{p_{m+1}}\) are not in \(C_{p_{m}}\), that is, \(S_{p_{m+1}}^{-k}(\lambda _{p_{m}}^{{\ast}})\not\subset C_{p_{m}}\) for some fold \(k\), then the two curves do not coincide down to and beyond \(\lambda _{p_{m}}^{{\ast}}\), and hence the algorithm does not erroneously stop. Therefore the sequence of preselected sets should be such that, while waiting for \(S_{F}(\lambda ^{{\ast}})\) to be included in a \(C_{p_{m}}\) (at which point the curves cannot change any more at the minimum), the new active set \(S_{p_{m+1}}(\lambda _{p_{m}}^{{\ast}})\) at the current minimum \(\lambda _{p_{m}}^{{\ast}}\) typically includes some new variables which were not in the previous set \(C_{p_{m}}\). This leads to the idea of sequential reordering once we have found a first "local point of freezing".

Part 2 of our algorithm follows this line and greedily constructs the next set \(C_{p_{m+1}}\) by introducing new variables which have a high chance of being in \(S_{p_{m+1}}(\lambda _{p_{m}}^{{\ast}})\). This is done by reordering the unused variables based on the residuals \(\boldsymbol{r}\), computed using the selected variables \(S_{p_{m}}(\lambda _{p_{m}}^{{\ast}})\).
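A minimal sketch of this residual-based reordering; all inputs (`X`, `y`, the current set `C_pm`, and the current minimum `lam_star`) are invented toy quantities, and the code illustrates the idea rather than reproducing the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import Lasso

# toy setting: four signals, current preselected set = first 50 columns
rng = np.random.default_rng(3)
n, P = 100, 500
X = rng.standard_normal((n, P))
beta = np.zeros(P)
beta[:4] = 1.5
y = X @ beta + rng.standard_normal(n)

C_pm = np.arange(50)          # current preselected set C_{p_m}
lam_star = 0.2                # current cross-validated minimum (assumed given)

# fit the lasso on the current set and take residuals from its selection
m = Lasso(alpha=lam_star, fit_intercept=False, max_iter=100_000).fit(X[:, C_pm], y)
r = y - X[:, C_pm] @ m.coef_

# reorder the unused variables by |correlation with the residuals| ...
unused = np.setdiff1d(np.arange(P), C_pm)
Xu = X[:, unused] - X[:, unused].mean(axis=0)
score = np.abs(Xu.T @ (r - r.mean()))

# ... and grow the next set C_{p_{m+1}} with the most promising newcomers
C_pm1 = np.concatenate([C_pm, unused[np.argsort(-score)[:50]]])
```

The scoring step is the same cheap matrix pass as the initial marginal ranking, only applied to the lasso residuals instead of the raw outcome.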
Appendix 3
Further details from the simulation studies are summarized here. First, we consider the linear regression model as described in the main manuscript, while results from experiments using a logistic regression model are reported thereafter.
3.1 Linear Regression Model
The results are reported for Scenarios A, B and D, with data generated as described in Sect. 3 in the main manuscript. We investigate how many variables are needed to avoid the preselection bias.

Comparing Scenarios A, B and D, we see that freezing can be very useful not only in situations with no correlation among the covariates, but also when the covariates are correlated. The results are quite similar, with a small advantage when the covariates are generated independently, possibly because the marginal correlation ranking captures the true nonzero coefficients earlier when there is little correlation among the covariates.

For all three scenarios, we observe that when models with less noise are considered (SNR ≈ 2), there are practically no situations in which the lasso selects fewer than 20 variables. When SNR ≈ 0.5 there are more situations where the cross-validation curves have well-defined minima, leading to a smaller number of selected variables; hence our approach offers a greater advantage in these situations. For example, in the situations where the lasso selects fewer than 80 variables, the largest gain is observed when SNR ≈ 0.5, where the average percentage of data needed to recover the optimal solution is no more than 15 %, 19 % and 15 % for the three scenarios respectively.
Scenario A (Table 4 and Fig. 7)
Scenario B (Table 5 and Fig. 8)
Scenario D (Table 6 and Fig. 9)
3.2 Logistic Regression Model
Finally, we perform one experiment with 100 replications and a binary response. For simplicity, we only consider the covariate matrix generated as in Scenario B, with P = 10,000. Results are summarized in Table 7 and Fig. 10. Here we also observe situations where the cross-validated optimal solution is not well-defined and the lasso selects very many variables. Nevertheless, in 57 out of 100 experiments the curves are frozen down to and below the minimum at \(\lambda ^{{\ast}}\) using less than 50 % of the data. In several cases this happens already with 20–30 % of the data.
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this paper
Bergersen, L.C., Ahmed, I., Frigessi, A., Glad, I.K., Richardson, S. (2016). Preselection in Lasso-Type Analysis for Ultra-High Dimensional Genomic Exploration. In: Frigessi, A., Bühlmann, P., Glad, I., Langaas, M., Richardson, S., Vannucci, M. (eds) Statistical Analysis for High-Dimensional Data. Abel Symposia, vol 11. Springer, Cham. https://doi.org/10.1007/978-3-319-27099-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27097-5
Online ISBN: 978-3-319-27099-9