
Preselection in Lasso-Type Analysis for Ultra-High Dimensional Genomic Exploration

Conference paper in Statistical Analysis for High-Dimensional Data, part of the book series Abel Symposia (vol. 11).

Abstract

We address the issue of variable preselection in high-dimensional penalized regression, such as the lasso, a commonly used approach to variable selection and prediction in genomics. Preselection, that is, starting from a manageable set of covariates, is becoming increasingly necessary to enable advanced analysis tasks to be carried out on the huge data sets created by high-throughput technologies. Preselecting the features to include in multivariate analyses on the basis of simple univariate ranking is a natural strategy that has often been implemented despite its potential bias. We demonstrate this bias and propose a way to correct it. Starting with a sequential implementation of the lasso on increasing lists of predictors, we exploit a property of the corresponding set of cross-validation curves, a pattern that we call “freezing”. The ranking of the predictors to be included sequentially is based on simple measures of association with the outcome, which can be pre-computed efficiently for ultra-high dimensional data sets externally to the penalized regression implementation. We demonstrate by simulation that, in the vast majority of cases, our sequential approach provides a safe and efficient way of focusing the lasso analysis on a smaller and manageable number of predictors. In situations where the lasso performs well, we typically need less than 20 % of the variables to recover the same solution as when using the full set of variables. We illustrate the applicability of our strategy in the context of a genome-wide association study and on microarray genomic data, where we need just 2.5 % and 13 % of the variables respectively. Finally, we include an example where 260 million gene-gene interactions are ranked and we are able to recover the lasso solution using only 1 % of these. Freezing offers great potential for extending the applicability of penalized regressions to current and upcoming ultra-high dimensional problems in bioinformatics. Its applicability is not limited to the standard lasso but is a generic property of many penalized approaches.
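
To make the strategy concrete, here is a minimal sketch (ours, not the authors' implementation) of sequential lasso fitting on nested candidate sets with a simple freezing check. It assumes scikit-learn's LassoCV and synthetic data; the grid of candidate sizes and the comparison tolerance are illustrative choices only.

```python
# Minimal sketch (not the authors' code): rank predictors by marginal
# correlation with the outcome, run cross-validated lasso on increasing
# candidate sets C_p, and stop once the CV curve "freezes", i.e. no longer
# changes from lambda_max down to its minimum when more variables are added.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, P = 100, 2000
X = rng.standard_normal((n, P))
beta = np.zeros(P)
beta[:10] = 1.0                                   # 10 true signals
y = X @ beta + rng.standard_normal(n)

# Univariate ranking by absolute correlation with y (cheap, pre-computable
# outside the penalized-regression routine, even for huge P).
Xc, yc = X - X.mean(0), y - y.mean()
score = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
order = np.argsort(-score)

alphas = np.logspace(-2, 0, 50)                   # common penalty grid
prev_curve, p_prev = None, None
for p in (50, 100, 200, 400, 800):                # nested candidate sets C_p
    fit = LassoCV(alphas=alphas, cv=10, max_iter=50_000).fit(X[:, order[:p]], y)
    curve = fit.mse_path_.mean(axis=1)            # CV curve on the common grid
    down_to_min = fit.alphas_ >= fit.alpha_       # from lambda_max down to the minimum
    if prev_curve is not None and np.allclose(curve[down_to_min],
                                              prev_curve[down_to_min]):
        print(f"CV curve frozen: p = {p_prev} already recovers lambda* = {fit.alpha_:.3f}")
        break
    prev_curve, p_prev = curve, p
```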

Ingrid K. Glad and Sylvia Richardson contributed equally to this work.




Acknowledgements

This research was supported by grant number 204664 from the Norwegian Research Council (NRC) and by Statistics for Innovation (sfi)2, a centre for research-based innovation funded by NRC. SR and LCB spent a research period in Paris at Inserm UMRS937, and SR has an adjunct position at (sfi)2. IA was funded by a grant from the Agence Nationale de la Recherche (ANR Maladies neurologiques et maladies psychiatriques) as part of a project on the relation between Parkinson’s disease and genes involved in the metabolism and transport of xenobiotics (PI: Alexis Elbaz, Inserm), for which access to GWAS data was obtained through dbGaP; this work utilized in part data from the NINDS dbGaP database from the CIDR:NGRC PARKINSONS DISEASE STUDY (Accession: phs000196.v2.p1). Sjur Reppe at Ullevaal University Hospital provided the bone biopsy data.

Author information

Correspondence to Ingrid K. Glad.


Appendices

Appendix 1

Proof 1 (Proof of (3a) and (3b))

Fix \(\lambda\), and drop it from the notation for simplicity. Let

$$\displaystyle{f_{C_{p}}(\boldsymbol{\beta}_{C_{p}}) = \sum_{i=1}^{n}\Big(y_{i} - \sum_{j\in C_{p}} x_{ij}\,\beta_{j,C_{p}}\Big)^{2} + \lambda \sum_{j\in C_{p}} \vert\beta_{j,C_{p}}\vert,}$$

and similarly for \(f_{C_{F}}(\boldsymbol{\beta }_{C_{F}})\). Given \(\boldsymbol{\beta }_{C_{p}}\), we can form a vector in \(\mathbb{R}^{P}\) with \(\vert C_{F}\setminus C_{p}\vert\) zeros as \((\boldsymbol{\beta }_{C_{p}},\boldsymbol{\beta }_{C_{F}\setminus C_{p}} = \boldsymbol{0})\). For such a vector it holds that

$$\displaystyle{ f_{C_{p}}(\boldsymbol{\beta }_{C_{p}}) = f_{C_{F}}(\boldsymbol{\beta }_{C_{p}},\boldsymbol{\beta }_{C_{F}\setminus C_{p}} =\boldsymbol{ 0}). }$$
(6)

Next we show that the nonzero components of

$$\displaystyle{\mathop{\arg \!\min }\limits_{\boldsymbol{\beta }_{C_{p}}}f_{C_{p}}(\boldsymbol{\beta }_{C_{p}})}$$

are the same as the nonzero components of

$$\displaystyle{\mathop{\arg \!\min }\limits_{\boldsymbol{\beta }_{C_{F}}}f_{C_{F}}(\boldsymbol{\beta }_{C_{F}})}$$

when \(S_{F} \subseteq C_{p}\). In fact, because \(S_{F} \subseteq C_{p}\), we first have

$$\displaystyle{\mathop{\arg \!\min }\limits_{\boldsymbol{\beta }_{C_{p}}}f_{C_{p}}(\boldsymbol{\beta }_{C_{p}}) =\mathop{\arg \!\min }\limits_{\boldsymbol{\beta } _{S_{F}},\boldsymbol{\beta }_{C_{p}\setminus S_{F}}}f_{C_{p}}(\boldsymbol{\beta }_{S_{F}},\boldsymbol{\beta }_{C_{p}\setminus S_{F}}).}$$

Now we add some zero coefficients, such that

$$\displaystyle{f_{C_{p}}(\boldsymbol{\beta }_{S_{F}},\boldsymbol{\beta }_{C_{p}\setminus S_{F}}) = f_{C_{F}}(\boldsymbol{\beta }_{S_{F}},\boldsymbol{\beta }_{C_{p}\setminus S_{F}},\boldsymbol{\beta }_{C_{F}\setminus C_{p}} =\boldsymbol{ 0})}$$

by (6). Hence

$$\displaystyle{ \mathop{\arg \!\min }\limits_{\boldsymbol{\beta }_{C_{p}}}f_{C_{p}}(\boldsymbol{\beta }_{C_{p}}) =\mathop{\arg \!\min }\limits_{\boldsymbol{\beta } _{S_{F}},\boldsymbol{\beta }_{C_{p}\setminus S_{F}}}f_{C_{F}}(\boldsymbol{\beta }_{S_{F}},\boldsymbol{\beta }_{C_{p}\setminus S_{F}},\boldsymbol{\beta }_{C_{F}\setminus C_{p}} =\boldsymbol{ 0}). }$$
(7)

When we minimize \(f_{C_{F}}(\boldsymbol{\beta }_{C_{F}})\), the solution satisfies \(\hat{\boldsymbol{\beta }}_{C_{F}\setminus C_{p}} = \boldsymbol{0}\): by definition of \(S_{F}\), all coefficients of the full solution outside \(S_{F}\) are zero, and \(S_{F} \subseteq C_{p}\). Hence we can drop the constraint \(\boldsymbol{\beta }_{C_{F}\setminus C_{p}} = \boldsymbol{0}\) in (7) and also minimize over \(\boldsymbol{\beta }_{C_{F}\setminus C_{p}}\), without changing the solution. We obtain that the nonzero components of

$$\displaystyle{\mathop{\arg \!\min }\limits_{\boldsymbol{\beta }_{C_{p}}}f_{C_{p}}(\boldsymbol{\beta }_{C_{p}})}$$

are the same as the nonzero components of

$$\displaystyle{\mathop{\arg \!\min }\limits_{\boldsymbol{\beta }_{S_{F}},\boldsymbol{\beta }_{C_{p}\setminus S_{F}},\boldsymbol{\beta }_{C_{F}\setminus C_{p}}}f_{C_{F}}(\boldsymbol{\beta }_{S_{F}},\boldsymbol{\beta }_{C_{p}\setminus S_{F}},\boldsymbol{\beta }_{C_{F}\setminus C_{p}}) =\mathop{\arg \!\min }\limits_{\boldsymbol{\beta } _{C_{F}}}f_{C_{F}}(\boldsymbol{\beta }_{C_{F}}).}$$

Let

$$\displaystyle{\hat{\boldsymbol{\beta }}_{C_{F}} =\mathop{\arg \!\min }\limits_{\boldsymbol{\beta } _{C_{F}}}f_{C_{F}}(\boldsymbol{\beta }_{C_{F}}).}$$

Then for \(j \in S_{F}\), \(\hat{\boldsymbol{\beta }}_{j,C_{F}}\neq 0\). Therefore, since the nonzero components of

$$\displaystyle{\mathop{\arg \!\min }\limits_{\boldsymbol{\beta }_{C_{p}}}f_{C_{p}}(\boldsymbol{\beta }_{C_{p}})}$$

are the same as the nonzero components of

$$\displaystyle{\mathop{\arg \!\min }\limits_{\boldsymbol{\beta }_{C_{F}}}f_{C_{F}}(\boldsymbol{\beta }_{C_{F}})}$$

when \(S_{F} \subseteq C_{p}\), also \(\hat{\boldsymbol{\beta }}_{j,C_{p}}\neq 0\). The converse also holds: if \(\hat{\boldsymbol{\beta }}_{j,C_{p}}\neq 0\), then \(j \in S_{F}\) and \(\hat{\boldsymbol{\beta }}_{j,C_{F}}\neq 0\). The case \(j \notin S_{F}\) is handled similarly. This proves that

  1. (a)

    \(S_{p}(\lambda ) = S_{F}(\lambda )\quad \forall p \geq p_{0}(\lambda ),\)

  2. (b)

    \(\hat{\beta }_{j,C_{p}}(\lambda ) =\hat{\beta } _{j,C_{F}}(\lambda ),\quad \forall p \geq p_{0}(\lambda ),\forall j.\)
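
The equalities (3a) and (3b) are easy to check numerically. The following sketch (ours, not from the paper) verifies, up to solver tolerance, that a lasso fit restricted to a candidate set \(C_{p}\) containing the full active set \(S_{F}\) reproduces the full-data solution; it assumes scikit-learn's Lasso, whose penalty parameter is a rescaling of the \(\lambda\) above.

```python
# Numerical illustration (our sketch) of (3a)-(3b): for a fixed penalty, if
# the candidate set C_p contains the active set S_F of the full fit, the
# lasso on C_p has the same nonzero coefficients with the same values.
# sklearn's Lasso minimizes (1/(2n))||y - Xb||^2 + alpha*||b||_1, i.e. a
# rescaling of f above (lambda = 2*n*alpha), which leaves the property intact.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, P = 80, 300
X = rng.standard_normal((n, P))
y = X[:, :5] @ np.ones(5) + rng.standard_normal(n)

alpha = 0.1
full = Lasso(alpha=alpha, max_iter=100_000, tol=1e-8).fit(X, y)
S_F = np.flatnonzero(full.coef_)                  # active set S_F on the full set C_F

C_p = np.union1d(S_F, np.arange(60))              # a candidate set with S_F ⊆ C_p
sub = Lasso(alpha=alpha, max_iter=100_000, tol=1e-8).fit(X[:, C_p], y)

# (3a): same active set; (3b): identical coefficient values (up to tolerance).
assert set(C_p[np.flatnonzero(sub.coef_)]) == set(S_F)
assert np.allclose(sub.coef_[np.isin(C_p, S_F)], full.coef_[S_F], atol=1e-4)
```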

Proof 2 (Proof of (4) and (5))

For fixed \(\lambda\) we have that if

$$\displaystyle{p_{1} \geq p_{0,cv}(\lambda ) =\max _{k=1,\ldots,K}p_{0,k}(\lambda ),}$$

then

$$\displaystyle{p_{1} \geq p_{0,k}(\lambda )\quad \forall k.}$$

By (3a) and (3b) it follows that for all \(p_{2} > p_{1} \geq p_{0,k}(\lambda )\) and \(\forall k\)

$$\displaystyle{ S_{p_{1}}^{-k}(\lambda ) = S_{ p_{2}}^{-k}(\lambda ) = S_{ F}^{-k}(\lambda ) }$$
(8)

and

$$\displaystyle{ \hat{\beta }_{j,C_{p_{ 1}}}^{-k}(\lambda ) =\hat{\beta }_{ j,C_{p_{2}}}^{-k}(\lambda ),\quad \forall j \in S_{ F}^{-k}(\lambda ). }$$
(9)

Then

$$\displaystyle\begin{array}{rcl}
\hat{y}_{i,C_{p_{2}}}^{-k}(\lambda) & = & \sum_{j\in C_{p_{2}}} x_{ij}\,\hat{\beta}_{j,C_{p_{2}}}^{-k}(\lambda) \\
 & = & \sum_{j\in S_{p_{2}}^{-k}(\lambda)} x_{ij}\,\hat{\beta}_{j,C_{p_{2}}}^{-k}(\lambda) + \sum_{j\notin S_{p_{2}}^{-k}(\lambda)} x_{ij}\,\hat{\beta}_{j,C_{p_{2}}}^{-k}(\lambda) \qquad (10)\\
 & = & \sum_{j\in S_{p_{2}}^{-k}(\lambda)} x_{ij}\,\hat{\beta}_{j,C_{p_{2}}}^{-k}(\lambda) \\
 & = & \sum_{j\in S_{p_{1}}^{-k}(\lambda)} x_{ij}\,\hat{\beta}_{j,C_{p_{2}}}^{-k}(\lambda) \qquad (11)\\
 & = & \sum_{j\in S_{p_{1}}^{-k}(\lambda)} x_{ij}\,\hat{\beta}_{j,C_{p_{1}}}^{-k}(\lambda), \qquad (12)
\end{array}$$

because the last term in (10) is zero, and the last two equalities, (11) and (12), follow from (8) and (9) respectively. Similarly we have

$$\displaystyle{\hat{y}_{i,C_{p_{ 1}}}^{-k}(\lambda ) =\sum _{ j\in S_{p_{1}}^{-k}(\lambda )}x_{ij}\hat{\beta }_{j,C_{p_{1}}}^{-k}(\lambda ),}$$

so that \(\hat{y}_{i,C_{p_{1}}}^{-k}(\lambda ) = \hat{y}_{i,C_{p_{2}}}^{-k}(\lambda )\) holds \(\forall i\). Finally, this implies \(CV_{C_{p_{1}}}(\lambda ) = CV_{C_{p_{2}}}(\lambda ) = CV_{C_{F}}(\lambda )\), and hence (5).
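
The fold-wise argument can likewise be illustrated numerically. The sketch below (ours) builds a candidate set containing every fold-wise active set at a fixed penalty and checks that the fold-wise out-of-fold predictions, and hence the cross-validation errors, coincide with those of the full variable set; scikit-learn's Lasso and KFold are assumed, and the penalty value is an arbitrary illustrative choice.

```python
# Sketch (ours) of the mechanism behind (4)-(5): for a fixed penalty, if each
# fold-wise active set of the full-variable fit is contained in C_{p1}, then
# the fold-wise predictions, and therefore CV_{C_{p1}} and CV_{C_F}, coincide.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold

rng = np.random.default_rng(2)
n, P = 90, 400
X = rng.standard_normal((n, P))
y = X[:, :5] @ np.ones(5) + rng.standard_normal(n)

alpha, K = 0.15, 5
folds = list(KFold(n_splits=K, shuffle=True, random_state=0).split(X))

# First pass: collect a candidate set C_{p1} containing every fold-wise
# active set S_F^{-k}(lambda) of the full-variable fits.
C_p1 = set()
for tr, te in folds:
    m = Lasso(alpha=alpha, max_iter=100_000).fit(X[tr], y[tr])
    C_p1 |= set(np.flatnonzero(m.coef_))
C_p1 = np.array(sorted(C_p1))

# Second pass: fold-wise out-of-fold predictions (and hence CV errors)
# computed from C_{p1} coincide with those from the full set C_F.
cv_sub, cv_full = [], []
for tr, te in folds:
    full = Lasso(alpha=alpha, max_iter=100_000).fit(X[tr], y[tr])
    sub = Lasso(alpha=alpha, max_iter=100_000).fit(X[tr][:, C_p1], y[tr])
    cv_full.append(np.mean((y[te] - full.predict(X[te])) ** 2))
    cv_sub.append(np.mean((y[te] - sub.predict(X[te][:, C_p1])) ** 2))

assert np.allclose(cv_sub, cv_full, atol=1e-4)    # CV_{C_{p1}}(lambda) = CV_{C_F}(lambda)
```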

Appendix 2

We collect here some further arguments which lead to the reordering step in Part 2 of our algorithm. Consider two consecutive cross-validation curves, for \(C_{p_{m}}\) and \(C_{p_{m+1}}\), and assume that the two curves coincide in an interval \(\tilde{\varLambda }= [\tilde{\lambda },\lambda _{max}]\) which includes a minimum at \(\lambda _{p_{m}}^{{\ast}} >\tilde{\lambda }\). Part 1 of our algorithm would then stop with \(p_{m}\) variables and return the solution \(S_{p_{m}}(\lambda _{p_{m}}^{{\ast}})\). By definition of freezing, if \(S_{p_{m+1}}^{-k}(\lambda ) \subseteq C_{p_{m}}\) for all folds k and for all \(\lambda \in \tilde{\varLambda }\), then the two curves for \(C_{p_{m}}\) and \(C_{p_{m+1}}\) are identical at \(\lambda _{p_{m}}^{{\ast}}\) and at all other values of \(\lambda \in \tilde{\varLambda }\). Nevertheless, \(S_{F}(\lambda ^{{\ast}})\) might not be included in \(C_{p_{m}}\), and hence \(S_{p_{m}}(\lambda _{p_{m}}^{{\ast}})\) is not necessarily the correct solution for the full data set. If, on the contrary, some variables that are active in the cross-validation for \(C_{p_{m+1}}\) are not in \(C_{p_{m}}\), that is, if \(S_{p_{m+1}}^{-k}(\lambda _{p_{m}}^{{\ast}})\not\subset C_{p_{m}}\) for some fold k, then the two curves do not coincide down to and beyond \(\lambda _{p_{m}}^{{\ast}}\), and hence the algorithm does not erroneously stop.

Therefore the sequence of preselected sets should be such that, while waiting for \(S_{F}(\lambda ^{{\ast}})\) to be included in a \(C_{p_{m}}\) (at which point the curves can no longer change at the minimum), the new active set \(S_{p_{m+1}}(\lambda _{p_{m}}^{{\ast}})\) at the current minimum \(\lambda _{p_{m}}^{{\ast}}\) typically includes some new variables which were not in the previous set \(C_{p_{m}}\). This leads to the idea of sequential reordering once we have found a first “local point of freezing”. Part 2 of our algorithm follows this line and greedily constructs the next set \(C_{p_{m+1}}\) by introducing new variables that have a high chance of being in \(S_{p_{m+1}}(\lambda _{p_{m}}^{{\ast}})\). This is done by reordering the unused variables based on the residuals \(\boldsymbol{r}\), computed using the selected variables \(S_{p_{m}}(\lambda _{p_{m}}^{{\ast}})\).
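
This appendix describes the reordering step only in words; the following sketch shows one way it could be implemented (our reading, not the authors' code), assuming scikit-learn's Lasso. The function name reorder_by_residuals and the parameter n_new are hypothetical, introduced purely for illustration.

```python
# Hypothetical sketch of the Part 2 reordering step: after a local point of
# freezing, re-rank the not-yet-included variables by their association with
# the residuals of the current lasso fit and append the top ones to form
# the next candidate set C_{p_{m+1}}.
import numpy as np
from sklearn.linear_model import Lasso

def reorder_by_residuals(X, y, current_idx, lam, n_new=100):
    """Build the next candidate set after a local point of freezing."""
    n = X.shape[0]
    # sklearn parameterization: alpha = lambda / (2n) for the objective in Appendix 1.
    fit = Lasso(alpha=lam / (2 * n), max_iter=100_000).fit(X[:, current_idx], y)
    r = y - fit.predict(X[:, current_idx])     # residuals from the currently selected model
    unused = np.setdiff1d(np.arange(X.shape[1]), current_idx)
    # Re-rank the unused variables by absolute association with the residuals.
    Xu = X[:, unused] - X[:, unused].mean(0)
    score = np.abs(Xu.T @ (r - r.mean()))
    new = unused[np.argsort(-score)][:n_new]   # most residual-associated variables first
    return np.concatenate([current_idx, new])
```

Appending the variables most associated with the current residuals gives them a high chance of entering \(S_{p_{m+1}}(\lambda _{p_{m}}^{{\ast}})\), which is what prevents the algorithm from stopping at a false local point of freezing.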

Appendix 3

Further details from the simulation studies are summarized here. First, we consider the linear regression model as described in the main manuscript, while results from experiments using a logistic regression model are reported thereafter.

3.1 Linear Regression Model

The results are reported for Scenarios A, B, and D, with data generated as described in Sect. 3 of the main manuscript. We investigate how many variables are needed to avoid the preselection bias.

Comparing Scenarios A, B, and D, we see that freezing can be very useful not only when there is no correlation among the covariates, but also when the covariates are correlated. The results are quite similar, with a small advantage when the covariates are generated independently, possibly because the marginal correlation ranking captures the true nonzero coefficients earlier when there is little correlation among the covariates.

For all three scenarios, we observe that when less noisy models are considered (SNR ≈ 2), there are practically no situations in which the lasso selects fewer than 20 variables. When SNR ≈ 0.5, there are more situations where the cross-validation curves have well-defined minima, leading to a smaller number of selected variables; hence our approach brings a greater advantage in these situations. For example, for the situations where the lasso selects fewer than 80 variables, the largest gain is observed when SNR ≈ 0.5, where the average percentage of variables needed to recover the optimal solution is no more than 15 %, 19 % and 15 % for the three scenarios respectively.

Scenario A (Table 4 and Fig. 7)

Table 4 Results from 100 experiments of Scenario A

Scenario B (Table 5 and Fig. 8)

Table 5 Results from 100 experiments of Scenario B

Scenario D (Table 6 and Fig. 9)

Table 6 Results from 100 experiments of Scenario D

3.2 Logistic Regression Model

Finally, we run one experiment with 100 replications using a binary response. For simplicity, in this experiment we only consider the covariate matrix generated as in Scenario B, with P = 10,000. Results are summarized in Table 7 and Fig. 10. Here we also observe situations where the cross-validated optimal solution is not well defined and the lasso selects very many variables. Nevertheless, in 57 out of 100 experiments the curves are frozen down to and below the minimum at \(\lambda ^{{\ast}}\) using less than 50 % of the variables, and in several cases this already happens with 20–30 % of the variables.

Table 7 Results from 100 experiments with continuous correlated features in a logistic regression model with a binary response
Fig. 10

Plot of the number of nonzero regression coefficients in the lasso vs. \(\tilde{\tilde{p}}\), where \(\tilde{\tilde{p}}\) is the smallest p for which the curves are frozen at the optimal \(\lambda\), denoted \(\hat{\lambda }\), in the logistic regression example. The results are reported for 100 replications, and situations where the lasso selects few ( ≤ 20), many (between 20 and 80) or very many ( ≥ 80) variables are indicated by the grey vertical lines.


Copyright information

© 2016 Springer International Publishing Switzerland

Cite this paper

Bergersen, L.C., Ahmed, I., Frigessi, A., Glad, I.K., Richardson, S. (2016). Preselection in Lasso-Type Analysis for Ultra-High Dimensional Genomic Exploration. In: Frigessi, A., Bühlmann, P., Glad, I., Langaas, M., Richardson, S., Vannucci, M. (eds) Statistical Analysis for High-Dimensional Data. Abel Symposia, vol 11. Springer, Cham. https://doi.org/10.1007/978-3-319-27099-9_3
