
Variable selection in regression - estimation, prediction, sparsity, inference

Chapter in High-Dimensional Data Analysis in Cancer Research



Author information


Corresponding author

Correspondence to Jaroslaw Harezlak.



Copyright information

© 2009 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Harezlak, J., Tchetgen, E., Li, X. (2009). Variable selection in regression - estimation, prediction, sparsity, inference. In: Li, X., Xu, R. (eds) High-Dimensional Data Analysis in Cancer Research. Applied Bioinformatics and Biostatistics in Cancer Research. Springer, New York, NY. https://doi.org/10.1007/978-0-387-69765-9_2

