Variable Selection and Feature Screening
This chapter provides a selective review of feature screening methods for ultra-high dimensional data. The main idea of feature screening is to reduce the ultra-high dimensionality of the feature space to a moderate size quickly and efficiently while retaining all the important features in the reduced feature space; the property of retaining all important features is referred to as the sure screening property. After feature screening, more sophisticated methods can be applied to the reduced feature space for further analysis, such as parameter estimation and statistical inference. This chapter focuses only on the feature screening stage. In terms of data types, we review feature screening methods for independent and identically distributed data, longitudinal data, and survival data. In terms of modeling, we review methods for a variety of models, including the linear model, generalized linear model, additive model, varying-coefficient model, and Cox model. We also cover some model-free feature screening procedures.
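As a rough illustration of the screening idea for the linear model case, the sketch below ranks features by the absolute value of their marginal Pearson correlation with the response and keeps the top d. This is a minimal example of marginal correlation screening under simulated data, not the procedure of any particular method reviewed in the chapter. The function name `marginal_correlation_screen`, the simulated design, and the choice d = n / log(n) are assumptions made for illustration.

```python
import numpy as np


def marginal_correlation_screen(X, y, d):
    """Keep the d features with the largest absolute marginal correlation with y.

    Hypothetical helper for illustration; returns sorted column indices.
    """
    Xc = X - X.mean(axis=0)          # center each feature
    yc = y - y.mean()                # center the response
    x_norm = np.linalg.norm(Xc, axis=0)
    x_norm[x_norm == 0] = np.inf     # constant columns get correlation 0
    corr = (Xc.T @ yc) / (x_norm * np.linalg.norm(yc))
    order = np.argsort(-np.abs(corr))  # strongest marginal signal first
    return np.sort(order[:d])


# Simulated example (assumed setup): p is much larger than n,
# and only the first three features affect the response.
rng = np.random.default_rng(0)
n, p = 200, 5000
X = rng.standard_normal((n, p))
active = np.array([0, 1, 2])
y = X[:, active] @ np.array([3.0, -3.0, 3.0]) + rng.standard_normal(n)

d = int(n / np.log(n))               # assumed heuristic for the reduced size
kept = marginal_correlation_screen(X, y, d)
print(f"reduced {p} features to {len(kept)}")
print("all active features retained:", set(active) <= set(kept))
```

The screening step costs a single pass over the columns of X, which is what makes it practical when p is far larger than n. A more sophisticated method, such as a penalized regression, could then be fitted on `X[:, kept]` alone.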
This work was supported by NSF grant DMS 1820702 and NIDA, NIH grant P50 DA039838. The content is solely the responsibility of the authors and does not necessarily represent the official views of NSF, NIH, or NIDA.