Abstract
In survival analysis, a number of regression models can be used to estimate the effects of covariates on a censored survival outcome. When covariates can be naturally grouped, group selection is important in these models. Motivated by the group bridge approach for variable selection in a multiple linear regression model, we consider group selection in a semiparametric accelerated failure time (AFT) model using Stute’s weighted least squares and a group bridge penalty. This method simultaneously carries out feature selection at both the group level and the within-group individual variable level, and it enjoys the oracle group selection property. Simulation studies indicate that the group bridge approach for the AFT model can correctly identify important groups and variables even when the censoring rate is high. A real data analysis is provided to illustrate the application of the proposed method.
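As a rough illustration of the kind of criterion the abstract describes, the sketch below combines Kaplan-Meier (Stute) weights with a group bridge penalty in a single objective function. It is a minimal sketch under stated assumptions, not the chapter's implementation: the function names, the group weights c_j = |A_j|^(1-gamma), the default gamma = 0.5, and the omission of an intercept (data assumed centered) are all choices made here for illustration.

```python
import numpy as np

def stute_weights(log_time, delta):
    """Kaplan-Meier (Stute) weights, returned in the original observation order.

    log_time: observed (possibly censored) log survival times
    delta:    1 if the event was observed, 0 if censored
    """
    log_time = np.asarray(log_time, dtype=float)
    delta = np.asarray(delta, dtype=float)
    n = len(log_time)
    # Sort by time; on ties, events are placed before censored observations.
    order = np.lexsort((-delta, log_time))
    d = delta[order]
    ranks = np.arange(1, n + 1)
    # Factor ((n - j) / (n - j + 1)) ** d_j for each sorted position j.
    factors = ((n - ranks) / (n - ranks + 1.0)) ** d
    # Product of the factors over all positions strictly before i.
    prod_before = np.concatenate(([1.0], np.cumprod(factors)[:-1]))
    w_sorted = d / (n - ranks + 1.0) * prod_before
    w = np.empty(n)
    w[order] = w_sorted
    return w

def group_bridge_objective(beta, X, log_time, delta, groups, lam, gamma=0.5):
    """Weighted least squares loss plus a group bridge penalty.

    groups: list of index arrays, one per covariate group (hypothetical layout)
    lam:    penalty level; gamma in (0, 1) is the bridge exponent
    """
    beta = np.asarray(beta, dtype=float)
    y = np.asarray(log_time, dtype=float)
    w = stute_weights(y, delta)
    resid = y - X @ beta
    loss = 0.5 * np.sum(w * resid ** 2)
    # Assumed group weights c_j = |A_j| ** (1 - gamma).
    penalty = sum(
        len(idx) ** (1.0 - gamma) * np.sum(np.abs(beta[idx])) ** gamma
        for idx in groups
    )
    return loss + lam * penalty
```

Censored observations receive weight zero, and their mass is passed on to later uncensored observations, so with no censoring every weight reduces to 1/n. Because the penalty raises each group's L1 norm to a power below one, a minimizer can set an entire group to zero while still zeroing individual coefficients inside a retained group.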
References
Bakin S (1999) Adaptive regression and model selection in data mining problems. The Australian National University
Breheny P, Huang J (2009) Penalized methods for bi-level variable selection. Stat Interface 2:369–380
Buckley J, James I (1979) Linear regression with censored data. Biometrika 66:429–436
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360
Fan J, Lv J (2008) Sure independence screening for ultrahigh dimensional feature space. J R Stat Soc: Ser B (Stat Methodol) 70(5):849–911
Fleming TR, Harrington DP (2011) Counting processes and survival analysis, vol 169. Wiley, New York
Fu WJ (1998) Penalized regressions: the bridge versus the lasso. J Comput Graph Stat 7:397–416
Fygenson M, Ritov Y (1994) Monotone estimating equations for censored data. Ann Stat 22:732–746
Heller G (2007) Smoothed rank regression with censored data. J Am Stat Assoc 102(478):552–559
Huang J, Ma S, Xie H (2006) Regularized estimation in the accelerated failure time model with high-dimensional covariates. Biometrics 62:813–820
Huang J, Ma S, Xie H, Zhang CH (2009) A group bridge approach for variable selection. Biometrika 96:339–355
Huang J, Ma S (2010) Variable selection in the accelerated failure time model via the bridge method. Lifetime Data Anal 16:176–195
Huang J, Liu L, Liu Y, Zhao X (2014) Group selection in the Cox model with a diverging number of covariates. Stat Sinica 24:1787–1810
Leng C, Lin Y, Wahba G (2006) A note on the lasso and related procedures in model selection. Stat Sinica 16:1273–1284
Lv J, Fan Y (2009) A unified approach to model selection and sparse recovery using regularized least squares. Ann Stat 37:3498–3528
Ma S, Huang J (2007) Clustering threshold gradient descent regularization: with applications to microarray studies. Bioinformatics 23:466–472
Ma S, Du P (2012) Variable selection in partly linear regression model with diverging dimensions for right censored data. Stat Sinica 22:1003–1020
Meier L, Van De Geer S, Bühlmann P (2008) The group lasso for logistic regression. J R Stat Soc: Ser B (Stat Methodol) 70:53–71
Prentice RL (1978) Linear rank tests with right censored data. Biometrika 65:167–179
Ritov Y (1990) Estimation in a linear regression model with censored data. Ann Stat 18:303–328
Stute W (1993) Almost sure representations of the product-limit estimator for truncated data. Ann Stat 21:146–156
Stute W (1996) Distributional convergence under random censorship when covariables are present. Scand J Stat 23:461–471
Stute W, Wang JL (1993) The strong law under random censorship. Ann Stat 21:1591–1607
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc: Ser B (Methodol) 58:267–288
Tsiatis AA (1990) Estimating regression parameters using linear rank tests for censored data. Ann Stat 18:354–372
Wang L, Chen G, Li H (2007) Group SCAD regression analysis for microarray time course gene expression data. Bioinformatics 23:1486–1494
Wang S, Nan B, Zhu N, Zhu J (2009) Hierarchically penalized Cox regression with grouped variables. Biometrika 96:307–322
Wei LJ (1992) The accelerated failure time model: a useful alternative to the Cox regression model in survival analysis. Stat Med 11:1871–1879
Ying Z (1993) A large sample study of rank estimation for censored regression data. Ann Stat 21:76–99
Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. J R Stat Soc: Ser B (Stat Methodol) 68:49–67
Zhang CH, Huang J (2008) The sparsity and bias of the Lasso selection in high-dimensional linear regression. Ann Stat 36:1567–1594
Zhao P, Rocha G, Yu B (2009) The composite absolute penalties family for grouped and hierarchical variable selection. Ann Stat 37:3468–3497
Zhang CH (2010) Nearly unbiased variable selection under minimax concave penalty. Ann Stat 38:894–942
Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc: Ser B (Stat Methodol) 67:301–320
Copyright information
© 2016 Springer Science+Business Media Singapore
About this chapter
Cite this chapter
Huang, L., Kopciuk, K., Lu, X. (2016). Group Selection in Semiparametric Accelerated Failure Time Model. In: Chen, DG., Chen, J., Lu, X., Yi, G., Yu, H. (eds) Advanced Statistical Methods in Data Science. ICSA Book Series in Statistics. Springer, Singapore. https://doi.org/10.1007/978-981-10-2594-5_5
DOI: https://doi.org/10.1007/978-981-10-2594-5_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-2593-8
Online ISBN: 978-981-10-2594-5
eBook Packages: Mathematics and Statistics (R0)