Abstract
In the omics era, it is well recognized that, for complex traits and outcomes, the interactions between genetic and environmental factors (i.e., G×E interactions) have important implications beyond the main effects. Most existing interaction analyses have focused on continuous and categorical traits. Prognosis is of essential importance for complex diseases. However, because they are considerably more complex, prognosis outcomes have been studied less. In existing interaction analyses of prognosis outcomes, the most common practice is to fit marginal (semi)parametric models (for example, the Cox model) using likelihood-based estimation and then identify important interactions based on significance level. Such an approach has limitations. First, data contamination is not uncommon. With likelihood-based estimation, even a single contaminated observation can result in severely biased estimates and misleading conclusions. Second, when the sample size is not large, the significance-based approach may not be reliable. To overcome these limitations, in this study we adopt quantile-based estimation, which is robust to data contamination. Two techniques are adopted to accommodate right censoring. For identifying important interactions, we adopt penalization as an alternative to significance level. An efficient computational algorithm is developed. Simulation shows that the proposed method can significantly outperform the alternative. We analyze a lung cancer prognosis study with gene expression measurements.
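The robustness claim can be illustrated with a minimal sketch. It is not the chapter's estimator: there is no censoring, no covariates, and no penalty, and the function names and toy data are assumptions introduced only for illustration. The sketch fits a single constant by minimizing the quantile (check) loss at the median and compares it with the sample mean when one observation is contaminated.

```python
import statistics


def check_loss(residual, tau):
    """Quantile (check) loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def quantile_objective(theta, sample, tau):
    """Total check loss when every observation is predicted by the constant theta."""
    return sum(check_loss(y - theta, tau) for y in sample)


def best_constant(sample, tau):
    """Minimize the objective over the sample points, where a minimizer is attained."""
    return min(sorted(sample), key=lambda theta: quantile_objective(theta, sample, tau))


# Hypothetical toy data: the second sample differs by one contaminated value.
clean = [1.0, 2.0, 3.0, 4.0, 5.0]
contaminated = [1.0, 2.0, 3.0, 4.0, 500.0]

median_clean = best_constant(clean, 0.5)
median_contaminated = best_constant(contaminated, 0.5)
mean_clean = statistics.mean(clean)
mean_contaminated = statistics.mean(contaminated)

print("quantile fit (tau=0.5):", median_clean, "->", median_contaminated)
print("mean:", mean_clean, "->", mean_contaminated)
```

In this toy setting the check-loss fit stays at the same value after contamination, while the mean is pulled toward the outlier.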
Acknowledgements
We thank the organizers and participants of “The Fourth International Workshop on the Perspectives on High-dimensional Data Analysis.” The authors were supported by the China Postdoctoral Science Foundation (2014M550799), National Science Foundation of China (11401561), National Social Science Foundation of China (13CTJ001, 13&ZD148), National Institutes of Health (CA165923, CA191383, CA016359), and U.S. VA Cooperative Studies Program of the Department of Veterans Affairs, Office of Research and Development.
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Wang, G., Zhao, Y., Zhang, Q., Zang, Y., Zang, S., Ma, S. (2017). Identifying Gene–Environment Interactions Associated with Prognosis Using Penalized Quantile Regression. In: Ahmed, S. (eds) Big and Complex Data Analysis. Contributions to Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-41573-4_17
DOI: https://doi.org/10.1007/978-3-319-41573-4_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41572-7
Online ISBN: 978-3-319-41573-4
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)