Abstract
Many applications in modern biology measure a large number of genomic or proteomic covariates and aim to assess the impact of each of these covariates on a particular outcome of interest. In a study that follows a cohort of HIV-positive patients over time, for example, a researcher may genotype the virus infecting each patient to ascertain the presence or absence of a large number of mutations, in the hope of identifying mutations that affect how a patient’s plasma HIV RNA level (viral load) responds to a new drug regimen. Along with an estimate of the impact of each mutation on the time course of viral load, the researcher would generally like a measure of the statistical significance of each estimate, in order to identify the mutations most likely to be genuinely related to the outcome. Such information could then inform the choice of drugs to include in the regimen of a patient with a particular pattern of mutations.
Copyright information
© 2007 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Bembom, O., Petersen, M.L., van der Laan, M.J. (2007). Identifying Important Explanatory Variables for Time-Varying Outcomes. In: Dubitzky, W., Granzow, M., Berrar, D. (eds) Fundamentals of Data Mining in Genomics and Proteomics. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-47509-7_11
DOI: https://doi.org/10.1007/978-0-387-47509-7_11
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-47508-0
Online ISBN: 978-0-387-47509-7
eBook Packages: Biomedical and Life Sciences (R0)