Abstract
The number of potential variables included in a regression model is often too large, and a more parsimonious model may be preferable. Selection strategies are widely used, but there are few analytical results about their properties. To investigate problems such as replication stability, model complexity and selection bias, we use bootstrap and cross-validation methods. For stepwise strategies, we discuss the importance of the predefined selection level. The methods are illustrated by investigating prognostic factors for survival time of patients with malignant glioma in the framework of a Cox regression model.
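As a rough illustration of how the bootstrap can be used to examine replication stability, the following sketch repeats a selection procedure on bootstrap resamples and records how often each variable is retained. It is not the authors' implementation. For simplicity it assumes ordinary least squares on simulated data rather than a Cox model on the glioma data. The function names, the simulated coefficients, and the threshold `z_crit` (standing in for the predefined selection level, here roughly a two-sided 5% level) are all hypothetical.

```python
import numpy as np


def backward_eliminate(X, y, z_crit=1.96):
    """Backward elimination for OLS (illustrative stand-in for a Cox model).

    Repeatedly drops the variable with the smallest |coefficient / SE|
    until every remaining variable reaches z_crit. Returns the indices
    of the retained columns of X.
    """
    active = list(range(X.shape[1]))
    while active:
        Xa = np.column_stack([np.ones(len(y)), X[:, active]])
        beta, *_ = np.linalg.lstsq(Xa, y, rcond=None)
        resid = y - Xa @ beta
        dof = len(y) - Xa.shape[1]
        sigma2 = resid @ resid / dof
        cov = sigma2 * np.linalg.inv(Xa.T @ Xa)
        # Skip the intercept (position 0) when ranking variables.
        z = np.abs(beta[1:] / np.sqrt(np.diag(cov)[1:]))
        weakest = int(np.argmin(z))
        if z[weakest] >= z_crit:
            break
        active.pop(weakest)
    return active


def bootstrap_inclusion_frequencies(X, y, n_boot=200, z_crit=1.96, seed=0):
    """Fraction of bootstrap resamples in which each variable is selected."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        selected = backward_eliminate(X[idx], y[idx], z_crit)
        counts[np.asarray(selected, dtype=int)] += 1
    return counts / n_boot


# Simulated data (assumption): two real effects and two pure-noise variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200)

freq = bootstrap_inclusion_frequencies(X, y)
print(freq)
```

Variables with inclusion frequencies near 1 are selected stably across resamples, while variables that enter only occasionally are more likely to owe their selection to chance. Changing `z_crit` shows how the selection level affects the size of the selected model.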
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sauerbrei, W., Schumacher, M. (2000). Bootstrap and Cross-Validation to Assess Complexity of Data-Driven Regression Models. In: Brause, R.W., Hanisch, E. (eds) Medical Data Analysis. ISMDA 2000. Lecture Notes in Computer Science, vol 1933. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-39949-6_29
DOI: https://doi.org/10.1007/3-540-39949-6_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41089-8
Online ISBN: 978-3-540-39949-0
eBook Packages: Springer Book Archive