Summary
A new variant of cross-validation, called full cross-validation, is proposed to overcome some disadvantages of the traditional cross-validation approach in general regression situations. Both criteria may be regarded as estimates of the mean squared error of prediction. Under some assumptions, including normally distributed observations, the full cross-validation criterion is shown to outperform the cross-validation criterion. Analogous modifications may be applied to the generalized cross-validation method, providing a similar improvement. This leads to the recommendation to replace traditional cross-validation techniques with the new ones when estimating the prediction quality of models or of regression function estimators.
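The summary does not spell out either criterion, so the following is only a minimal sketch under stated assumptions. It assumes ordinary least squares, and it assumes that full cross-validation differs from leave-one-out cross-validation in that the i-th response is replaced by its fitted value before refitting, rather than the i-th observation being deleted. All function names are illustrative and do not come from the paper. For a linear fit with hat-matrix diagonal h_ii and residuals e_i, these assumptions give the closed forms CV = mean((e_i / (1 - h_ii))^2) and FCV = mean(((1 + h_ii) e_i)^2), which the sketch checks against brute-force refitting.

```python
import numpy as np

def fit_predict(X_train, y_train, X_new):
    """Ordinary least squares fit on the training data, prediction at X_new."""
    beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    return X_new @ beta

def cv_bruteforce(X, y):
    """Leave-one-out CV: delete observation i, refit, predict at x_i."""
    n = len(y)
    errs = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        errs[i] = y[i] - fit_predict(X[keep], y[keep], X[i])
    return np.mean(errs ** 2)

def fcv_bruteforce(X, y):
    """Full CV (assumed definition): replace y_i by its fitted value,
    refit on all n observations, predict at x_i."""
    n = len(y)
    y_hat = fit_predict(X, y, X)
    errs = np.empty(n)
    for i in range(n):
        y_sub = y.copy()
        y_sub[i] = y_hat[i]  # substitute instead of delete
        errs[i] = y[i] - fit_predict(X, y_sub, X[i])
    return np.mean(errs ** 2)

def cv_and_fcv(X, y):
    """Closed forms for a linear fit, using the hat-matrix diagonal h_ii."""
    H = X @ np.linalg.pinv(X)  # hat matrix of the least squares fit
    h = np.diag(H)
    resid = y - H @ y
    cv = np.mean((resid / (1.0 - h)) ** 2)
    fcv = np.mean(((1.0 + h) * resid) ** 2)
    return cv, fcv
```

Because 1 / (1 - h) >= 1 + h whenever 0 <= h < 1, the sketch's FCV value never exceeds its CV value on the same data. That is a property of these assumed formulas only, not a restatement of the paper's result.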
Copyright information
© 1996 Physica-Verlag Heidelberg
About this paper
Cite this paper
Droge, B. (1996). Some Comments on Cross-Validation. In: Härdle, W., Schimek, M.G. (eds) Statistical Theory and Computational Aspects of Smoothing. Contributions to Statistics. Physica-Verlag HD. https://doi.org/10.1007/978-3-642-48425-4_14
DOI: https://doi.org/10.1007/978-3-642-48425-4_14
Publisher Name: Physica-Verlag HD
Print ISBN: 978-3-7908-0930-5
Online ISBN: 978-3-642-48425-4
eBook Packages: Springer Book Archive