Application of Resampling Methods to the Choice of Dimension in Principal Component Analysis
This paper investigates the choice of dimension in Principal Component Analysis (PCA). PCA is introduced as a model, and a loss function assessing the stability of the fit is considered. The choice of dimension then amounts to minimising an expected loss, which has to be estimated. This is achieved by resampling methods: several bootstrap and jackknife estimates are presented. The behaviour of these estimates is investigated on artificial and real data. The resulting choices are compared with those given by naïve rules.
Keywords: Principal Component Analysis; Optimal Dimension; Bootstrap; Jackknife; Perturbation Theory.
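The idea summarised in the abstract, estimating a stability loss by resampling and comparing it across candidate dimensions, can be illustrated with a minimal sketch. The abstract does not state the loss function, so the code assumes one plausible choice: the distance between the orthogonal projector onto the first q principal axes of the sample and the same projector computed on a bootstrap resample, written as q - tr(P_q P_q*). The function names, the simulated data and the number of resamples are all hypothetical, and this is not claimed to be the estimator studied in the paper.

```python
import numpy as np


def principal_axes(X):
    """Rows are the principal axes of X, ordered by decreasing variance."""
    Xc = X - X.mean(axis=0)
    # The right singular vectors of the centred data are the principal axes.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt


def projector(axes, q):
    """Orthogonal projector onto the span of the first q axes."""
    V = axes[:q].T  # shape (p, q)
    return V @ V.T


def bootstrap_stability_risk(X, q_values, n_boot=200, seed=0):
    """Naive bootstrap estimate of the assumed loss q - tr(P_q P_q*).

    The loss is an assumption made for this sketch, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    sample_axes = principal_axes(X)
    P_hat = {q: projector(sample_axes, q) for q in q_values}
    totals = {q: 0.0 for q in q_values}
    for _ in range(n_boot):
        # Resample the observations (rows) with replacement.
        idx = rng.integers(0, n, size=n)
        boot_axes = principal_axes(X[idx])
        for q in q_values:
            P_star = projector(boot_axes, q)
            totals[q] += q - np.trace(P_hat[q] @ P_star)
    return {q: totals[q] / n_boot for q in q_values}


# Simulated data (hypothetical): two strong directions, four noise directions.
data_rng = np.random.default_rng(1)
scales = np.array([5.0, 5.0, 0.5, 0.5, 0.5, 0.5])
X = data_rng.standard_normal((200, 6)) * scales

risks = bootstrap_stability_risk(X, q_values=[1, 2, 3, 4, 5])
for q, r in risks.items():
    print(f"q = {q}: estimated risk = {r:.4f}")
```

In this simulated setting the two-dimensional subspace is well separated from the noise, so its estimated risk should be small, whereas subspaces that cut through a group of nearly equal eigenvalues are poorly determined and should show a larger risk. A dimension would then be chosen among the candidates with low estimated risk.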