Application of Resampling Methods to the Choice of Dimension in Principal Component Analysis

  • Ph. Besse
  • A. de Falguerolles
Conference paper
Part of the Statistics and Computing book series (SCO)

Abstract

This paper investigates the choice of dimension in Principal Component Analysis (PCA). PCA is introduced as a model, and a loss function assessing the stability of the fit is considered. The choice of dimension then amounts to minimising an expected loss, which has to be estimated. This is achieved by resampling methods: different bootstrap and jackknife estimates are presented. The behaviour of these estimates is investigated on artificial data and on real data. The resulting choices are compared with those given by naïve rules.
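As an illustration of the idea summarised above, here is a minimal Python sketch of a bootstrap estimate of a stability loss for each candidate dimension q. The abstract does not spell out the loss, so the one used here, q - tr(P_q P*_q) with P_q the orthogonal projector onto the first q principal axes, is an assumption chosen for illustration. The function names (`projector`, `bootstrap_loss`, `choose_dimension`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def projector(X, q):
    """Orthogonal projector onto the first q principal axes of X (rows = observations)."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centred data are the principal axes.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    V = vt[:q].T
    return V @ V.T

def bootstrap_loss(X, q, n_boot=200, seed=0):
    """Bootstrap estimate of the assumed stability loss q - tr(P_q P*_q)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    P_hat = projector(X, q)
    losses = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        P_star = projector(X[idx], q)
        losses.append(q - np.trace(P_hat @ P_star))
    return float(np.mean(losses))

def choose_dimension(X, n_boot=200, seed=0):
    """Dimension in 1..p-1 with the smallest estimated loss (illustrative rule)."""
    p = X.shape[1]
    candidates = range(1, p)   # q = p is excluded: its loss is trivially zero
    return min(candidates, key=lambda q: bootstrap_loss(X, q, n_boot, seed))
```

The loss is zero when q equals the number of variables, because the projector is then the identity, so the search is restricted to 1 ≤ q < p. A jackknife variant would replace the resampling loop with leave-one-out subsamples; it is not shown here.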

Keywords

Principal Component Analysis; Optimal Dimension; Bootstrap; Jackknife; Perturbation Theory

Copyright information

© Springer-Verlag Berlin Heidelberg 1993

Authors and Affiliations

  • Ph. Besse (1)
  • A. de Falguerolles (1)
  1. Laboratoire de Statistique et Probabilités, U.A. CNRS D0745, Université Paul Sabatier, Toulouse cedex, France
