Combining Linear Dimension Reduction Subspaces

  • Eero Liski
  • Klaus Nordhausen
  • Hannu Oja
  • Anne Ruiz-Gazen
Conference paper

Abstract

Dimensionality is a major concern in the analysis of large data sets. There are various well-known dimension reduction methods with different strengths and weaknesses. In practical situations it is difficult to decide which method to use, as different methods emphasize different structures in the data. As with ensemble methods in statistical learning, several dimension reduction methods can be combined using an extension of the Crone and Crosby distance, a weighted distance between subspaces that allows subspaces of different dimensions to be combined. Some natural choices of weights are considered in detail. Based on the weighted distance, we discuss the concept of averages of subspaces and how to combine various dimension reduction methods. The performance of the weighted distances and the combining approach is illustrated via simulations and a real data example.
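
The idea of a weighted distance between subspaces and of an average subspace can be illustrated with a minimal sketch. The abstract does not state the formulas, so everything below is an assumption: subspaces are represented by orthogonal projection matrices, the distance is taken to be the Frobenius norm of the difference of the weighted projections divided by sqrt(2), and the average of given rank k is taken to be the projection onto the top k eigenvectors of the weighted sum of projections, which minimizes the sum of squared distances under that assumed form. The function names `projector`, `weighted_distance` and `average_projector` and the example weight functions are hypothetical and are not taken from the paper or from the LDRTools package.

```python
import numpy as np


def projector(B):
    """Orthogonal projection onto the column space of B (p x k, full column rank)."""
    Q, _ = np.linalg.qr(np.asarray(B, dtype=float))
    return Q @ Q.T


def weighted_distance(P1, P2, weight=lambda k: 1.0):
    """Assumed form: ||w(k1) P1 - w(k2) P2||_F / sqrt(2), where k_i = rank(P_i)."""
    k1 = int(round(np.trace(P1)))  # the rank of a projection equals its trace
    k2 = int(round(np.trace(P2)))
    diff = weight(k1) * P1 - weight(k2) * P2
    return np.linalg.norm(diff, "fro") / np.sqrt(2.0)


def average_projector(projectors, k, weight=lambda k: 1.0):
    """Rank-k projection closest, in the assumed distance, to all given projections."""
    S = sum(weight(int(round(np.trace(P)))) * P for P in projectors)
    _, vecs = np.linalg.eigh(S)  # eigenvalues are returned in ascending order
    U = vecs[:, -k:]             # eigenvectors of the k largest eigenvalues
    return U @ U.T


# Illustrative subspaces of R^3 spanned by coordinate axes.
E = np.eye(3)
P12 = projector(E[:, [0, 1]])  # span(e1, e2)
P13 = projector(E[:, [0, 2]])  # span(e1, e3)
P1 = projector(E[:, [0]])      # span(e1)

d_same_dim = weighted_distance(P12, P13)
d_diff_dim = weighted_distance(P1, P12, weight=lambda k: 1.0 / k)
P_avg = average_projector([P12, P13], k=1)
```

With constant weights and equal dimensions the sketch reduces to sqrt(k - tr(P1 P2)). Passing a different `weight` function changes how subspaces of unequal dimension are compared, which is the kind of choice the abstract refers to.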

Keywords

Weight Function, Orthogonal Projection, Independent Component Analysis, Dimension Reduction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgments

The work of Klaus Nordhausen and Hannu Oja was supported by the Academy of Finland (grant 268703). The authors are grateful to the reviewers for their helpful comments.

References

  1. Cook RD, Weisberg S (1991) Sliced inverse regression for dimension reduction: comment. J Am Stat Assoc 86:328–332
  2. Crone LJ, Crosby DS (1995) Statistical applications of a metric on subspaces to satellite meteorology. Technometrics 37:324–328
  3. Croux C, Ruiz-Gazen A (2005) High breakdown estimators for principal components: the projection-pursuit approach revisited. J Multivar Anal 95:206–226
  4. Escoufier Y (1973) Le traitement des variables vectorielles. Biometrics 29:751–760
  5. Filzmoser P, Fritz H, Kalcher K (2012) pcaPP: Robust PCA by projection pursuit. R package version 1.9-47
  6. Friedman JH, Tukey JW (1974) A projection pursuit algorithm for exploratory data analysis. IEEE Trans Comput C 23:881–889
  7. Halbert K (2011) MMST: Datasets from MMST. R package version 0.6-1.1
  8. Hettmansperger TP, Randles RH (2002) A practical affine equivariant multivariate median. Biometrika 89:851–860
  9. Hotelling H (1936) Relations between two sets of variates. Biometrika 28:321–377
  10. Hyvärinen A (1999) Fast and robust fixed-point algorithms for independent component analysis. IEEE Trans Neural Netw 10:626–634
  11. Li KC (1991) Sliced inverse regression for dimension reduction. J Am Stat Assoc 86:316–327
  12. Li KC (1992) On principal Hessian directions for data visualization and dimension reduction: another application of Stein’s lemma. J Am Stat Assoc 87:1025–1039
  13. Liski E, Nordhausen K, Oja H (2014a) Supervised invariant coordinate selection. Stat: A J Theoret Appl Stat 48:711–731
  14. Liski E, Nordhausen K, Oja H, Ruiz-Gazen A (2014b) LDRTools: tools for linear dimension reduction. R package version 1
  15. Miettinen J, Nordhausen K, Oja H, Taskinen S (2014) Deflation-based FastICA with adaptive choices of nonlinearities. IEEE Trans Signal Process 62:5716–5724
  16. Nordhausen K, Oja H, Tyler DE (2008) Tools for exploring multivariate data: the package ICS. J Stat Softw 28(6):1–31
  17. Nordhausen K, Ilmonen P, Mandal A, Oja H, Ollila E (2011) Deflation-based FastICA reloaded. Proceedings of 19th European signal processing conference 2011 (EUSIPCO 2011), pp 1854–1858
  18. Nordhausen K, Oja H (2011) Multivariate L1 methods: the package MNM. J Stat Softw 43:1–28
  19. R Development Core Team (2012) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria
  20. Rodriguez-Martinez E, Goulermas JY, Mu T, Ralph JF (2010) Automatic induction of projection pursuit indices. IEEE Trans Neural Netw 21:1281–1295
  21. Rousseeuw P (1986) Multivariate estimation with high breakdown point. In: Grossman W, Pflug G, Vincze I, Wertz W (eds) Mathematical statistics and applications. Reidel, Dordrecht, pp 283–297
  22. Rousseeuw P, Croux C, Todorov V, Ruckstuhl A, Salibian-Barrera M, Verbeke T, Koller M, Maechler M (2012) Robustbase: basic robust statistics. R package version 0.9-2
  23. Ruiz-Gazen A, Berro A, Larabi Marie-Sainte S (2010) Detecting multivariate outliers using projection pursuit with particle swarm optimization. Compstat 2010:89–98
  24. Shaker AJ, Prendergast LA (2011) Iterative application of dimension reduction methods. Electron J Stat 5:1471–1494
  25. Tibshirani R (2013) Bootstrap: functions for the book “An introduction to the bootstrap”. R package version 2012.04-1
  26. Tyler DE (1987) A distribution-free M-estimator of multivariate scatter. Ann Stat 15:234–251
  27. Tyler DE, Critchley F, Dümbgen L, Oja H (2009) Invariant co-ordinate selection. J Roy Stat Soc 71:549–592
  28. Venables WN, Ripley BD (2002) Modern applied statistics with S, 4th edn. Springer, New York
  29. Weisberg S (2002) Dimension reduction regression in R. J Stat Softw 7:1–22
  30. Ye Z, Weiss RE (2003) Using the bootstrap to select one of a new class of dimension reduction methods. J Am Stat Assoc 98:968–979
  31. Zhou ZH (2012) Ensemble methods: foundations and algorithms. CRC Press, Boca Raton

Copyright information

© Springer India 2016

Authors and Affiliations

  • Eero Liski (1)
  • Klaus Nordhausen (1, 2)
  • Hannu Oja (2)
  • Anne Ruiz-Gazen (3)

  1. University of Tampere, Tampere, Finland
  2. University of Turku, Turku, Finland
  3. Toulouse School of Economics, Toulouse, France
