Combining Linear Dimension Reduction Subspaces
Dimensionality is a major concern in the analysis of large data sets. There are various well-known dimension reduction methods with different strengths and weaknesses. In practice it is difficult to decide which method to use, because different methods emphasize different structures in the data. As with ensemble methods in statistical learning, several dimension reduction methods can be combined. This is done using an extension of the Crone and Crosby distance: a weighted distance between subspaces that allows subspaces of different dimensions to be combined. Some natural choices of weights are considered in detail. Based on the weighted distance, we discuss the concept of averages of subspaces and how to combine various dimension reduction methods. The performance of the weighted distances and the combining approach is illustrated via simulations and a real data example.
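To make the idea of a weighted distance between subspaces concrete, the following is a minimal Python sketch. It is not the authors' implementation, and the abstract does not state the exact formulas, so several pieces are assumptions: each subspace is represented by its orthogonal projector, the distance is taken as the Frobenius norm of the difference of weighted projectors scaled by 1/sqrt(2), and the "average" subspace is read off from the leading eigenvectors of the weighted mean projector with a user-supplied dimension. The function names and the example weight functions are hypothetical.

```python
import numpy as np

def orthogonal_projector(basis):
    """Orthogonal projector onto the column space of `basis` (p x k).

    Assumes `basis` has full column rank.
    """
    q, _ = np.linalg.qr(basis)  # reduced QR: q is p x k with orthonormal columns
    return q @ q.T

def weighted_distance(p1, p2, weight=lambda k: 1.0):
    """Assumed form of a weighted projector distance.

    The dimension of each subspace is recovered as the trace of its projector.
    With weight(k) == 1 and equal dimensions this reduces to the scaled
    Frobenius distance between the two projectors.
    """
    k1 = int(round(np.trace(p1)))
    k2 = int(round(np.trace(p2)))
    diff = weight(k1) * p1 - weight(k2) * p2
    return np.linalg.norm(diff, "fro") / np.sqrt(2.0)

def average_projector(projectors, k, weight=lambda k: 1.0):
    """Projector onto the span of the top-k eigenvectors of the weighted mean.

    Simplifying assumption: the target dimension `k` is chosen by the caller.
    """
    mean = sum(weight(int(round(np.trace(p)))) * p for p in projectors)
    mean = mean / len(projectors)
    _, vecs = np.linalg.eigh(mean)  # eigenvalues in ascending order
    top = vecs[:, -k:]              # eigenvectors of the k largest eigenvalues
    return top @ top.T

# Two example weight functions (illustrative choices only).
w_const = lambda k: 1.0
w_inv_sqrt = lambda k: 1.0 / np.sqrt(k)

# Usage: a line and a plane in R^3 that contains that line.
line = orthogonal_projector(np.array([[1.0], [0.0], [0.0]]))
plane = orthogonal_projector(np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]))
d = weighted_distance(line, plane, w_inv_sqrt)
avg = average_projector([line, plane], k=1, weight=w_inv_sqrt)
```

In this sketch the weight function controls how much a higher-dimensional subspace counts relative to a lower-dimensional one, which is what makes subspaces of different dimensions comparable.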
Keywords: Weight Function, Orthogonal Projection, Independent Component Analysis, Dimension Reduction
The work of Klaus Nordhausen and Hannu Oja was supported by the Academy of Finland (grant 268703). The authors are grateful to the reviewers for their helpful comments.