Abstract
In multivariate analysis with p variables, a problem that often arises is the ambiguous nature of the correlation or covariance matrix. When p is moderately or very large, it is generally difficult to identify the true nature of the relationships among the variables, as well as among the observations, from the covariance or correlation matrix. In such situations, a very common way to simplify matters is to reduce the dimension by considering only those variables (actual or derived) that are truly responsible for the overall variation. Important and useful dimension reduction techniques include Principal Component Analysis (PCA), Factor Analysis, Multidimensional Scaling, and Independent Component Analysis (ICA). Among them, PCA is the most popular. One may look at this method in three different ways: as a method of transforming correlated variables into uncorrelated ones, as a method of finding linear combinations with relatively small or large variability, or as a tool for data reduction. The third view is more data oriented. PCA itself does not require any assumption about the underlying multivariate distribution, but if we are interested in inference problems related to PCA, then the assumption of multivariate normality is necessary. The eigenvalues and eigenvectors of the covariance or correlation matrix are the main ingredients of a PCA. The eigenvectors determine the directions of maximum variability, whereas the eigenvalues specify the variances along those directions. In practice, decisions regarding the quality of the principal component approximation should be made on the basis of the eigenvalue-eigenvector pairs.
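The role of the eigenvalue-eigenvector pairs described above can be illustrated with a minimal sketch. It is not the chapter's own implementation; the function name `pca_eig`, the `use_correlation` flag, and the synthetic data are assumptions introduced only for illustration. The sketch centers the data, forms the covariance (or correlation) matrix, and takes its eigendecomposition, so that the eigenvectors give the component directions and the eigenvalues give the variances of the resulting uncorrelated scores.

```python
import numpy as np


def pca_eig(X, use_correlation=False):
    """PCA via eigendecomposition of the covariance or correlation matrix.

    Hypothetical helper for illustration only.
    X is an (n observations) x (p variables) array.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)              # center each variable
    if use_correlation:
        # Standardizing first makes the covariance below a correlation matrix.
        Xc = Xc / X.std(axis=0, ddof=1)
    S = np.cov(Xc, rowvar=False)         # p x p symmetric matrix
    eigvals, eigvecs = np.linalg.eigh(S) # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]    # reorder to descending variance
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    scores = Xc @ eigvecs                # uncorrelated derived variables
    explained = eigvals / eigvals.sum()  # share of total variance per component
    return eigvals, eigvecs, scores, explained


# Synthetic example (assumed data): three variables, two of them strongly correlated.
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 2))
X = np.column_stack([z[:, 0], z[:, 0] + 0.1 * z[:, 1], z[:, 1]])

eigvals, eigvecs, scores, explained = pca_eig(X)
print("variances of the components:", eigvals)
print("proportion of variance explained:", explained)
```

For data reduction, one would typically keep only the leading columns of `scores` whose cumulative share in `explained` is judged sufficient; the cutoff is a modelling choice rather than something the sketch decides.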
© 2014 Springer Science+Business Media New York
About this chapter
Cite this chapter
Chattopadhyay, A.K., Chattopadhyay, T. (2014). Dimension Reduction and Clustering. In: Statistical Methods for Astronomical Data Analysis. Springer Series in Astrostatistics, vol 3. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-1507-1_7
DOI: https://doi.org/10.1007/978-1-4939-1507-1_7
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-1506-4
Online ISBN: 978-1-4939-1507-1
eBook Packages: Mathematics and Statistics (R0)