Abstract
In Chap. 4 we learned how to diagonalize a square matrix using the eigen decomposition. The eigen decomposition has many uses, but it has a limitation: it can be applied only to a square matrix. In this chapter, we will learn how to extend the decomposition to a rectangular matrix using a related method known as the singular value decomposition (SVD). Because of its flexibility and numerical accuracy, the SVD is arguably the most useful decomposition ever developed. In fact, it’s probably fair to say that if you were stuck on an island with only one tool for performing linear algebra, you’d want the SVD.
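To make the idea concrete, here is a minimal R sketch, using a made-up 4 × 2 matrix, showing that svd() accepts rectangular input and that the matrix can be rebuilt from the three factors the decomposition returns:

```r
# A made-up 4 x 2 (rectangular) matrix; any values would do.
A <- matrix(c(2, 0, 1, 3, 1, 1, 0, 2), nrow = 4)
s <- svd(A)                                 # A = U D V'
A_rebuilt <- s$u %*% diag(s$d) %*% t(s$v)   # rebuild A from its factors
all.equal(A, A_rebuilt)                     # TRUE, up to round-off
```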
Notes
- 1.
Later in this chapter we will see that it is neither necessary nor desirable to calculate the SVD from the eigen decomposition of A′A and AA′. Nevertheless, thinking of the decomposition in this way is a useful pedagogical device.
- 2.
The condition number determines how deformed a unit circle becomes following matrix multiplication. The larger the condition number, the more elongated the resulting ellipse.
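As a quick sketch of this idea (the 2 × 2 matrix below is invented for illustration), the 2-norm condition number is the ratio of the largest to the smallest singular value, which base R’s kappa() can confirm:

```r
A <- matrix(c(1, 1, 1, 1.001), nrow = 2)  # nearly singular, so badly conditioned
d <- svd(A)$d
d[1] / d[length(d)]                       # ratio of largest to smallest singular value
kappa(A, exact = TRUE)                    # base R reports the same condition number
```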
- 3.
Because of round-off error, the rank is usually found as the number of singular values greater than some specified level of tolerance (e.g., 1e-14).
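A minimal sketch of this rule, using a matrix that is rank-deficient by construction (its third column is the sum of the first two):

```r
A <- cbind(c(1, 2, 3), c(4, 5, 6), c(5, 7, 9))  # column 3 = column 1 + column 2
d <- svd(A)$d                                   # smallest singular value is ~1e-16, not 0
sum(d > 1e-14)                                  # numerical rank: 2
```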
- 4.
The pseudoinverse is sometimes called the generalized inverse or the Moore–Penrose inverse. In addition, the symbol used to denote it is not standard; other symbols, such as \( {A}^{+} \) and \( {A}^{\dagger} \), are used as well.
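A sketch of how the pseudoinverse can be built from the SVD by inverting only the nonzero singular values; the helper pinv() below is hypothetical (not from the text), and MASS::ginv() computes the same thing:

```r
pinv <- function(A, tol = 1e-14) {
  s <- svd(A)
  d_inv <- ifelse(s$d > tol, 1 / s$d, 0)       # invert only the nonzero singular values
  s$v %*% diag(d_inv, length(d_inv)) %*% t(s$u)
}
A <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)     # made-up 3 x 2 matrix, full column rank
all.equal(pinv(A) %*% A, diag(2))              # TRUE: pseudoinverse acts as a left inverse
```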
- 5.
- 6.
PCA is related to another statistical technique known as factor analysis. Whereas PCA focuses on variances, factor analysis focuses on covariances, positing that they are the result of unobserved, latent variables. Only PCA will be discussed in this text.
- 7.
Minimizing the squared distances from the points to the line is equivalent to maximizing the variance of their projections onto the line.
- 8.
The singular values have been adjusted for the sample size and squared to convert them to variances.
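A sketch of that conversion with random stand-in data: squaring the singular values of the deviate-centered data matrix and dividing by n − 1 reproduces the component variances reported by prcomp():

```r
set.seed(1)
X  <- matrix(rnorm(100 * 3), nrow = 100)       # random stand-in data
Xc <- scale(X, center = TRUE, scale = FALSE)   # deviate-centered matrix
d  <- svd(Xc)$d
d^2 / (nrow(X) - 1)                            # singular values converted to variances
prcomp(X)$sdev^2                               # prcomp() reports the same values
```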
- 9.
In some textbooks, total least squares is called orthogonal distance regression.
- 10.
In fact, instead of diagonalizing the covariance matrix, we apply the SVD to a deviate-centered rectangular matrix. Nonetheless, it is convenient to explain the analysis as an eigen decomposition of a symmetric covariance matrix.
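A sketch of this equivalence with random stand-in data: the right singular vectors of the deviate-centered matrix match the eigenvectors of the covariance matrix, up to sign:

```r
set.seed(2)
X  <- matrix(rnorm(50 * 3), nrow = 50)         # random stand-in data
Xc <- scale(X, center = TRUE, scale = FALSE)   # deviate-centered matrix
V  <- svd(Xc)$v                                # right singular vectors
E  <- eigen(cov(X))$vectors                    # eigenvectors of the covariance matrix
round(abs(V) - abs(E), 10)                     # all zeros: same directions, up to sign
```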
- 11.
Bootstrapping, which will be discussed in Chap. 7, can be used to calculate confidence intervals for the TLS coefficients.
- 12.
Standardizing the variables turns our covariance matrix into a correlation matrix. For that reason, Table 5.5 reports the correlations among the 8 variables.
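A one-line check of this fact with random stand-in data:

```r
set.seed(3)
X <- matrix(rnorm(40 * 4), nrow = 40)  # random stand-in data
all.equal(cov(scale(X)), cor(X))       # TRUE: covariance of standardized = correlation
```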
- 13.
The data in Table 5.4 come from a larger data set that can be accessed from various sources on the world wide web (e.g., http://statweb.stanford.edu/~owen/courses/202/Cereals.txt).
- 14.
Collinearity is sometimes called multicollinearity.
- 15.
The data in Table 5.3 aren’t real, but the NFL does gather data of this type each winter during its scouting combine. The combine was going on while I wrote this chapter, so I decided to use it as my example. I don’t know a whole lot about football, however, so don’t forget that the data are fictitious!
- 16.
The column labeled “Tolerance” in Table 5.7 indicates how independent each predictor is from all of the others. It can be found as the reciprocal of the corresponding diagonal entry of the inverted correlation matrix or, equivalently, as \( 1-{R}_j^2 \) when the jth predictor is regressed on the others.
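A sketch of both computations with random stand-in data (the variable names are arbitrary):

```r
set.seed(4)
X <- matrix(rnorm(100 * 3), nrow = 100)
colnames(X) <- c("x1", "x2", "x3")
tol_matrix <- 1 / diag(solve(cor(X)))          # reciprocal of diagonal of inverted correlation matrix
tol_regress <- sapply(1:3, function(j) {
  1 - summary(lm(X[, j] ~ X[, -j]))$r.squared  # 1 - R^2, predictor j regressed on the others
})
all.equal(unname(tol_matrix), tol_regress)     # TRUE: the two routes agree
```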
- 17.
Some researchers do not include the intercept when performing the analysis, but most do. In our example, excluding the intercept does not change the conclusions we draw from our (phony) data.
- 18.
Ideally, we would want the VPS scores to be (more or less) evenly spread among the first few singular values. The R code that accompanies this section will convince you that this is so, as it includes a function for computing the VPS with our data set, as well as one using an orthonormal matrix.
- 19.
It is customary to standardize the variables for a principal components regression analysis.
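A sketch of the standardized workflow with random stand-in data (keeping k = 2 components is an arbitrary choice for illustration):

```r
set.seed(5)
X <- matrix(rnorm(100 * 5), nrow = 100)         # random stand-in predictors
y <- rnorm(100)                                 # random stand-in outcome
pc  <- prcomp(X, center = TRUE, scale. = TRUE)  # PCA on standardized variables
k   <- 2
fit <- lm(y ~ pc$x[, 1:k])                      # regress y on the first k component scores
summary(fit)$coefficients
```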