Some inequalities contrasting principal component and factor analyses solutions
Abstract
Principal component analysis (PCA) and factor analysis (FA) are two time-honored dimension reduction methods. In this paper, some inequalities are presented to contrast the parameters’ estimates in PCA and FA. For this purpose, we take advantage of the recently established matrix decomposition (MD) formulation of FA. In summary, the resulting inequalities show that (1) FA gives a better fit to a data set than PCA, (2) PCA extracts a larger amount of common “information” than FA, and (3) for each variable, its unique variance in FA is larger than its residual variance in PCA minus the one in FA. The resulting inequalities can be useful to suggest whether PCA or FA should be used for a particular data set. The answers can also be valid for the classic FA formulation not relying on the MDFA definition, as both types of FA provide almost equal solutions. Additionally, the inequalities give a theoretical explanation of some empirically observed tendencies in PCA and FA solutions, e.g., that the absolute values of PCA loadings tend to be larger than those of FA loadings and that the unique variances in FA tend to be larger than the residual variances of PCA.
Keywords
Matrix decomposition, Dimension reduction, Common parts, Unique parts, Loadings, Residuals
1 Introduction
Principal component analysis (PCA) was conceived by Pearson (1901) and formulated by Hotelling (1933), who named the procedure PCA. On the other hand, factor analysis (FA) was proposed by Spearman (1904) and further developed into its modern form by Thurstone (1935). Both procedures are time-honored dimension reduction methods for an n-observations × q-variables column-centered data matrix X = [x_{1}, …, x_{q}]. Thus, PCA and FA are often applied to identical data sets (e.g., Adachi 2016; Jolliffe 2002). The resulting solutions are compared mathematically and numerically in this paper. Throughout the paper, n ≥ rank(X) = q is supposed, with rank(X) denoting the rank of X.
1. The common part for PCA is larger than that for FA;
2. The residual part for FA is smaller than that for PCA;
3. The unique part for FA is larger than the residual one for PCA.
Here, (1) and (2) always hold, while it is suggested that (3) is often observed. These assertions are proved in Sect. 3 in the form of several inequalities, after preliminary results are presented in Sect. 2. The theoretical results obtained in Sect. 3 are illustrated in Sects. 4 and 5. Throughout the paper, we distinguish a model parameter from its estimate by putting a hat on the former to express the latter, as found above.
2 Preliminary notes
In this section, the solutions of PCA and FA are described, followed by their rotational indeterminacy. This serves as preparation for the next section.
2.1 PCA solution
2.2 FA solution
2.3 Rotational indeterminacy
The rotational indeterminacy of the loading matrices in PCA and FA affects some of the results to be presented in the next section.
3 Results
In this section, we present four theorems, which help to contrast the PCA and FA solutions minimizing (5) subject to (6) and minimizing (8) under (3), respectively.
Theorem 1
Proof
The theorem suggests that for better fit to the data, FA should be preferred over PCA.
The next theorem concerns the largeness of squared loadings and common parts:
Theorem 2
Proof
Inequality (17) leads to (19), since \( \|\hat{\mathbf{F}}\hat{\mathbf{A}}^{\prime}\|^{2} = n\,\mathrm{tr}\,\hat{\mathbf{A}}\hat{\mathbf{A}}^{\prime} = n\|\hat{\mathbf{A}}\|^{2} \) and \( \|\hat{\mathbf{P}}\hat{\mathbf{C}}^{\prime}\|^{2} = n\,\mathrm{tr}\,\hat{\mathbf{C}}\hat{\mathbf{C}}^{\prime} = n\|\hat{\mathbf{C}}\|^{2} \) follow from (3) and (6), respectively. Obviously, (19) leads to (20). \( \square \)
This theorem shows that the common part in PCA is larger than in FA. Inequality (20) shows that the common part is larger in PCA solutions even after oblique rotation. On the other hand, T_{P} and T_{F} in (18) cannot be replaced by N_{P} and N_{F}. That is, after the oblique rotation, \( \|\hat{\mathbf{C}}\mathbf{N}_{\text{P}}^{\prime -1}\|^{2} \ge \|\hat{\mathbf{A}}\mathbf{N}_{\text{F}}^{\prime -1}\|^{2} \) does not necessarily hold.
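The contrast between orthogonal and oblique rotation can be checked numerically. The following is a minimal sketch, not the authors' procedure: `P` and `C` are random stand-ins rather than PCA estimates, and normalizing the columns of `N` to unit length is an assumption made only for illustration. It rotates the pair as PN and CN′⁻¹, so that the product, and hence the common part, is unchanged while the size of the loadings generally is not.

```python
import numpy as np

rng = np.random.default_rng(0)
n, q, m = 50, 6, 2

P = rng.standard_normal((n, m))   # stand-in for component scores
C = rng.standard_normal((q, m))   # stand-in for loadings

# Orthonormal T (orthogonal rotation), taken from a QR decomposition.
T, _ = np.linalg.qr(rng.standard_normal((m, m)))

# Nonsingular N with unit-length columns (assumed form of an oblique rotation).
N = rng.standard_normal((m, m))
N /= np.linalg.norm(N, axis=0)


def sq(M):
    """Squared Frobenius norm."""
    return float(np.sum(M ** 2))


C_orth = C @ T                    # orthogonally rotated loadings
C_obl = C @ np.linalg.inv(N).T    # obliquely rotated loadings C N'^{-1}
P_obl = P @ N                     # correspondingly rotated scores

common = sq(P @ C.T)
print(np.isclose(sq(P_obl @ C_obl.T), common))  # common part is unchanged
print(np.isclose(sq(C_orth), sq(C)))            # orthogonal: loading size unchanged
print(sq(C_obl), sq(C))                         # oblique: sizes generally differ
```

Because the oblique rotation leaves the common part intact but rescales the loadings, an inequality on common parts can survive the rotation while one on squared loadings need not.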
Though Theorem 2 discusses the lower limits of the magnitudes of the squared loadings and common part in PCA solutions, the next one shows their upper limits.
Theorem 3
For a given X , the sum of the squared PCA loadings cannot exceed the sum of the squared loadings and unique variances in the FA solution:
Proof
(10) and (21) are rewritten as \( \|\hat{\mathbf{E}}_{\text{FA}}\|^{2} = n(\mathrm{tr}\,\mathbf{S}_{XX} - \mathrm{tr}\,\hat{\mathbf{A}}\hat{\mathbf{A}}^{\prime} - \mathrm{tr}\,\widehat{{\varvec{\Psi}}}^{2}) \) and \( \|\hat{\mathbf{E}}_{\text{PC}}\|^{2} = n(\mathrm{tr}\,\mathbf{S}_{XX} - \mathrm{tr}\,\hat{\mathbf{C}}\hat{\mathbf{C}}^{\prime}) \), respectively. Using them in (14), we have (23), and it leads to (24) because of (11). Inequality (23) leads to (25) and thus (26), since \( \|\hat{\mathbf{F}}\hat{\mathbf{A}}^{\prime} + \hat{\mathbf{U}}\widehat{{\varvec{\Psi}}}\|^{2} = n\|\hat{\mathbf{A}}\|^{2} + n\|\widehat{{\varvec{\Psi}}}\|^{2} \) and \( \|\hat{\mathbf{P}}\hat{\mathbf{C}}^{\prime}\|^{2} = n\|\hat{\mathbf{C}}\|^{2} \) follow from (3) and (6), respectively. \( \square \)
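The two PCA identities used in this proof can be verified on simulated data. The sketch below is illustrative and rests on assumptions not restated in this section: the PCA solution is computed by a truncated singular value decomposition, the constraint on the scores is taken to be n⁻¹P′P = I, and S_XX is taken to be n⁻¹X′X. The function name `pca_md` is hypothetical.

```python
import numpy as np


def pca_md(X, m):
    """Rank-m PCA of a column-centered X, with scores scaled so that P'P / n = I."""
    n = X.shape[0]
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    P = np.sqrt(n) * U[:, :m]              # component scores
    C = Vt[:m].T * (s[:m] / np.sqrt(n))    # loadings
    return P, C


def sq(M):
    """Squared Frobenius norm."""
    return float(np.sum(M ** 2))


rng = np.random.default_rng(1)
n, q, m = 40, 5, 2
X = rng.standard_normal((n, q))
X -= X.mean(axis=0)                        # column-center the data

P, C = pca_md(X, m)
E = X - P @ C.T                            # PCA residual matrix
S = X.T @ X / n                            # assumed definition of S_XX

print(np.allclose(P.T @ P / n, np.eye(m)))                       # score constraint
print(np.isclose(sq(P @ C.T), n * sq(C)))                        # ||PC'||^2 = n ||C||^2
print(np.isclose(sq(E), n * (np.trace(S) - np.trace(C @ C.T))))  # residual identity
```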
Inequality (26) shows that the model part is larger in FA even after oblique rotation. However, after the rotation, the sum of squared loadings in PCA is not necessarily less than the sum of squared loadings and unique variances in FA, since T_{P} and T_{F} in (24) cannot be replaced by N_{P} and N_{F}.
The following theorem concerns the magnitudes of the unique variances in FA:
Theorem 4
Proof
We can rewrite (10) as \( n^{-1}\|\hat{\mathbf{E}}_{\text{FA}}\|^{2} = \mathrm{tr}(\mathbf{S}_{XX} - \hat{\mathbf{A}}\hat{\mathbf{A}}^{\prime} - \widehat{{\varvec{\Psi}}}^{2}) \), which leads to \( \mathrm{tr}\,\mathbf{S}_{XX} - \mathrm{tr}\,\hat{\mathbf{A}}\hat{\mathbf{A}}^{\prime} = \|\widehat{{\varvec{\Psi}}}\|^{2} + n^{-1}\|\hat{\mathbf{E}}_{\text{FA}}\|^{2} \), since \( \mathrm{tr}\,\widehat{{\varvec{\Psi}}}^{2} = \|\widehat{{\varvec{\Psi}}}\|^{2} \). We can also rewrite (21) as \( \mathrm{tr}\,\mathbf{S}_{XX} - \mathrm{tr}\,\hat{\mathbf{C}}\hat{\mathbf{C}}^{\prime} = n^{-1}\|\hat{\mathbf{E}}_{\text{PC}}\|^{2} \). Using them in (22), we have \( n^{-1}\|\hat{\mathbf{E}}_{\text{PC}}\|^{2} \le \|\widehat{{\varvec{\Psi}}}\|^{2} + n^{-1}\|\hat{\mathbf{E}}_{\text{FA}}\|^{2} \), which can be rewritten as (27). \( \square \)
[S1] The absolute value of each PCA loading before/after orthogonal rotation tends to be greater than that of the corresponding FA loading (though exceptions can also exist), which is suggested by (17) and (18).
[S2] If \( \|\hat{\mathbf{E}}_{\text{FA}}\|^{2} \) is small enough, \( \|\widehat{{\varvec{\Psi}}}\|^{2} \) tends to be larger than \( n^{-1}\|\hat{\mathbf{E}}_{\text{PC}}\|^{2} \). The unique variance \( \widehat{{{\varPsi}}}_{j}^{2} \) for variable j tends to be greater than the corresponding PCA residual variance \( n^{-1}\|\hat{\mathbf{e}}_{j}^{\text{PC}}\|^{2} \), where \( \hat{\mathbf{e}}_{j}^{\text{PC}} \) is the jth column of \( \hat{\mathbf{E}}_{\text{PC}} \) and \( \widehat{{{\varPsi}}}_{j}^{2} \) is the jth diagonal element of \( \widehat{{\varvec{\Psi}}}^{2} \). Note that \( n^{-1}\|\hat{\mathbf{e}}_{j}^{\text{PC}}\|^{2} \) is a variance: \( \hat{\mathbf{E}}_{\text{PC}} = \mathbf{X} - \hat{\mathbf{P}}\hat{\mathbf{C}}^{\prime} \) is column-centered, because X is column-centered and so is \( \hat{\mathbf{P}} \), as found in (7).
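The closing remark of [S2], that the scaled squared norm of each residual column is a variance, can be illustrated as follows. This is a small sketch under stated assumptions: `X`, `P`, and `C` are random stand-ins (not estimates), and the only property used is that both X and P are column-centered.

```python
import numpy as np

rng = np.random.default_rng(2)
n, q, m = 30, 4, 2


def center(M):
    """Subtract the column means."""
    return M - M.mean(axis=0)


X = center(rng.standard_normal((n, q)))   # column-centered data
P = center(rng.standard_normal((n, m)))   # any column-centered score matrix
C = rng.standard_normal((q, m))           # arbitrary loadings

E = X - P @ C.T                           # residual matrix
res_var = np.sum(E ** 2, axis=0) / n      # n^{-1} ||e_j||^2 for each variable j

print(np.allclose(E.mean(axis=0), 0.0))       # residual columns are centered
print(np.allclose(res_var, E.var(axis=0)))    # hence the quantity is a variance
```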
These features are numerically assessed in the following sections.
4 Illustration
In this section, two real data examples are used to illustrate the theorems in the last section as well as [S1] and [S2]. For every data set, we carry out PCA and FA, together with two classic random FA (RFA) procedures. One of the two RFA procedures is the least squares RFA (LSRFA) with loss function \( \|\mathbf{S}_{XX} - (\mathbf{AA}^{\prime} + {\varvec{\Psi}}^{2})\|^{2} \). The other one is the maximum likelihood RFA (MLRFA), whose loss function is \( \mathrm{tr}\,\mathbf{S}_{XX}(\mathbf{AA}^{\prime} + {\varvec{\Psi}}^{2})^{-1} + \log|\mathbf{AA}^{\prime} + {\varvec{\Psi}}^{2}| \) following from certain normality assumptions, with |·| denoting the determinant of its argument. As the theorems in the last section are derived from the formulation of FA with (2), they are not guaranteed to hold in RFA with (4). Thus, it is of interest to see to what extent the RFA solutions follow the inequalities in the theorems. Of course, Theorem 1 is not considered because the error matrix \( \hat{\mathbf{E}}_{\text{FA}} \) is not relevant to RFA with (4). The resulting loadings in LSRFA and MLRFA are expressed as \( \hat{\mathbf{A}}_{\text{L}} \) and \( \hat{\mathbf{A}}_{\text{M}} \), respectively, with the corresponding unique variance matrices denoted as \( \widehat{{\varvec{\Psi}}}_{\text{L}}^{2} \) and \( \widehat{{\varvec{\Psi}}}_{\text{M}}^{2} \). The loading matrices in all procedures are rotated by the orthogonal varimax rotation (Kaiser 1958). We denote the rotated PCA, FA, LSRFA, and MLRFA loading matrices as \( \hat{\mathbf{C}}\mathbf{T}_{\text{P}} \), \( \hat{\mathbf{A}}\mathbf{T}_{\text{F}} \), \( \hat{\mathbf{A}}_{\text{L}}\mathbf{T}_{\text{L}} \), and \( \hat{\mathbf{A}}_{\text{M}}\mathbf{T}_{\text{M}} \), respectively.
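The two RFA loss functions can be written down directly. The sketch below only evaluates them and does not minimize them; the function names and the numbers in `A` and `psi` are hypothetical. The log-determinant term is added to the trace term, which is the usual form of the normal-theory discrepancy up to constants, and `psi` holds the diagonal elements of Ψ.

```python
import numpy as np


def ls_loss(S, A, psi):
    """Least squares loss ||S - (AA' + Psi^2)||^2."""
    Sigma = A @ A.T + np.diag(psi ** 2)
    return float(np.sum((S - Sigma) ** 2))


def ml_loss(S, A, psi):
    """Normal-theory loss tr S (AA' + Psi^2)^{-1} + log|AA' + Psi^2|."""
    Sigma = A @ A.T + np.diag(psi ** 2)
    _, logdet = np.linalg.slogdet(Sigma)
    return float(np.trace(np.linalg.solve(Sigma, S)) + logdet)


# Hypothetical loadings and unique standard deviations for q = 4, m = 2.
A = np.array([[0.8, 0.1], [0.7, 0.2], [0.1, 0.9], [0.2, 0.6]])
psi = np.array([0.5, 0.6, 0.4, 0.7])

# A covariance matrix reproduced exactly by the model.
S = A @ A.T + np.diag(psi ** 2)

print(ls_loss(S, A, psi))   # zero when the model reproduces S
print(ml_loss(S, A, psi))   # equals q + log|S| in that case
```

Both losses depend on the data only through S_XX, which is why RFA yields no error matrix comparable to the one in the matrix decomposition formulation.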
| Variable | PCA \( \hat{\mathbf{C}}\mathbf{T}_{\text{P}} \), col. 1 | PCA \( \hat{\mathbf{C}}\mathbf{T}_{\text{P}} \), col. 2 | PCA Res | FA \( \hat{\mathbf{A}}\mathbf{T}_{\text{F}} \), col. 1 | FA \( \hat{\mathbf{A}}\mathbf{T}_{\text{F}} \), col. 2 | FA \( \widehat{{\varvec{\Psi}}}^{2} \) | FA Res | LSRFA \( \hat{\mathbf{A}}_{\text{L}}\mathbf{T}_{\text{L}} \), col. 1 | LSRFA \( \hat{\mathbf{A}}_{\text{L}}\mathbf{T}_{\text{L}} \), col. 2 | LSRFA \( \widehat{{\varvec{\Psi}}}_{\text{L}}^{2} \) | MLRFA \( \hat{\mathbf{A}}_{\text{M}}\mathbf{T}_{\text{M}} \), col. 1 | MLRFA \( \hat{\mathbf{A}}_{\text{M}}\mathbf{T}_{\text{M}} \), col. 2 | MLRFA \( \widehat{{\varvec{\Psi}}}_{\text{M}}^{2} \) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Japanese | 0.51 | 0.62 | 0.13 | 0.38 | 0.60 | 0.50 | 0.001 | 0.38 | 0.60 | 0.50 | 0.37 | 0.61 | 0.50 |
| English | 0.25 | 0.81 | 0.08 | 0.21 | 0.76 | 0.37 | 0.002 | 0.19 | 0.77 | 0.37 | 0.21 | 0.76 | 0.38 |
| Social^{a} | −0.02 | 0.86 | 0.07 | 0.03 | 0.65 | 0.58 | 0.002 | 0.03 | 0.64 | 0.59 | 0.02 | 0.65 | 0.58 |
| Mathematics | 0.80 | 0.26 | 0.08 | 0.59 | 0.34 | 0.53 | 0.003 | 0.58 | 0.35 | 0.54 | 0.58 | 0.34 | 0.55 |
| Sciences | 0.90 | 0.02 | 0.03 | 0.89 | 0.10 | 0.19 | 0.001 | 0.91 | 0.10 | 0.17 | 0.90 | 0.11 | 0.17 |

| Sum of squares | \( \lVert\hat{\mathbf{C}}\rVert^{2} \) | \( n^{-1}\lVert\hat{\mathbf{E}}_{\text{PC}}\rVert^{2} \) | \( \lVert\hat{\mathbf{A}}\rVert^{2} \) | \( \lVert\widehat{{\varvec{\Psi}}}\rVert^{2} \) | \( n^{-1}\lVert\hat{\mathbf{E}}_{\text{FA}}\rVert^{2} \) | \( \lVert\hat{\mathbf{A}}_{\text{L}}\rVert^{2} \) | \( \lVert\widehat{{\varvec{\Psi}}}_{\text{L}}\rVert^{2} \) | \( \lVert\hat{\mathbf{A}}_{\text{M}}\rVert^{2} \) | \( \lVert\widehat{{\varvec{\Psi}}}_{\text{M}}\rVert^{2} \) |
|---|---|---|---|---|---|---|---|---|---|
| Value | 3.62 | 1.38 | 2.81 | 2.18 | 0.008 | 2.83 | 2.17 | 2.82 | 2.18 |

[Theorem 1] \( n^{-1}\|\hat{\mathbf{E}}_{\text{PC}}\|^{2} = 1.38 > n^{-1}\|\hat{\mathbf{E}}_{\text{FA}}\|^{2} = 0.008 \);

[Theorem 2] \( n^{-1}\|\hat{\mathbf{P}}\hat{\mathbf{C}}^{\prime}\|^{2} = \|\hat{\mathbf{C}}\|^{2} = 3.62 \ge n^{-1}\|\hat{\mathbf{F}}\hat{\mathbf{A}}^{\prime}\|^{2} = \|\hat{\mathbf{A}}\|^{2} = 2.81 \);

[Theorem 3] \( \|\hat{\mathbf{C}}\|^{2} = 3.62 \le \|\hat{\mathbf{A}}\|^{2} + \|\widehat{{\varvec{\Psi}}}\|^{2} = 2.81 + 2.18 = 4.99 \);

[Theorem 4] \( \|\widehat{{\varvec{\Psi}}}\|^{2} = 2.18 \ge n^{-1}\|\hat{\mathbf{E}}_{\text{PC}}\|^{2} - n^{-1}\|\hat{\mathbf{E}}_{\text{FA}}\|^{2} = 1.38 - 0.008 = 1.37 \).
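The four checks above are plain arithmetic on the sum-of-squares entries of Table 1, so they can be reproduced in a few lines. The sketch below uses only the reported (rounded) values; the variable names are hypothetical. It also compares the two decompositions of the total variance implied by the rewritten forms of (10) and (21), which should agree up to rounding.

```python
# Sum-of-squares entries reported for the first data set (rounded values).
C2 = 3.62      # ||C_hat||^2
E_pc = 1.38    # n^{-1} ||E_PC||^2
A2 = 2.81      # ||A_hat||^2
Psi2 = 2.18    # ||Psi_hat||^2
E_fa = 0.008   # n^{-1} ||E_FA||^2

checks = {
    "Theorem 1": E_pc >= E_fa,
    "Theorem 2": C2 >= A2,
    "Theorem 3": C2 <= A2 + Psi2,
    "Theorem 4": Psi2 >= E_pc - E_fa,
}
for name, holds in checks.items():
    print(name, holds)

# Both decompositions should reproduce tr S_XX up to rounding error.
total_pca = C2 + E_pc
total_fa = A2 + Psi2 + E_fa
print(round(total_pca, 3), round(total_fa, 3))
```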
Next, we consider the loadings, residuals, and unique variance in the left two panels in Table 1. Seven PCA loadings among all ten are boldfaced. Their absolute values are larger than their FA counterparts, which supports the suggestion by [S1]. We also find that \( \|\hat{\mathbf{E}}_{\text{FA}}\|^{2} \) is close to zero and \( \|\widehat{{\varvec{\Psi}}}\|^{2} = 2.18 > n^{-1}\|\hat{\mathbf{E}}_{\text{PC}}\|^{2} = 1.38 \), with all unique variances in FA larger than the corresponding “Res” (residual variances) in PCA, i.e., as suggested by [S2].
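The element-wise comparisons in this paragraph can be reproduced from the reported table entries. A minimal sketch follows; the array names are hypothetical, and the values are the rounded PCA and FA entries of Table 1, with rows ordered as Japanese, English, Social, Mathematics, Sciences.

```python
import numpy as np

# Varimax-rotated loadings as reported (rounded to two decimals).
pca_load = np.array([[0.51, 0.62], [0.25, 0.81], [-0.02, 0.86],
                     [0.80, 0.26], [0.90, 0.02]])
fa_load = np.array([[0.38, 0.60], [0.21, 0.76], [0.03, 0.65],
                    [0.59, 0.34], [0.89, 0.10]])

pca_res = np.array([0.13, 0.08, 0.07, 0.08, 0.03])   # "Res" column for PCA
fa_uniq = np.array([0.50, 0.37, 0.58, 0.53, 0.19])   # FA unique variances

# [S1]: PCA loadings whose absolute value exceeds the FA counterpart.
n_larger = int(np.sum(np.abs(pca_load) > np.abs(fa_load)))

# [S2]: variables whose FA unique variance exceeds the PCA "Res" entry.
n_uniq_larger = int(np.sum(fa_uniq > pca_res))

print(n_larger, "of", pca_load.size)        # 7 of 10
print(n_uniq_larger, "of", fa_uniq.size)    # 5 of 5
```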
Now, we consider the right three panels. The panels for RFA do not have column “Res”, since \( \widehat{{\varvec{\Psi}}}_{\text{L}}^{2} = \mathrm{diag}(\mathbf{S}_{XX} - \hat{\mathbf{A}}_{\text{L}}\hat{\mathbf{A}}_{\text{L}}^{\prime}) \) and \( \widehat{{\varvec{\Psi}}}_{\text{M}}^{2} = \mathrm{diag}(\mathbf{S}_{XX} - \hat{\mathbf{A}}_{\text{M}}\hat{\mathbf{A}}_{\text{M}}^{\prime}) \): the residual variances for variables are always estimated as zero. Apart from “Res”, all three FA solutions (loadings and unique variances) are almost identical. Thus, the RFA solutions show the same relationships to PCA ones as the FA solutions.

[O1] The absolute values of the PCA loadings tend to be greater than those of FA.

[O2] \( \|\widehat{{\varvec{\Psi}}}\|^{2} \) tends to be larger than \( n^{-1}\|\hat{\mathbf{E}}_{\text{PC}}\|^{2} \).

[O3] The unique variance \( \widehat{{{\varPsi}}}_{j}^{2} \) for variable j tends to be greater than the corresponding variance of PCA residuals \( n^{-1}\|\hat{\mathbf{e}}_{j}^{\text{PC}}\|^{2} \).

[O4] FA and RFA solutions are broadly equivalent.

[O5] The inequalities in Theorems 2–4 also hold in RFA solutions.
The solutions of PCA, FA, LSRFA, and MLRFA for a part of Mullen’s (1939) physical variables data, with Res standing for residual variances and the PCA loadings boldfaced whose absolute values are larger than the FA counterparts
| Variable | PCA \( \hat{\mathbf{C}}\mathbf{T}_{\text{P}} \), col. 1 | PCA \( \hat{\mathbf{C}}\mathbf{T}_{\text{P}} \), col. 2 | PCA Res | FA \( \hat{\mathbf{A}}\mathbf{T}_{\text{F}} \), col. 1 | FA \( \hat{\mathbf{A}}\mathbf{T}_{\text{F}} \), col. 2 | FA \( \widehat{{\varvec{\Psi}}}^{2} \) | FA Res | LSRFA \( \hat{\mathbf{A}}_{\text{L}}\mathbf{T}_{\text{L}} \), col. 1 | LSRFA \( \hat{\mathbf{A}}_{\text{L}}\mathbf{T}_{\text{L}} \), col. 2 | LSRFA \( \widehat{{\varvec{\Psi}}}_{\text{L}}^{2} \) | MLRFA \( \hat{\mathbf{A}}_{\text{M}}\mathbf{T}_{\text{M}} \), col. 1 | MLRFA \( \hat{\mathbf{A}}_{\text{M}}\mathbf{T}_{\text{M}} \), col. 2 | MLRFA \( \widehat{{\varvec{\Psi}}}_{\text{M}}^{2} \) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Height | 0.24 | 0.91 | 0.12 | 0.26 | 0.88 | 0.16 | 0.005 | 0.25 | 0.88 | 0.16 | 0.27 | 0.87 | 0.17 |
| Arm span | 0.18 | 0.93 | 0.10 | 0.17 | 0.93 | 0.10 | 0.006 | 0.18 | 0.93 | 0.11 | 0.16 | 0.93 | 0.11 |
| Forearm^{a} | 0.14 | 0.92 | 0.13 | 0.16 | 0.89 | 0.17 | 0.002 | 0.16 | 0.89 | 0.18 | 0.16 | 0.90 | 0.17 |
| Lower leg^{a} | 0.21 | 0.90 | 0.14 | 0.23 | 0.87 | 0.19 | 0.005 | 0.22 | 0.87 | 0.19 | 0.23 | 0.86 | 0.20 |
| Weight | 0.88 | 0.27 | 0.15 | 0.91 | 0.26 | 0.10 | 0.002 | 0.91 | 0.26 | 0.11 | 0.92 | 0.25 | 0.09 |
| Bitrochanteric^{b} | 0.84 | 0.20 | 0.26 | 0.77 | 0.21 | 0.36 | 0.002 | 0.77 | 0.21 | 0.36 | 0.77 | 0.21 | 0.36 |
| Chest girth | 0.84 | 0.12 | 0.28 | 0.75 | 0.15 | 0.41 | 0.002 | 0.75 | 0.15 | 0.42 | 0.75 | 0.15 | 0.42 |
| Chest width | 0.74 | 0.27 | 0.38 | 0.64 | 0.28 | 0.52 | 0.002 | 0.64 | 0.28 | 0.51 | 0.62 | 0.29 | 0.54 |

| Sum of squares | \( \lVert\hat{\mathbf{C}}\rVert^{2} \) | \( n^{-1}\lVert\hat{\mathbf{E}}_{\text{PC}}\rVert^{2} \) | \( \lVert\hat{\mathbf{A}}\rVert^{2} \) | \( \lVert\widehat{{\varvec{\Psi}}}\rVert^{2} \) | \( n^{-1}\lVert\hat{\mathbf{E}}_{\text{FA}}\rVert^{2} \) | \( \lVert\hat{\mathbf{A}}_{\text{L}}\rVert^{2} \) | \( \lVert\widehat{{\varvec{\Psi}}}_{\text{L}}\rVert^{2} \) | \( \lVert\hat{\mathbf{A}}_{\text{M}}\rVert^{2} \) | \( \lVert\widehat{{\varvec{\Psi}}}_{\text{M}}\rVert^{2} \) |
|---|---|---|---|---|---|---|---|---|---|
| Value | 6.44 | 1.56 | 5.97 | 2.01 | 0.025 | 5.96 | 2.04 | 5.95 | 5.95 |
5 Supplementary simulation studies
In this section, we explore whether [O1]–[O5] from the last section hold for most data sets encountered in practice. Making such assessments with real data sets is not efficient, so we resort to simulated data. The correctness of [O4] was already demonstrated in the simulation studies of Adachi (2012, 2015) and Stegeman (2016). However, Adachi (2012) and Stegeman (2016) showed [O4] only indirectly: rather than comparing FA and RFA solutions directly, they showed that the true parameters are recovered well by both FA and RFA. Here, we assess [O4] with direct comparisons.
1. Choose q from DU(4m, 8m) and n from DU(8q, 12q), with DU(4m, 8m) denoting the discrete uniform distribution over the integers within the range [4m, 8m].
2. Draw each element of P, F, U, and E (n × q) from the standard normal distribution.
3. Draw each element of the q × m matrix A_{0} from U(−1, 1) and each diagonal element of the q × q diagonal matrix Ψ_{0} from U(0.1, 1), with U(−1, 1) denoting the uniform distribution over the real values within [−1, 1].
4. Set C = αA_{0} and E_{PC} = E so that \( \|\mathbf{PC}^{\prime}\|^{2}/(\|\mathbf{PC}^{\prime}\|^{2} + \|\mathbf{E}_{\text{PC}}\|^{2}) = 0.75 \) with α > 0.
5. Set A = βA_{0}, Ψ = γΨ_{0}, and E_{FA} = E so that \( \|\mathbf{FA}^{\prime}\|^{2}/\text{SST} = 0.55 \) and \( \|\mathbf{U}{\varvec{\Psi}}\|^{2}/\text{SST} = 0.42 \) with \( \text{SST} = \|\mathbf{FA}^{\prime}\|^{2} + \|\mathbf{U}{\varvec{\Psi}}\|^{2} + \|\mathbf{E}_{\text{FA}}\|^{2} \), β > 0, and γ > 0.
 6.
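Steps 1 to 5 can be sketched as follows. This is an illustrative reading of the procedure rather than the authors' code: the function name `simulate` is hypothetical, the number of components or factors `m` is passed in by the caller, P and F are assumed to be n × m, and the scaling constants are solved in closed form from the stated proportions. Forming the data matrices from these parts is not shown.

```python
import numpy as np


def sq(M):
    """Squared Frobenius norm."""
    return float(np.sum(M ** 2))


def simulate(m, rng):
    # Step 1: draw the dimensions (upper bounds are inclusive).
    q = int(rng.integers(4 * m, 8 * m + 1))
    n = int(rng.integers(8 * q, 12 * q + 1))
    # Step 2: standard normal scores and errors.
    P = rng.standard_normal((n, m))
    F = rng.standard_normal((n, m))
    U = rng.standard_normal((n, q))
    E = rng.standard_normal((n, q))
    # Step 3: unscaled loadings and unique standard deviations.
    A0 = rng.uniform(-1.0, 1.0, size=(q, m))
    psi0 = rng.uniform(0.1, 1.0, size=q)
    # Step 4: ||PC'||^2 = 3 ||E||^2 gives the proportion 0.75.
    alpha = np.sqrt(3.0 * sq(E) / sq(P @ A0.T))
    C = alpha * A0
    # Step 5: ||E||^2 must be the remaining 0.03 of SST.
    sst = sq(E) / 0.03
    beta = np.sqrt(0.55 * sst / sq(F @ A0.T))
    gamma = np.sqrt(0.42 * sst / sq(U * psi0))   # U * psi0 equals U @ diag(psi0)
    return dict(P=P, C=C, F=F, U=U, E=E, A=beta * A0, psi=gamma * psi0)


d = simulate(m=2, rng=np.random.default_rng(0))
pc_part = sq(d["P"] @ d["C"].T)
fa_part, u_part = sq(d["F"] @ d["A"].T), sq(d["U"] * d["psi"])
sst = fa_part + u_part + sq(d["E"])
print(pc_part / (pc_part + sq(d["E"])), fa_part / sst, u_part / sst)
```

Solving for the scaling constants directly avoids any iterative search, since each target proportion fixes one squared norm relative to that of E.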
Averages and 95th percentiles of AAD values for loading matrices
Averages and 95th percentiles of AAD values for unique variances
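| Compared with | Statistic | PC-modeled data: LSRFA | PC-modeled data: MLRFA | FA-modeled data: LSRFA | FA-modeled data: MLRFA |
|---|---|---|---|---|---|
| FA | Ave | 0.006 | 0.006 | 0.010 | 0.010 |
| FA | 95% | 0.011 | 0.010 | 0.016 | 0.017 |
| LSRFA | Ave | | 0.007 | | 0.013 |
| LSRFA | 95% | | 0.017 | | 0.026 |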
The RFA solutions were found to satisfy the inequalities in Theorems 2–4 for every simulated data set, i.e., [O5] is also verified.
Next, we consider [O2], i.e., that \( \|\widehat{{\varvec{\Psi}}}\|^{2} \) tends to be larger than \( n^{-1}\|\hat{\mathbf{E}}_{\text{PC}}\|^{2} \). It turns out that this is fulfilled for every data set in the FA solutions, and also in the LSRFA and MLRFA solutions.
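Averages and 95th percentiles of the proportions of the variables for which the squared sums of PCA residuals are greater than the FA unique variances

| Statistic | PCA-modeled data: FA | PCA-modeled data: LSRFA | PCA-modeled data: MLRFA | FA-modeled data: FA | FA-modeled data: LSRFA | FA-modeled data: MLRFA |
|---|---|---|---|---|---|---|
| Ave | 0.10 | 0.07 | 0.07 | 0.21 | 0.18 | 0.19 |
| 95% | 0.29 | 0.25 | 0.25 | 0.35 | 0.30 | 0.33 |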
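Averages and 95th percentiles of the proportions of the PCA loadings whose absolute values are less than the FA counterparts

| Data | Statistic | (A) After orthogonal rotation: FA | (A) LSRFA | (A) MLRFA | (B) After oblique rotation: FA | (B) LSRFA | (B) MLRFA |
|---|---|---|---|---|---|---|---|
| PCA-modeled data | Ave | 0.27 | 0.27 | 0.28 | 0.26 | 0.25 | 0.27 |
| PCA-modeled data | 95% | 0.44 | 0.44 | 0.44 | 0.45 | 0.45 | 0.46 |
| FA-modeled data | Ave | 0.30 | 0.30 | 0.31 | 0.30 | 0.29 | 0.31 |
| FA-modeled data | 95% | 0.46 | 0.47 | 0.46 | 0.46 | 0.46 | 0.47 |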
We further assess whether the relationships (18) and (24) often occur, even if the orthonormal matrices T_{P} and T_{F} are replaced by nonsingular matrices subject to (12), i.e., even after oblique rotation. For this assessment, we perform Jennrich’s (2006) oblique rotation, in which \( \|\hat{\mathbf{C}}\mathbf{N}_{\text{P}}\|_{\ell 1} \) is minimized over N_{P} and \( \|\hat{\mathbf{A}}\mathbf{N}_{\text{F}}\|_{\ell 1} \) is minimized over N_{F} under (12) for PCA and FA solutions, respectively. As a result, it was found for every data set that the sum of the squares of obliquely rotated PCA loadings was greater than the sum for FA/RFA, but less than that sum plus the sum of FA/RFA unique variances.
6 Discussion
In this paper, we derive several theorems contrasting PCA and FA solutions, with both PCA and FA formulated as matrix decomposition problems. Next, the conclusions from the theorems are assessed numerically.

[P] Choose PCA when the aim is to find a large common part in X.

[F] Choose FA when the aim is to explain X better.
1. The absolute values of PCA loadings tend to be greater than the corresponding FA ones, though solutions can also occur in which this is not clearly found.
2. It is a common result that the sum of unique variances in FA is larger than the sum of residual variances of PCA. Further, the unique variance for each variable in FA tends to be greater than the corresponding residual variance in PCA.
As the inequalities in Sect. 3 are derived from the matrix decomposition formulation of FA with (2), they are not guaranteed to hold in the classic random FA (RFA) formulated as (4). However, as found in Sects. 4 and 5, the matrix decomposition FA solutions are broadly equivalent to the RFA ones. Thus, the inequalities in the theorems are likely to hold for RFA, except Theorem 1, which does not make sense in RFA.
The above statement “FA fits better than PCA” is to be carefully reconsidered. As found in (1) and (2), the addition of the unique part UΨ to the PCA model leads to the FA model. Thus, PCA has fewer parameters than FA and can be viewed as more parsimonious. This suggests that a model selection strategy taking into account the model’s parsimony remains to be studied for prescribing whether PCA or FA is suitable for a particular data set.
Notes
Acknowledgements
Funding was provided by the Japan Society for the Promotion of Science [Grant (C)18K11191]. The authors thank the anonymous reviewers for their useful comments.
References
 Adachi, K. (2012). Some contributions to data-fitting factor analysis with empirical comparisons to covariance-fitting factor analysis. Journal of the Japanese Society of Computational Statistics, 25, 25–38.
 Adachi, K. (2015). A matrix-intensive approach to factor analysis. Journal of the Japan Statistical Society, Japanese Issue, 44, 363–382. (in Japanese).
 Adachi, K. (2016). Matrix-based introduction to multivariate data analysis. Singapore: Springer.
 Adachi, K., & Trendafilov, N. T. (2018). Some mathematical properties of the matrix decomposition solution in factor analysis. Psychometrika, 83, 407–424.
 Bentler, P. M., & Kano, Y. (1990). On the equivalence of factors and components. Multivariate Behavioral Research, 25, 67–74.
 Eckart, C., & Young, G. (1936). The approximation of one matrix by another of lower rank. Psychometrika, 1, 211–218.
 Gower, J. C., & Dijksterhuis, G. B. (2004). Procrustes problems. Oxford: Oxford University Press.
 Harman, H. H. (1976). Modern factor analysis (3rd ed.). Chicago: The University of Chicago Press.
 Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24, 417–441.
 Jennrich, R. I. (2006). Rotation to simple loadings using component loss function: The oblique case. Psychometrika, 71, 173–191.
 Jolliffe, I. T. (2002). Principal component analysis (2nd ed.). New York: Springer.
 Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika, 23, 187–200.
 Mulaik, S. A. (2010). Foundations of factor analysis (2nd ed.). Boca Raton: CRC Press.
 Mullen, F. (1939). Factors in the growth of girls seven to seventeen years of age. Ph.D. Dissertation. University of Chicago, Department of Education.
 Ogasawara, H. (2000). Some relationships between factors and components. Psychometrika, 65, 167–185.
 Okamoto, M. (1969). Optimality of principal components. In P. R. Krishnaiah (Ed.), Multivariate analysis (Vol. II, pp. 673–687). New York: Academic Press.
 Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 2, 559–572.
 Sato, M. (1990). Some remarks on principal component analysis as a substitute for factor analysis in monofactor cases. Journal of the Japan Statistical Society, 20, 23–31.
 Sočan, G. (2003). The incremental value of minimum rank factor analysis. PhD Thesis, University of Groningen: Groningen.
 Spearman, C. (1904). “General Intelligence”, objectively determined and measured. American Journal of Psychology, 15, 201–293.
 Stegeman, A. (2016). A new method for simultaneous estimation of the factor model parameters, factor scores, and unique parts. Computational Statistics and Data Analysis, 99, 189–203.
 Tanaka, Y., & Tarumi, T. (1995). Handbook for statistical analysis: Multivariate analysis (windows version). Tokyo: Kyoritsu-Shuppan. (in Japanese).
 ten Berge, J. M. F., & Kiers, H. A. L. (1996). Optimality criteria for principal component analysis and generalizations. British Journal of Mathematical and Statistical Psychology, 49, 335–345.
 Thurstone, L. L. (1935). The vectors of mind. Chicago: University of Chicago Press.
 Trendafilov, N. T., Unkel, S., & Krzanowski, W. (2013). Exploratory factor and principal component analyses: Some new aspects. Statistics and Computing, 23, 209–220.
 Unkel, S., & Trendafilov, N. T. (2010). Simultaneous parameter estimation in exploratory factor analysis: An expository review. International Statistical Review, 78, 363–382.