Abstract
It is well known that the classical exploratory factor analysis (EFA) of data with more observations than variables has several types of indeterminacy. We study the factor indeterminacy and show some new aspects of this problem by considering EFA as a specific data matrix decomposition. We adopt a new approach to the EFA estimation and achieve a new characterization of the factor indeterminacy problem. A new alternative model is proposed, which gives determinate factors and can be seen as a semi-sparse principal component analysis (PCA). An alternating algorithm is developed, where in each step a Procrustes problem is solved. It is demonstrated that the new model/algorithm can act as a specific sparse PCA and as a low-rank-plus-sparse matrix decomposition. Numerical examples with several large data sets illustrate the versatility of the new model, and the performance and behaviour of its algorithmic implementation.
Notes
The PCA concept is very specific and well defined: it is equivalent to a low-rank approximation of the data matrix using the singular value decomposition. Our proposed method is similar to PCA in the sense that it gives orthogonal factors, but it is not a PCA in the strict sense. In view of the fact that there are other related concepts, such as sparse PCA or robust PCA, which are not PCAs in the strict sense either, we decided to name our method semi-sparse PCA (SSPCA).
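The footnote's first claim can be checked directly: the rank-\(k\) PCA reconstruction of a centred data matrix coincides with its truncated SVD (the Eckart–Young optimum). A minimal NumPy sketch, with variable names of our own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
Xc = X - X.mean(axis=0)  # column-centre the data, as PCA assumes

# Rank-k approximation via the truncated SVD (the Eckart-Young optimum).
k = 3
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
X_svd = (U[:, :k] * s[:k]) @ Vt[:k, :]

# PCA route: loadings are the top-k right singular vectors,
# scores are the projections of the centred data onto them.
loadings = Vt[:k, :].T       # p x k
scores = Xc @ loadings       # n x k
X_pca = scores @ loadings.T

assert np.allclose(X_svd, X_pca)
```

The assertion passes because \(X_c V_k = U_k \Sigma_k\), so both routes produce the same rank-\(k\) matrix.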
Available from http://datam.i2r.a-star.edu.sg/datasets/krbd/Leukemia/MLL.html.
References
Absil, P.-A., Mahony, R., & Sepulchre, R. (2008). Optimization Algorithms on Matrix Manifolds. Princeton: Princeton University Press.
Adachi, K., & Trendafilov, N. (2017). Sparsest factor analysis for clustering variables: A matrix decomposition approach. Advances in Data Analysis and Classification, 12, 778–794.
Aravkin, A., Becker, S., Cevher, V., & Olsen, P. (2014). A variational approach to stable principal component pursuit. In Conference on uncertainty in artificial intelligence (UAI).
Armstrong, S. A., Staunton, J. E., Silverman, L. B., Pieters, R., den Boer, M. L., Minden, M. D., et al. (2002). MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia. Nature Genetics, 30, 41–47.
Cai, J.-F., Candès, E. J., & Shen, Z. (2008). A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20, 1956–1982.
Candès, E. J., Li, X., Ma, Y., & Wright, J. (2009). Robust principal component analysis? Journal of the ACM, 58, 1–37.
De Leeuw, J. (2004). Least squares optimal scaling of partially observed linear systems. In K. van Montfort, J. Oud, & A. Satorra (Eds.), Recent developments on structural equation models: Theory and applications (pp. 121–134). Dordrecht, NL: Kluwer Academic Publishers.
Edelman, A., Arias, T. A., & Smith, S. T. (1998). The geometry of algorithms with orthogonality constraints. SIAM Journal on Matrix Analysis and Applications, 20, 303–353.
Eldén, L. (2007). Matrix methods in data mining and pattern recognition. Philadelphia: SIAM.
Golub, G. H., & Van Loan, C. F. (2013). Matrix computations (4th ed.). Baltimore, MD: Johns Hopkins University Press.
Harman, H. H. (1976). Modern factor analysis (3rd ed.). Chicago, IL: University of Chicago Press.
Jolliffe, I. T., Trendafilov, N. T., & Uddin, M. (2003). A modified principal component technique based on the LASSO. Journal of Computational and Graphical Statistics, 12, 531–547.
Journée, M., Nesterov, Y., Richtárik, P., & Sepulchre, R. (2010). Generalized power method for sparse principal component analysis. Journal of Machine Learning Research, 11, 517–553.
Lin, Z., Chen, M., Wu, L., & Ma, Y. (2009). The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. UIUC Technical Report, UILU-ENG-09-2215, November.
Lin, Z., Ganesh, A., Wright, J., Wu, L., Chen, M., & Ma, Y. (2009). Fast convex optimization algorithms for exact recovery of a corrupted low-rank matrix. UIUC Technical Report, UILU-ENG-09-2214, August.
Mulaik, S. A. (2005). Looking back on the factor indeterminacy controversies in factor analysis. In A. Maydeu-Olivares & J. J. McArdle (Eds.), Contemporary psychometrics (pp. 174–206). Mahwah, NJ: Lawrence Erlbaum Associates Inc.
Mulaik, S. A. (2010). The foundations of factor analysis (2nd ed.). Boca Raton, FL: Chapman and Hall/CRC.
Shen, H., & Huang, J. Z. (2008). Sparse principal component analysis via regularized low-rank matrix approximation. Journal of Multivariate Analysis, 99, 1015–1034.
Steiger, J. H. (1979). Factor indeterminacy in the 1930’s and the 1970’s: Some interesting parallels. Psychometrika, 44, 157–166.
Steiger, J. H., & Schönemann, P. H. (1978). A history of factor indeterminacy (pp. 136–178). Chicago, IL: University of Chicago Press.
Trendafilov, N., Fontanella, S., & Adachi, K. (2017). Sparse exploratory factor analysis. Psychometrika, 82, 778–794.
Trendafilov, N. T., & Unkel, S. (2011). Exploratory factor analysis of data matrices with more variables than observations. Journal of Computational and Graphical Statistics, 20, 874–891.
Unkel, S., & Trendafilov, N. T. (2010). Simultaneous parameter estimation in exploratory factor analysis: An expository review. International Statistical Review, 78, 363–382.
Witten, D. M., Tibshirani, R., & Hastie, T. (2009). A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics, 10, 515–534.
Yuan, X., & Yang, J. (2013). Sparse and low-rank matrix decomposition via alternating direction methods. Pacific Journal of Optimization, 9, 167–180.
Proof of Lemma 2.1
Proof
In the proof we will, for convenience, denote \(\mathcal {D}= \mathcal {D}(U_1^\top R)\). Without loss of generality, we assume that the diagonal elements of \(\mathcal {D}\) are ordered, \(| u_1^\top r_1 | \ge | u_2^\top r_2 | \ge \cdots \ge | u_p^\top r_p |\). Here \(r_i\) and \(u_i\) denote the i’th column of R and \(U_1\), respectively.
Let the CS decomposition (Golub and Van Loan, 2013, Sect. 2.5.4) of U be
\[ U = \begin{pmatrix} U_1 \\ U_2 \end{pmatrix} = \begin{pmatrix} Q_1 & 0 \\ 0 & Q_2 \end{pmatrix} \begin{pmatrix} C \\ S \end{pmatrix} V^\top , \]
where \(Q_1\), \(Q_2\) and V are orthogonal and C and S are diagonal. The diagonal elements of C satisfy \(c_1 \ge c_2 \ge \cdots \ge c_p \ge 0\). Clearly, the statement of the lemma is equivalent to \(C=I_p\) and \(S=0\).
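The CS decomposition relations used here can be illustrated numerically. In the following sketch (ours, not the paper's), \(Q_1\), \(C\) and \(V\) are read off the SVD of the top block \(U_1\), and the defining identity \(C^2 + S^2 = I\) is verified:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 4
# U: 2p x p with orthonormal columns, partitioned into U1 (top) and U2 (bottom).
U, _ = np.linalg.qr(rng.standard_normal((2 * p, p)))
U1, U2 = U[:p, :], U[p:, :]

# CS decomposition: U1 = Q1 C V^T and U2 = Q2 S V^T, with C^2 + S^2 = I.
# Q1, C and V can be read off the SVD of U1; the diagonal of S then follows
# from U2 V, whose i-th column has norm sqrt(1 - c_i^2).
Q1, c, Vt = np.linalg.svd(U1)
s = np.linalg.norm(U2 @ Vt.T, axis=0)

assert np.all(c <= 1 + 1e-12)         # diagonal of C lies in [0, 1]
assert np.allclose(c**2 + s**2, 1.0)  # the Pythagorean identity C^2 + S^2 = I
```

Note that `np.linalg.svd` returns the singular values in decreasing order, matching the ordering convention \(c_1 \ge \cdots \ge c_p \ge 0\) assumed above.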
We insert the CS decomposition in (15), set \(\nabla \Gamma = 0\) and multiply by
from the left and by V from the right. We get
With \( B = Q_1^\top R \mathcal {D}V \in \mathbb {R}^{p \times p}\), we thus have \(B = C B^\top C\), or, equivalently, elementwise, \(b_{ij} = c_i b_{ji} c_j\), and hence \(b_{ij} = c_i^2 b_{ij} c_j^2\). Since \(0 \le c_i \le 1\) for all i, clearly, for any \(b_{ij} \ne 0\) we must have \(c_i = c_j = 1\). Assume that
and that, for some \(1 \le i \le p\), \(b_{is} \ne 0,\) or, for some \(1 \le j \le p\), \(b_{sj} \ne 0\) (we allow \(s=p\), in which case \(B_s=B\)). Then, since \(b_{is} = c_i^2 b_{is} c_s^2\), or \(b_{sj} = c_s^2 b_{sj} c_j^2\), and since the \(c_i\)’s are ordered, we must have \(c_1 = \cdots = c_s = 1\). Thus, if \(s=p\), then \(C=I_p\), and the lemma is true.
If \(s<p\), C has the structure
\[ C = \begin{pmatrix} I_s & 0 \\ 0 & C_2 \end{pmatrix} , \tag{27} \]
where \(C_2=0\) or its diagonal elements are nonnegative.
We now assume that
\[ c_{t+1} = \cdots = c_p = 0 , \tag{28} \]
with \(c_t>0\), i.e. \(U_1\) has rank \(t<p\). We will show that then the stationary point does not correspond to a global maximum.
Due to (27), we can write
Consider the last row of \(BV^\top \):
where \(r_i\) denotes the i’th column of R. Since R is nonsingular, there must exist at least one nonzero element in \(c^\top \), say \(q_p^\top r_k\). Then the corresponding element in \(\mathcal {D}\) must be equal to zero, \(u_k^\top r_k=0\).
Under the assumption (28), \(U_1\) has rank t: using the CS decomposition we can write \(u_j = \sum _{i=1}^t c_i v_{ji} q_i\), for \(j=1,2,\ldots ,p\). Clearly \(q_p\) is orthogonal to \(\{u_1 ,\; u_2,\; \ldots , u_p\}\). Thus, we can replace the column \(u_k\) in \(U_1\) by \(q_p\) and make the objective function larger. It follows that the assumption that \({{\mathrm{rank}}}(U_1)=t<p\) cannot be valid at the global maximum.
It remains to consider the case when all the diagonal elements of C are positive, and \(U_1\) is nonsingular. Due to the structure (27), we have
\[ B = \begin{pmatrix} B_s & 0 \\ 0 & 0 \end{pmatrix} . \]
With the corresponding blocking \(V = (V_1 \; V_2)\), we have \(\mathcal {D}V_2 = 0\), i.e. \(\mathcal {D}\) has a null space of dimension \(p-s\). Since the diagonal elements of \(\mathcal {D}\) are ordered, it follows that \(\mathcal {D}\) has the structure
\[ \mathcal {D} = \begin{pmatrix} \mathcal {D}_s & 0 \\ 0 & 0 \end{pmatrix} , \]
where \(\mathcal {D}_s\) is nonsingular. Put
\[ V = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix} . \]
From the identity \(\mathcal {D}V_2 =0\), we then get \(V_{12}=0\); consequently, \(V_{22}\) is an orthogonal matrix, and it follows that
\[ V = \begin{pmatrix} V_{11} & 0 \\ 0 & V_{22} \end{pmatrix} . \]
From the CS decomposition, we then have
It follows that \(q_p\) is orthogonal to \(u_j\) for \(j=1,2,\ldots ,s\), and as in the cases above we can now replace \(u_p\) by \(q_p\) and increase the value of the objective function.
Thus, we have shown that for \(C \ne I_p\) the stationary point does not correspond to the global maximum, which proves the lemma. \(\square \)
Cite this article
Eldén, L., Trendafilov, N. Semi-sparse PCA. Psychometrika 84, 164–185 (2019). https://doi.org/10.1007/s11336-018-9650-9