The Alternating Least-Squares Algorithm for CDPCA
Clustering and Disjoint Principal Component Analysis (CDPCA) is a recently proposed constrained principal component analysis that simultaneously clusters the objects and partitions the variables. We have implemented it in the R language. In this paper, we examine in detail the alternating least-squares (ALS) algorithm for CDPCA and highlight the algebraic features it uses to construct both interpretable principal components and clusters of objects. Two applications illustrate the capabilities of this new methodology.
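As a rough illustration of the two constraints CDPCA imposes, the following Python sketch (not the authors' R implementation) assumes the usual formulation in which a binary membership matrix U assigns each object to exactly one cluster, and a loading matrix A has orthonormal columns with at most one nonzero entry per row, so that each variable contributes to a single component. The function names, the toy data, and the fixed partitions are hypothetical. A full ALS procedure would alternate updates of the two partitions instead of fixing them as done here.

```python
import numpy as np

def membership_matrix(labels, n_groups):
    """Binary matrix with exactly one 1 per row (hypothetical helper)."""
    M = np.zeros((labels.size, n_groups))
    M[np.arange(labels.size), labels] = 1.0
    return M

def disjoint_loadings(X, U, var_labels, n_components):
    """Build loadings with disjoint support from a fixed variable partition.

    Assumes no object cluster and no variable group is empty.
    """
    counts = U.sum(axis=0)
    Xbar = (U.T @ X) / counts[:, None]   # cluster centroids, one row per cluster
    B = U @ Xbar                         # each object replaced by its centroid
    A = np.zeros((X.shape[1], n_components))
    for q in range(n_components):
        idx = np.flatnonzero(var_labels == q)
        # Leading right singular vector of the block: unit-norm weights
        # for the variables assigned to component q.
        _, _, vt = np.linalg.svd(B[:, idx], full_matrices=False)
        A[idx, q] = vt[0]
    return A, Xbar

def cdpca_loss(X, U, Xbar, A):
    """Squared Frobenius norm of X - U Xbar A A'."""
    return np.linalg.norm(X - U @ Xbar @ A @ A.T) ** 2

# Toy data: 12 objects, 5 variables, 3 object clusters, 2 components.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 5))
X -= X.mean(axis=0)                      # column-centre the data
obj_labels = np.arange(12) % 3           # fixed, arbitrary object partition
var_labels = np.array([0, 0, 1, 1, 1])   # fixed, arbitrary variable partition

U = membership_matrix(obj_labels, 3)
A, Xbar = disjoint_loadings(X, U, var_labels, 2)
loss = cdpca_loss(X, U, Xbar, A)
```

Because the columns of A have disjoint support and unit norm, they are automatically orthonormal, which is what makes each component interpretable as a weighted combination of one group of variables only.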
Keywords: Principal Component Analysis, Clustering, K-means
The authors would like to thank the anonymous referee for the valuable and constructive comments, which have helped to improve this paper. Special thanks go to Professor Maurizio Vichi for providing us with a MATLAB version of the ALS algorithm for performing CDPCA. This work was partially supported by Portuguese funds through CIDMA (Center for Research and Development in Mathematics and Applications) and the Portuguese Foundation for Science and Technology (FCT, Fundação para a Ciência e a Tecnologia), within project UID/MAT/04106/2013.