The Alternating Least-Squares Algorithm for CDPCA

  • Eloísa Macedo
  • Adelaide Freitas
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 499)

Abstract

Clustering and Disjoint Principal Component Analysis (CDPCA) is a recently proposed constrained principal component analysis that simultaneously clusters objects and partitions variables; we have implemented it in the R language. In this paper, we examine in detail the alternating least-squares algorithm for CDPCA and highlight the algebraic features that allow it to construct both interpretable principal components and clusters of objects. Two applications illustrate the capabilities of this new methodology.

Keywords

Principal Component Analysis, Clustering, K-means

Notes

Acknowledgments

The authors would like to thank the anonymous referee for the valuable and constructive comments, which have helped to improve this paper. Special thanks go to Professor Maurizio Vichi for providing us with a Matlab version of the ALS algorithm for performing CDPCA. This work was partially supported by Portuguese funds through the CIDMA - Center for Research and Development in Mathematics and Applications, and the Portuguese Foundation for Science and Technology (FCT – Fundação para a Ciência e a Tecnologia), within project UID/MAT/04106/2013.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. University of Aveiro, Aveiro, Portugal
