PCA-Guided k-Means with Variable Weighting and Its Application to Document Clustering
PCA-guided k-Means is a deterministic approach to k-Means clustering in which the cluster indicators are derived in a PCA-guided manner. This paper proposes a new approach to k-Means with variable selection by introducing a variable weighting mechanism into PCA-guided k-Means. The relative responsibility of each variable is estimated in a manner similar to fuzzy c-means (FCM) clustering, while the membership indicator is derived in a PCA-guided manner, with the principal component scores calculated using the responsibility weights of the variables. As a result, the variables that carry meaningful information about the cluster structure are emphasized in the calculation of the membership indicators. Numerical experiments, including an application to document clustering, demonstrate the characteristics of the proposed method.
Keywords: Variable Selection, Variable Weighting, Principal Component Score, Document Cluster, Membership Indicator
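To make the idea in the abstract concrete, the following is a minimal sketch of the two-cluster case only, in which samples are split by the sign of their first principal component score. It is not the authors' algorithm: the function name `pca_guided_two_means`, the `weights` argument, and the choice to apply weights by scaling each centered variable by the square root of its weight are all assumptions made for illustration, and the FCM-like estimation of the weights is not reproduced here (the weights are simply supplied by hand).

```python
import numpy as np


def pca_guided_two_means(X, weights=None):
    """Split the rows of X into two clusters by the sign of the first PC score.

    `weights` is a hypothetical per-variable weight vector (non-negative).
    How the paper estimates these weights is NOT implemented here.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_vars = X.shape
    if weights is None:
        weights = np.ones(n_vars)
    weights = np.asarray(weights, dtype=float)

    # Center each variable so that PCA describes variation around the mean.
    Xc = X - X.mean(axis=0)

    # Assumption: weight w_j multiplies the squared deviation of variable j,
    # which is equivalent to scaling the centered column by sqrt(w_j).
    Xw = Xc * np.sqrt(weights)

    # The first right singular vector is the first principal axis.
    _, _, Vt = np.linalg.svd(Xw, full_matrices=False)
    scores = Xw @ Vt[0]

    # Deterministic assignment: the sign of the score decides the cluster.
    labels = (scores > 0).astype(int)
    return labels, scores


# Toy data: variable 0 separates rows 0-2 from rows 3-5,
# variable 1 is high-variance noise unrelated to that grouping.
X = np.array([
    [0.0, 10.0],
    [0.2, -10.0],
    [0.1, 9.0],
    [5.0, -9.0],
    [5.1, 10.0],
    [4.9, -10.0],
])

labels_unweighted, _ = pca_guided_two_means(X)
labels_weighted, _ = pca_guided_two_means(X, weights=[1.0, 0.0])
```

In this toy data the unweighted split follows the noisy second variable, whereas putting all the weight on the first variable yields the grouping of rows 0-2 versus rows 3-5. Which of the two groups receives label 0 or 1 depends on the arbitrary sign of the singular vector, so only the grouping is meaningful.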