Abstract
PCA-guided k-Means is a deterministic approach to k-Means clustering in which cluster indicators are derived in a PCA-guided manner. This paper proposes a new approach to k-Means with variable selection by introducing a variable weighting mechanism into PCA-guided k-Means. The relative responsibility of each variable is estimated in a manner similar to FCM clustering, while the membership indicator is derived in a PCA-guided manner in which the principal component scores are calculated using the responsibility weights of the variables. Consequently, variables that carry meaningful information for capturing cluster structures are emphasized in the calculation of membership indicators. Numerical experiments, including an application to document clustering, demonstrate the characteristics of the proposed method.
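The role of the variable weights can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes only two clusters, takes the weights as given instead of estimating them, and scales each centred variable by the square root of its weight, which is one plausible choice and not something the abstract specifies. It relies on the known result of Ding and He (ICML 2004) that, for two clusters, the relaxed cluster indicator is the first principal component score, so its sign gives the partition. The function name `weighted_pca_two_means` and the toy data are hypothetical.

```python
import numpy as np

def weighted_pca_two_means(X, weights):
    """Two-cluster split from the sign of the first weighted PC score.

    Illustrative assumption: weights are supplied by the caller and
    applied as sqrt(weight) column scaling after centring.
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    Xc = X - X.mean(axis=0)            # centre each variable
    Xw = Xc * np.sqrt(w)               # emphasise heavily weighted variables
    # Rows of vt are principal axes; vt[0] is the first one.
    _, _, vt = np.linalg.svd(Xw, full_matrices=False)
    scores = Xw @ vt[0]                # first principal component scores
    labels = (scores > 0).astype(int)  # sign of the score = cluster indicator
    return labels, scores

# Toy data: variable 0 separates rows 0-2 from rows 3-5,
# variable 1 is high-variance noise unrelated to that split.
X_demo = np.array([
    [0.0, 10.0], [0.2, -10.0], [0.1, 9.0],
    [5.0, -9.0], [5.1, 10.0], [4.9, -10.0],
])

labels_weighted, _ = weighted_pca_two_means(X_demo, [1.0, 0.0])
labels_uniform, _ = weighted_pca_two_means(X_demo, [0.5, 0.5])
print("weighted:", labels_weighted)
print("uniform: ", labels_uniform)
```

With all weight on variable 0 the split follows the informative variable, whereas equal weights let the noisy variable dominate the first principal component. Which cluster is labelled 0 or 1 is arbitrary, because the sign of a singular vector is not fixed.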
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Honda, K., Notsu, A., Ichihashi, H. (2009). PCA-Guided k-Means with Variable Weighting and Its Application to Document Clustering. In: Torra, V., Narukawa, Y., Inuiguchi, M. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2009. Lecture Notes in Computer Science, vol 5861. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04820-3_26
DOI: https://doi.org/10.1007/978-3-642-04820-3_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04819-7
Online ISBN: 978-3-642-04820-3
eBook Packages: Computer Science, Computer Science (R0)