PCA-Guided k-Means with Variable Weighting and Its Application to Document Clustering

Honda, Katsuhiro; Notsu, Akira; Ichihashi, Hidetomo

doi:10.1007/978-3-642-04820-3_26

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5861)

Included in the following conference series:

International Conference on Modeling Decisions for Artificial Intelligence

Abstract

PCA-guided k-Means is a deterministic approach to k-Means clustering in which cluster indicators are derived in a PCA-guided manner. This paper proposes a new approach to k-Means with variable selection by introducing a variable weighting mechanism into PCA-guided k-Means. The relative responsibility of each variable is estimated in a manner similar to FCM clustering, while the membership indicator is derived in a PCA-guided manner, with the principal component scores calculated using the responsibility weights of the variables. As a result, variables that carry meaningful information about the cluster structure are emphasized when the membership indicators are calculated. Numerical experiments, including an application to document clustering, demonstrate the characteristics of the proposed method.
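The alternation described in the abstract can be illustrated with a rough two-cluster sketch. It relies on the known result that, for two clusters, the sign of the first principal component score acts as a relaxed k-Means indicator. Everything else is an assumption made for illustration only: the function name `weighted_pca_guided_2means`, the exponent `theta`, and the inverse-dispersion weight update are hypothetical stand-ins, not the formulas of the paper, whose full text is not available here.

```python
import numpy as np


def weighted_pca_guided_2means(X, theta=2.0, n_iter=20):
    """Two-cluster sketch: PCA-guided indicator plus variable weights.

    ASSUMPTION: the weight update and the exponent `theta` are illustrative
    choices, not the update rules from the paper.
    """
    if theta <= 1.0:
        raise ValueError("theta must be greater than 1")
    Xc = np.asarray(X, dtype=float)
    Xc = Xc - Xc.mean(axis=0)          # center each variable
    n, m = Xc.shape
    w = np.full(m, 1.0 / m)            # start from uniform variable weights
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # Scale variables so squared distances are weighted by w.
        Z = Xc * np.sqrt(w)
        # First principal component scores of the weighted data.
        U, S, _ = np.linalg.svd(Z, full_matrices=False)
        scores = U[:, 0] * S[0]
        # Deterministic indicator: sign of the first PC score.
        # (The SVD sign is arbitrary, so label 0 and 1 may swap.)
        labels = (scores > 0).astype(int)
        # Within-cluster dispersion of every (unweighted) variable.
        D = np.zeros(m)
        for c in (0, 1):
            members = Xc[labels == c]
            if len(members) > 0:
                D += ((members - members.mean(axis=0)) ** 2).sum(axis=0)
        # ASSUMED update: weights proportional to inverse dispersion,
        # normalized to sum to one.
        inv = (1.0 / np.maximum(D, 1e-12)) ** (1.0 / (theta - 1.0))
        w = inv / inv.sum()
    return labels, w


# Toy data: variable 0 separates two groups, variable 1 is mostly noise.
X = np.array([
    [-5.0, 1.0], [-5.2, -1.0], [-4.8, 2.0],
    [5.0, -2.0], [5.2, 1.0], [4.8, -1.0],
])
labels, w = weighted_pca_guided_2means(X)
print(labels, w)
```

In this toy setting the variable that separates the two groups ends up with the larger weight. Because inverse-dispersion weighting is sensitive to scale, variables would normally be standardized first; that step is omitted to keep the sketch short.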

Author information

Authors and Affiliations

Osaka Prefecture University, Sakai, Osaka, 599-8531, Japan
Katsuhiro Honda, Akira Notsu & Hidetomo Ichihashi

Editor information

Editors and Affiliations

IIIA-CSIC, Campus UAB s/n, 08193, Bellaterra, Catalonia, Spain
Vicenç Torra
Toho Gakuen, 3-1-10 Naka, Kunitachi, 186-0004, Tokyo, Japan
Yasuo Narukawa
Graduate School of Engineering Science, Osaka University, 1-3, Machikaneyama, Toyonaka, 560-8531, Osaka, Japan
Masahiro Inuiguchi

About this paper

Cite this paper

Honda, K., Notsu, A., Ichihashi, H. (2009). PCA-Guided k-Means with Variable Weighting and Its Application to Document Clustering. In: Torra, V., Narukawa, Y., Inuiguchi, M. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2009. Lecture Notes in Computer Science, vol 5861. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04820-3_26

DOI: https://doi.org/10.1007/978-3-642-04820-3_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04819-7
Online ISBN: 978-3-642-04820-3
eBook Packages: Computer Science, Computer Science (R0)