PCA-Guided k-Means with Variable Weighting and Its Application to Document Clustering

  • Conference paper
Modeling Decisions for Artificial Intelligence (MDAI 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5861)

Abstract

PCA-guided k-Means is a deterministic approach to k-Means clustering in which cluster indicators are derived in a PCA-guided manner. This paper proposes a new approach to k-Means with variable selection by introducing a variable weighting mechanism into PCA-guided k-Means. The relative responsibility of each variable is estimated in a manner similar to FCM clustering, while the membership indicator is derived in a PCA-guided manner, in which the principal component scores are calculated by taking the responsibility weights of the variables into account. As a result, variables that carry meaningful information for capturing cluster structures are emphasized in the calculation of membership indicators. Numerical experiments, including an application to document clustering, demonstrate the characteristics of the proposed method.
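The interplay described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not available on this page. It assumes the two-cluster case, where the sign of the first principal component score of the weighted, centered data is used as the cluster indicator, and it assumes an FCM-style normalization for the variable weights. The function names `pca_guided_two_means` and `update_weights` and the exponent parameter `theta` are hypothetical.

```python
import numpy as np


def pca_guided_two_means(X, weights=None):
    """Split the rows of X into two clusters by the sign of the first
    principal component score of the variable-weighted, centered data.
    (Assumed two-cluster simplification, not the paper's full method.)"""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # center each variable
    if weights is None:
        weights = np.full(X.shape[1], 1.0 / X.shape[1])
    Xw = Xc * weights                        # scale columns by their weights
    # First right singular vector = first principal axis of the weighted data.
    _, _, vt = np.linalg.svd(Xw, full_matrices=False)
    scores = Xw @ vt[0]                      # first principal component scores
    return (scores > 0).astype(int)


def update_weights(X, labels, theta=2.0, eps=1e-12):
    """FCM-style variable weights (an assumption): variables with small
    within-cluster dispersion D_j receive larger weights, normalized to 1.
    w_j = D_j^(-1/(theta-1)) / sum_l D_l^(-1/(theta-1)), with theta > 1."""
    X = np.asarray(X, dtype=float)
    D = np.zeros(X.shape[1])
    for c in np.unique(labels):
        members = X[labels == c]
        D += ((members - members.mean(axis=0)) ** 2).sum(axis=0)
    D = np.maximum(D, eps)                   # guard against division by zero
    inv = D ** (-1.0 / (theta - 1.0))
    return inv / inv.sum()
```

In use, one would alternate the two steps: compute labels with the current weights, update the weights from those labels, and repeat until the labels stop changing. Because this weight update compares raw dispersions, the variables are assumed to be on comparable scales.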

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Honda, K., Notsu, A., Ichihashi, H. (2009). PCA-Guided k-Means with Variable Weighting and Its Application to Document Clustering. In: Torra, V., Narukawa, Y., Inuiguchi, M. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2009. Lecture Notes in Computer Science, vol 5861. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04820-3_26

  • DOI: https://doi.org/10.1007/978-3-642-04820-3_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04819-7

  • Online ISBN: 978-3-642-04820-3

  • eBook Packages: Computer Science, Computer Science (R0)