
K-Means Clustering Seeds Initialization Based on Centrality, Sparsity, and Isotropy

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 5788)

Abstract

K-Means is the most commonly used clustering algorithm. Despite its numerous advantages, it has a crucial drawback: the final cluster structure depends entirely on the choice of initial seeds. In this paper, a new seed initialization algorithm based on centrality, sparsity, and isotropy is proposed. Preliminary experiments show that the proposed algorithm not only produces better cluster structures but also accelerates convergence.
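
Only the abstract is available here, so the full seeding procedure is not reproduced. As a rough, illustrative sketch only, the Python fragment below scores candidate seeds by a density-based centrality term and a distance-to-already-chosen-seeds sparsity term, then runs standard Lloyd iterations from those seeds. It is not the authors' algorithm: the exact scoring and the isotropy criterion in particular are not modeled, and the names select_seeds, kmeans, and n_neighbors are invented for this example.

import numpy as np

def select_seeds(X, k, n_neighbors=10):
    # Heuristic seed selection (illustrative only, not the paper's method):
    # - centrality: prefer points in dense regions (small mean distance to
    #   their nearest neighbors),
    # - sparsity: prefer points far from the seeds already chosen.
    # The isotropy criterion from the paper is not modeled here.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)   # pairwise distances
    knn = np.sort(d, axis=1)[:, 1:n_neighbors + 1]               # skip self-distance
    centrality = 1.0 / (knn.mean(axis=1) + 1e-12)
    seeds = [int(np.argmax(centrality))]                         # start at the densest point
    while len(seeds) < k:
        sparsity = d[:, seeds].min(axis=1)                       # distance to nearest chosen seed
        score = centrality * sparsity                            # dense AND far from current seeds
        score[seeds] = -np.inf                                   # never re-pick a seed
        seeds.append(int(np.argmax(score)))
    return X[np.array(seeds)]

def kmeans(X, k, n_iter=100):
    # Standard Lloyd iterations started from the heuristic seeds above.
    centers = select_seeds(X, k)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        labels = np.argmin(np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1), axis=1)
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                                for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# Usage example on synthetic data: three well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 0.3, size=(100, 2)) for m in ((0, 0), (5, 5), (0, 5))])
centers, labels = kmeans(X, k=3)

Because the seeds are chosen deterministically from dense, mutually distant points, restarting the sketch on the same data gives the same partition, which is the kind of stability the abstract attributes to careful seed initialization.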

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kang, P., Cho, S. (2009). K-Means Clustering Seeds Initialization Based on Centrality, Sparsity, and Isotropy. In: Corchado, E., Yin, H. (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2009. IDEAL 2009. Lecture Notes in Computer Science, vol 5788. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04394-9_14

  • DOI: https://doi.org/10.1007/978-3-642-04394-9_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04393-2

  • Online ISBN: 978-3-642-04394-9

  • eBook Packages: Computer Science (R0)
