A Comparative Study on k-means Clustering Method and Analysis

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 985)

Abstract

We present a study of three clustering methods evaluated with four cluster validity metrics. We discuss and analyze the clustering methods and give the mathematical formulation of the four cluster validity measures. The experimental outcomes indicate which validation method and which clustering method are optimal. The preferred clustering technique is chosen on the basis of results obtained on real-world data sets.
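
The evaluation loop described in the abstract (cluster the data, then score the partition with a validity measure) can be illustrated with a minimal sketch. The abstract does not name the three methods or the four metrics, so everything here is an assumption chosen for illustration: a Lloyd-style k-means with random data-point initialization, the Davies-Bouldin index as the validity measure, and a small synthetic data set. The helper names (`dist`, `mean`, `kmeans`, `davies_bouldin`) are hypothetical and are not taken from the paper.

```python
import math
import random


def dist(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def kmeans(points, k, iters=100, seed=0):
    """Lloyd-style k-means (assumed variant); returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: start from k random data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            updated.append(mean(members) if members else centroids[j])
        if updated == centroids:  # no centroid moved, so the partition is stable
            break
        centroids = updated
    return centroids, labels


def davies_bouldin(points, labels, centroids):
    """Davies-Bouldin index; lower values mean compact, well-separated clusters."""
    k = len(centroids)
    scatter = []
    for j in range(k):
        members = [p for p, l in zip(points, labels) if l == j]
        # Average distance of members to their centroid (0.0 for an empty cluster).
        scatter.append(
            sum(dist(p, centroids[j]) for p in members) / len(members) if members else 0.0
        )
    # For each cluster, take the worst (largest) similarity ratio to any other cluster.
    worst = [
        max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k


if __name__ == "__main__":
    # Synthetic data (not from the paper): two well-separated square blobs.
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
    for k in (2, 3):
        cents, labs = kmeans(data, k)
        print(k, davies_bouldin(data, labs, cents))
```

In a comparison of this kind, the same validity function would be applied to the partitions produced by each clustering method, and the method (or number of clusters) with the best score on a given index would be preferred; different indices need not agree.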

Acknowledgment

This research is funded by Jadavpur University (UGC-UPE, Phase-II, grant no. P-1/RS/115/13).

Author information

Corresponding authors

Correspondence to Rajdeep Baruri or Ranjan Banerjee.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Baruri, R. et al. (2019). A Comparative Study on k-means Clustering Method and Analysis. In: Somani, A., Ramakrishna, S., Chaudhary, A., Choudhary, C., Agarwal, B. (eds) Emerging Technologies in Computer Engineering: Microservices in Big Data Analytics. ICETCE 2019. Communications in Computer and Information Science, vol 985. Springer, Singapore. https://doi.org/10.1007/978-981-13-8300-7_10

  • DOI: https://doi.org/10.1007/978-981-13-8300-7_10

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-8299-4

  • Online ISBN: 978-981-13-8300-7

  • eBook Packages: Computer Science (R0)
