Abstract
This paper presents a study of three clustering methods, evaluated with four cluster validity metrics. We describe and analyse the clustering methods and give the mathematical formulation of the four validity measures. Based on experimental results on real-world data sets, we indicate which validation measure and which clustering method perform best, and we recommend a preferable clustering technique.
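To make the idea of scoring a clustering with a validity measure concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not name the three methods or the four measures, so plain Lloyd-style k-means and the Davies-Bouldin index are used here purely as illustrative assumptions. The function names, the toy data, and the fixed initial centroids are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points]

def kmeans(points, init_centroids, max_iter=100):
    """Lloyd-style k-means from caller-supplied initial centroids (assumption:
    initialisation is left to the caller to keep the sketch deterministic)."""
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, l in zip(points, labels) if l == j]
            # An empty cluster keeps its previous centroid.
            updated.append(mean(members) if members else old)
        if updated == centroids:  # no centroid moved: converged
            break
        centroids = updated
    return assign(points, centroids), centroids

def davies_bouldin(points, labels, centroids):
    """Davies-Bouldin index; lower values mean compact, well-separated clusters.
    Assumes at least two clusters, none empty, with distinct centroids."""
    k = len(centroids)
    scatter = []
    for j in range(k):
        members = [p for p, l in zip(points, labels) if l == j]
        scatter.append(sum(dist(p, centroids[j]) for p in members) / len(members))
    worst = [max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k

# Toy data (hypothetical): two well-separated unit squares.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels, centroids = kmeans(points, [(0, 0), (10, 10)])
score = davies_bouldin(points, labels, centroids)
```

In a comparison of the kind the abstract describes, a score like this would be computed for each clustering method and each candidate number of clusters, and the configurations ranked by it. Other validity measures can be substituted in the same place.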
Acknowledgment
This research is funded by Jadavpur University (UGC-UPE, Phase-II, grant no. P-1/RS/115/13).
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Baruri, R. et al. (2019). A Comparative Study on k-means Clustering Method and Analysis. In: Somani, A., Ramakrishna, S., Chaudhary, A., Choudhary, C., Agarwal, B. (eds) Emerging Technologies in Computer Engineering: Microservices in Big Data Analytics. ICETCE 2019. Communications in Computer and Information Science, vol 985. Springer, Singapore. https://doi.org/10.1007/978-981-13-8300-7_10
DOI: https://doi.org/10.1007/978-981-13-8300-7_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-8299-4
Online ISBN: 978-981-13-8300-7
eBook Packages: Computer Science, Computer Science (R0)