
Multimedia Tools and Applications, Volume 78, Issue 23, pp 33415–33434

Weighted adjacent matrix for K-means clustering

  • Jukai Zhou
  • Tong Liu
  • Jingting Zhu
Article

Abstract

K-means clustering is one of the most popular clustering algorithms and is embedded in other clustering algorithms, e.g. as the last step of spectral clustering. In this paper, we propose two techniques that improve the classical k-means clustering algorithm by designing two different adjacent matrices. Extensive experiments on public UCI datasets show that the clustering results of our proposed algorithms significantly outperform those of three classical clustering algorithms in terms of several evaluation metrics.
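The abstract does not specify how the two adjacent matrices are constructed. As a minimal, hypothetical sketch of the general idea — running k-means on the rows of a similarity (adjacent) matrix instead of on the raw feature vectors — the following assumes a Gaussian-RBF similarity and a plain Lloyd's k-means with deterministic farthest-point initialisation; neither choice is taken from the paper.

```python
import numpy as np

def rbf_adjacency(X, sigma=1.0):
    """Adjacent (similarity) matrix W[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def kmeans(X, k, iters=100):
    """Lloyd's k-means with farthest-point initialisation; returns cluster labels."""
    centers = [X[0]]
    for _ in range(k - 1):  # pick each next center farthest from the current ones
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)  # (n, k) distances
        labels = d.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

# Two well-separated synthetic blobs
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
W = rbf_adjacency(X, sigma=1.0)
labels = kmeans(W, k=2)  # cluster the rows of the adjacent matrix, not X itself
```

Clustering the rows of W means each point is represented by its similarity profile to all other points, so points in the same dense region get nearly identical representations; this is the same intuition that motivates spectral clustering's use of an affinity matrix.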

Keywords

k-means clustering · Similarity measurement · Adjacent matrix · Unsupervised learning


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Natural and Computational Sciences, Massey University, Auckland, New Zealand
