Abstract
The number of clusters is crucial to the correctness of clustering. However, most available clustering algorithms have two main issues: (1) they require the user to specify the number of clusters; (2) they easily fall into local optima because the initial centers are selected at random. To solve these problems, we propose a novel gravity-based algorithm that automatically determines the number of clusters and also yields better initial centers. In the proposed algorithm, we first scatter detectors uniformly over the data space and move them according to the law of universal gravitation; two detectors are merged when the distance between them is less than a given threshold. When the detectors no longer move, we take the number of remaining detectors as the number of clusters and use the detectors themselves as the initial center points. Experimental results show that the proposed method can automatically determine the number of clusters and generate better initial centers, thereby noticeably improving clustering accuracy.
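The procedure outlined in the abstract (scatter, move, merge, count) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the softened inverse-square attraction, the normalized step, the grid placement, and every name and parameter below (`estimate_clusters`, `merge_close`, `grid`, `soften`, `merge_dist`, `tol`, `max_iter`) are assumptions chosen for illustration.

```python
import math
from itertools import product


def merge_close(detectors, merge_dist):
    """Greedily merge detectors that lie closer than merge_dist (assumed rule)."""
    merged = []  # entries are [position, number_of_detectors_absorbed]
    for det in detectors:
        for entry in merged:
            if math.dist(entry[0], det) < merge_dist:
                n = entry[1]
                # Running mean of the positions absorbed into this entry.
                entry[0] = [(c * n + x) / (n + 1) for c, x in zip(entry[0], det)]
                entry[1] = n + 1
                break
        else:
            merged.append([list(det), 1])
    return [pos for pos, _ in merged]


def estimate_clusters(points, grid=4, soften=1.0, merge_dist=0.5,
                      tol=1e-6, max_iter=200):
    """Return the surviving detectors; their count is the estimated k."""
    dim = len(points[0])
    lo = [min(p[d] for p in points) for d in range(dim)]
    hi = [max(p[d] for p in points) for d in range(dim)]
    # Uniform scatter (assumption): one detector at the center of each cell
    # of a regular grid laid over the bounding box of the data.
    detectors = [
        [lo[d] + (hi[d] - lo[d]) * (idx[d] + 0.5) / grid for d in range(dim)]
        for idx in product(range(grid), repeat=dim)
    ]
    for _ in range(max_iter):
        moved = 0.0
        updated = []
        for det in detectors:
            total_w = 0.0
            acc = [0.0] * dim
            for p in points:
                r = math.dist(det, p)
                # Softened inverse-square pull of a unit-mass point (assumption):
                # the force vector is (p - det) / (r**3 + soften**3).
                w = 1.0 / (r ** 3 + soften ** 3)
                total_w += w
                for d in range(dim):
                    acc[d] += w * p[d]
            # Step along the net force, scaled by 1 / total_w, which places the
            # detector at the attraction-weighted mean of the data points.
            new = [a / total_w for a in acc]
            moved = max(moved, math.dist(det, new))
            updated.append(new)
        detectors = merge_close(updated, merge_dist)
        if moved < tol:  # no detector moved noticeably in this round
            break
    return detectors


# Toy data (hypothetical): two well-separated groups of unequal size.
group_a = [(0.0, 0.0), (0.3, 0.1), (-0.2, 0.2), (0.1, -0.3), (-0.1, -0.1), (0.2, 0.3)]
group_b = [(10.0, 10.0), (10.2, 9.9), (9.8, 10.1), (10.1, 10.3)]
centers = estimate_clusters(group_a + group_b)
print("estimated number of clusters:", len(centers))
print("initial centers:", centers)
```

The returned detectors could then be passed to a K-means implementation as its initial centers. In this sketch the softening length and the merge threshold both need to be on the order of the cluster scale; if they are much smaller, detectors may settle on individual points instead of on groups.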
Acknowledgment
This work is supported by the National Natural Science Foundation of China (Nos. 61472297, 61402350, and 61662068).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Du, H., Wang, X., Huang, M., Wang, X. (2019). A Method to Estimate the Number of Clusters Using Gravity. In: Krömer, P., Zhang, H., Liang, Y., Pan, JS. (eds) Proceedings of the Fifth Euro-China Conference on Intelligent Data Analysis and Applications. ECC 2018. Advances in Intelligent Systems and Computing, vol 891. Springer, Cham. https://doi.org/10.1007/978-3-030-03766-6_47
DOI: https://doi.org/10.1007/978-3-030-03766-6_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03765-9
Online ISBN: 978-3-030-03766-6
eBook Packages: Intelligent Technologies and Robotics (R0)