A Method to Estimate the Number of Clusters Using Gravity

Du, Hui; Wang, Xiaoniu; Huang, Mengyin; Wang, Xiaoli

doi:10.1007/978-3-030-03766-6_47

Conference paper
First Online: 25 December 2018

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 891)

Abstract

The number of clusters is crucial to the correctness of a clustering. However, most available clustering algorithms have two main issues: (1) they require the user to specify the number of clusters; (2) they easily fall into a local optimum because the initial centers are selected at random. To solve these problems, we propose a novel gravity-based algorithm that automatically determines the number of clusters and also yields better initial centers. In the proposed algorithm, we first scatter a set of detectors uniformly over the data space. The detectors move according to the law of universal gravitation, and two detectors are merged when the distance between them is less than a given threshold. When no detector moves any longer, we take the number of remaining detectors as the number of clusters and use the detectors themselves as the initial center points. Experimental results show that the proposed method automatically determines the number of clusters and generates better initial centers, which noticeably improves clustering accuracy.
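
The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the update rule, so the sketch assumes unit masses, scatters detectors on a regular grid over the bounding box of the data, and moves each detector to the inverse-square-weighted mean of the data points as a stand-in for one gravitational step. The function name `estimate_clusters`, its parameters (`grid`, `merge_dist`, `soften`, `tol`, `max_iter`), and the midpoint merge rule are all hypothetical.

```python
import itertools
import math


def estimate_clusters(points, grid=5, merge_dist=1.0, soften=1e-2,
                      tol=1e-6, max_iter=200):
    """Return the surviving detector positions (illustrative sketch only).

    The number of returned detectors is the cluster-count estimate, and
    the positions can serve as initial centers. All parameters are
    assumptions of this sketch, not values taken from the paper.
    """
    dim = len(points[0])
    lo = [min(p[d] for p in points) for d in range(dim)]
    hi = [max(p[d] for p in points) for d in range(dim)]

    # Scatter detectors on a uniform grid over the data's bounding box.
    axes = [[lo[d] + (hi[d] - lo[d]) * i / (grid - 1) for i in range(grid)]
            for d in range(dim)]
    detectors = [list(pos) for pos in itertools.product(*axes)]

    for _ in range(max_iter):
        moved = 0.0
        for det in detectors:
            # Assumed rule: each unit-mass data point attracts the detector
            # with weight 1 / (distance^2 + soften); the detector jumps to
            # the weighted mean of the data points.
            weights = [
                1.0 / (sum((a - b) ** 2 for a, b in zip(det, p)) + soften)
                for p in points
            ]
            total = sum(weights)
            new = [sum(w * p[d] for w, p in zip(weights, points)) / total
                   for d in range(dim)]
            moved = max(moved, math.dist(det, new))
            det[:] = new

        # Merge any detector that lies within merge_dist of a kept one
        # by replacing the kept detector with the midpoint of the pair.
        merged = []
        for det in detectors:
            for kept in merged:
                if math.dist(det, kept) < merge_dist:
                    kept[:] = [(a + b) / 2 for a, b in zip(kept, det)]
                    break
            else:
                merged.append(det)
        n_before = len(detectors)
        detectors = merged

        # Stop once no detector moves and no merge happened.
        if moved < tol and len(detectors) == n_before:
            break
    return detectors


# Toy 1-D data: two tight groups far apart.
data = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,), (10.2,), (10.3,)]
centers = estimate_clusters(data)
print(len(centers), sorted(centers))
```

In this sketch `merge_dist` must exceed the diameter of a cluster but stay below the spacing between clusters, and `soften` only prevents division by zero when a detector sits exactly on a data point. How the paper chooses the threshold and the detector layout is not stated in the abstract.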

Acknowledgment

This work is supported by the National Natural Science Foundation of China (Nos. 61472297, 61402350, and 61662068).

Author information

Authors and Affiliations

College of Computer Science and Engineering, Northwest Normal University, Lanzhou, 730070, China
Hui Du, Xiaoniu Wang, Mengyin Huang & Xiaoli Wang

Corresponding author

Correspondence to Hui Du.

Editor information

Editors and Affiliations

Department of Computer Science, VSB-Technical University of Ostrava, Ostrava, Czech Republic
Pavel Krömer
School of Automation, Xi’an University of Posts and Telecommunications, Xi’an, China
Hong Zhang
College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, China
Yongquan Liang
College of Information Science and Engineering, Fujian University of Technology, Fuzhou, Fujian, China
Jeng-Shyang Pan

Cite this paper

Du, H., Wang, X., Huang, M., Wang, X. (2019). A Method to Estimate the Number of Clusters Using Gravity. In: Krömer, P., Zhang, H., Liang, Y., Pan, JS. (eds) Proceedings of the Fifth Euro-China Conference on Intelligent Data Analysis and Applications. ECC 2018. Advances in Intelligent Systems and Computing, vol 891. Springer, Cham. https://doi.org/10.1007/978-3-030-03766-6_47

DOI: https://doi.org/10.1007/978-3-030-03766-6_47
Published: 25 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03765-9
Online ISBN: 978-3-030-03766-6
eBook Packages: Intelligent Technologies and Robotics (R0)
