Abstract
The k-means clustering (KM) algorithm, also called the hard c-means clustering (HCM) algorithm, is a very powerful clustering algorithm [1, 2], but it depends strongly on its initial values. To reduce this dependence, Arthur and Vassilvitskii proposed the k-means++ clustering (KM++) algorithm in 2007 [3]. In many applications, such as text clustering, each object lies on a unit sphere. Dhillon and Modha proposed the original spherical k-means clustering algorithm to classify such objects in 2001 [4], and Hornik, Feinerer, Kober, and Buchta proposed a new spherical k-means clustering (SKM) algorithm in 2012 [5]. However, both algorithms share KM's dependence on initial values. This paper therefore addresses two points: (1) the dissimilarity of SKM is extended so that it satisfies the triangle inequality, and (2) a spherical k-means++ clustering (SKM++) algorithm that addresses this dependence is proposed. The paper shows that the effectiveness of SKM++ is theoretically guaranteed.
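To make the seeding idea concrete, the following is a minimal sketch of k-means++-style initialization for points on a unit sphere. It assumes the plain cosine dissimilarity 1 - <x, y>, which is not the extended dissimilarity this paper constructs, so it should be read as an illustration of the general idea rather than as the SKM++ algorithm itself. The function names `normalize`, `cosine_dissimilarity`, and `seed_centers` are hypothetical. For unit vectors, ||x - y||^2 = 2(1 - <x, y>), so sampling in proportion to this dissimilarity coincides with the squared-distance weighting of KM++.

```python
import math
import random


def normalize(v):
    """Scale v to unit Euclidean length so that it lies on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]


def cosine_dissimilarity(x, y):
    """Return 1 - <x, y> for unit vectors.

    Assumption: this is the plain cosine dissimilarity, not the extended
    dissimilarity proposed in the paper.
    """
    return 1.0 - sum(a * b for a, b in zip(x, y))


def seed_centers(points, k, rng):
    """Pick k initial centers from points by k-means++-style weighted sampling."""
    # The first center is drawn uniformly at random.
    centers = [rng.choice(points)]
    # Dissimilarity from each point to its nearest center chosen so far.
    nearest = [cosine_dissimilarity(p, centers[0]) for p in points]
    while len(centers) < k:
        # Clamp tiny negative values caused by floating-point rounding.
        weights = [max(d, 0.0) for d in nearest]
        if sum(weights) == 0.0:
            # Every point coincides with a chosen center: fall back to uniform.
            idx = rng.randrange(len(points))
        else:
            # Points far from all current centers are more likely to be drawn.
            idx = rng.choices(range(len(points)), weights=weights, k=1)[0]
        centers.append(points[idx])
        nearest = [
            min(d, cosine_dissimilarity(p, points[idx]))
            for p, d in zip(points, nearest)
        ]
    return centers
```

A call such as `seed_centers([normalize(p) for p in data], 3, random.Random(0))` returns three of the normalized input points; an ordinary spherical k-means iteration would then start from these centers.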
References
Steinhaus, H.: Sur la division des corps matériels en parties. Bulletin de l’Académie Polonaise des Sci. 4(12), 801–804 (1957)
MacQueen, J.B.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Statistics, vol. 1, pp. 281–297. University of California Press (1967)
Arthur, D., Vassilvitskii, S.: \(k\)-means++: the advantages of careful seeding. In: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035. Society for Industrial and Applied Mathematics, Philadelphia (2007)
Dhillon, I.S., Modha, D.S.: Concept decompositions for large sparse text data using clustering. Mach. Learn. 42, 143–175 (2001)
Hornik, K., Feinerer, I., Kober, M., Buchta, C.: Spherical \(k\)-means clustering. J. Stat. Softw. 50(10) (2012)
Dasgupta, S.: Lecture 3 – Algorithms for \(k\)-means clustering (2013). http://cseweb.ucsd.edu/dasgupta/291-geom/kmeans.pdf
Acknowledgment
This work has partly been supported by JSPS KAKENHI Grant Numbers 26330270 and 26330271.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Endo, Y., Miyamoto, S. (2015). Spherical k-Means++ Clustering. In: Torra, V., Narukawa, T. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2015. Lecture Notes in Computer Science, vol 9321. Springer, Cham. https://doi.org/10.1007/978-3-319-23240-9_9
DOI: https://doi.org/10.1007/978-3-319-23240-9_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23239-3
Online ISBN: 978-3-319-23240-9
eBook Packages: Computer Science, Computer Science (R0)