A Discretization Algorithm for k-Means with Capacity Constraints
We consider capacitated k-means clustering, whose objective is to minimize the within-cluster sum of squared Euclidean distances. The task is to partition a set of n observations into k disjoint clusters that satisfy the capacity constraints; both upper and lower bounds on cluster size are considered. One reason these clustering problems are hard is that the centroids range over a continuous space. In this paper we propose a discretization algorithm that, in polynomial time, outputs an approximate centroid set incurring a loss of at most an \(\epsilon \) fraction of the original objective value. This result implies an FPT(k,d) PTAS for uniform capacitated k-means and makes further techniques, such as local search, applicable to the problem.
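To make the objective concrete, the following is a minimal sketch, not the paper's algorithm, of how the cost of one fixed capacitated assignment could be evaluated. The function names, the `lower`/`upper` parameters (a uniform capacity for every cluster), and the sample points are assumptions introduced only for illustration.

```python
from typing import List, Sequence


def centroid(members: Sequence[Sequence[float]]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(members[0])
    return [sum(p[i] for p in members) / len(members) for i in range(dim)]


def capacitated_kmeans_cost(points, labels, k, lower, upper):
    """Within-cluster sum of squared Euclidean distances for a given
    assignment. Raises ValueError if any cluster size falls outside
    [lower, upper] (hypothetical uniform capacity bounds)."""
    clusters = [[] for _ in range(k)]
    for p, c in zip(points, labels):
        clusters[c].append(p)

    total = 0.0
    for members in clusters:
        if not (lower <= len(members) <= upper):
            raise ValueError("capacity constraint violated")
        if not members:
            continue  # an empty cluster contributes nothing
        mu = centroid(members)
        total += sum(
            sum((x - m) ** 2 for x, m in zip(p, mu)) for p in members
        )
    return total


# Illustrative data: four points in the plane, two clusters of size two.
points = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0)]
cost = capacitated_kmeans_cost(points, [0, 0, 1, 1], k=2, lower=1, upper=3)
print(cost)
```

The sketch uses the cluster mean as the centroid, which minimizes the squared-distance cost for a fixed partition. The difficulty described above lies in choosing the partition and centroids jointly under the capacity bounds, which this evaluation does not address.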
Keywords: k-means · Capacity constraints · Discretization algorithm · FPT · PTAS
The first author is supported by the China Postdoctoral Science Foundation funded project (No. 2018M643233). The second author is supported by the Natural Science Foundation of China (Nos. 11531014, 11871081). The third author is supported by the Higher Educational Science and Technology Program of Shandong Province (No. J15LN23). The fourth author is supported by the Natural Science Foundation of China (Nos. 61433012, U1435215).