An Improved Approximation Algorithm for the k-Means Problem with Penalties

  • Qilong Feng
  • Zhen Zhang
  • Feng Shi
  • Jianxin Wang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11458)

Abstract

The clustering problem has received considerable attention in various fields of computer science. In many applications, however, noisy data poses a major challenge for clustering. As one way to handle clustering with noisy data, clustering with penalties has been studied extensively, for example the k-median problem with penalties and the facility location problem with penalties. As far as we know, there is only one approximation algorithm for the k-means problem with penalties, with ratio \(25+\epsilon \). All previous results for clustering problems with penalties were based on local search, LP-rounding, or primal-dual techniques, which cannot be applied directly to the k-means problem with penalties to obtain an approximation ratio better than \(25+\epsilon \). In this paper, we apply the primal-dual technique to the k-means problem with penalties using a different rounding method: a deterministic rounding algorithm in place of the randomized rounding algorithm used in previous approximation schemes. Based on this method, we present an approximation algorithm with ratio \(19.849+\epsilon \) for the k-means problem with penalties.
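
The abstract does not spell out the objective, so the following is only a minimal sketch of the usual formulation of k-means with penalties, assumed here for illustration: each point either pays its squared distance to the nearest chosen center or pays its own penalty and is left unclustered. The function names, the sample points, and the penalty values are hypothetical and do not come from the paper, and the code only evaluates the cost of a given set of centers; it is not the primal-dual algorithm.

```python
from typing import Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def penalized_kmeans_cost(
    points: Sequence[Point],
    penalties: Sequence[float],
    centers: Sequence[Point],
) -> float:
    """Cost of a fixed set of centers under the assumed objective.

    For fixed centers, the cheapest choice for each point is to pay
    min(squared distance to nearest center, its penalty).
    """
    total = 0.0
    for point, penalty in zip(points, penalties):
        nearest = min(squared_distance(point, c) for c in centers)
        total += min(nearest, penalty)
    return total


# Hypothetical data: two nearby points and one far-away noisy point.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
penalties = [5.0, 5.0, 5.0]
centers = [(0.5, 0.0)]

cost = penalized_kmeans_cost(points, penalties, centers)
print(cost)  # 0.25 + 0.25 + 5.0 = 5.5
```

In this toy instance the noisy point is cheaper to penalize (cost 5) than to serve (squared distance 190.25), which is the effect the penalty term is meant to capture.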

Keywords

Approximation algorithm, k-means clustering, Primal-dual

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Qilong Feng (1)
  • Zhen Zhang (1)
  • Feng Shi (1)
  • Jianxin Wang (1, email author)
  1. School of Computer Science and Engineering, Central South University, Changsha, People’s Republic of China
