NP-hardness of Some Max-Min Clustering Problems

  • Alexander Kel’manov
  • Vladimir KhandeevEmail author
  • Artem Pyatkin
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 974)


We consider some consimilar problems of searching for disjoint clusters in the finite set of points in Euclidean space. The goal is to maximize the minimum subset size so that the value of each intracluster quadratic variation would not exceed a given constant. We prove that all considered problems are NP-hard even on a line.


Euclidean space Clustering Max-Min problem NP-hardness Quadratic variation 



The study presented was supported by the Russian Science Foundation, project 16-11-10041.


  1. 1.
    MacQueen, J.B.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297. University of California Press, Berkeley (1967)Google Scholar
  2. 2.
    Rao, M.: Cluster analysis and mathematical programming. J. Amer. Stat. Assoc. 66, 622–626 (1971)CrossRefGoogle Scholar
  3. 3.
    Hansen, P., Jaumard, B., Mladenovich, N.: Minimum sum of squares clustering in a low dimensional space. J. Classif. 15, 37–55 (1998)MathSciNetCrossRefGoogle Scholar
  4. 4.
    Hansen, P., Jaumard, B.: Cluster analysis and mathematical programming. Math. Program. 79, 191–215 (1997)MathSciNetzbMATHGoogle Scholar
  5. 5.
    Aloise, D., Deshpande, A., Hansen, P., Popat, P.: NP-hardness of Euclidean sum-of-squares clustering. Mach. Learn. 75(2), 245–248 (2009)CrossRefGoogle Scholar
  6. 6.
    Drineas, P., Frieze, A., Kannan, R., Vempala, S., Vinay, V.: Clustering large graphs via the singular value decomposition. Mach. Learn. 56, 9–33 (2004)CrossRefGoogle Scholar
  7. 7.
    Arora, S., Raghavan, P., Rao, S.: Approximation schemes for Euclidean k-medians and related problems. In: Proceedings of the 30th Annual ACM Symposium on Theory of Computing, pp. 106–113 (1998)Google Scholar
  8. 8.
    Kariv, O., Hakimi, S.: An algorithmic approach to network location problems. Part 1: the \(p\)-centers. SIAM J. Appl. Math. 37, 513–538 (1979)MathSciNetCrossRefGoogle Scholar
  9. 9.
    Feder, T., Greene, D.: Optimal algorithms for approximate clustering. In: Proceedings of 20th Annual ACM Symposium on Theory of Computing, pp. 434–444 (1988)Google Scholar
  10. 10.
    Hochbaum, D.S., Shmoys, D.B.: A best possible heuristic for the k-center problem. Math. Oper. Res. 10(2), 180–184 (1985)MathSciNetCrossRefGoogle Scholar
  11. 11.
    Kaufman, L., Rousseeuw, P.J.: Clustering by means of Medoids, in statistical data analysis based on the \(L_{1}\)-Norm and related methods. Edited by Dodge, Y., pp. 405–416. Dodge, North-Holland (1987)Google Scholar
  12. 12.
    Krarup, J., Pruzan, P.: The simple plant location problem: survey and synthesis. Eur. J. Oper. Res. 12(1), 36–81 (1983)MathSciNetCrossRefGoogle Scholar
  13. 13.
    Mirchandani, P., Francis, R. (eds.): Discrete Location Theory. Wiley-Intersience, New York (1990)zbMATHGoogle Scholar
  14. 14.
    Charikar, M., Khuller, S., Mount, D.M., Narasimhan, G.: Algorithms for facility location problems with outliers. In: Proceedings of 12th ACM-SIAM Symposium on Discrete Algorithms, pp. 642–651 (2001)Google Scholar
  15. 15.
    Agarwal, P.K., Phillips, J.M.: An efficient algorithm for 2D Euclidean 2-center with outliers. In: Proceedings of 16th Annual European Symposium Algorithms, pp. 64–75 (2008)Google Scholar
  16. 16.
    McCutchen, R.M., Khuller, S.: Streaming algorithms for k-center clustering with outliers and with anonymity. In: Goel, A., Jansen, K., Rolim, J.D.P., Rubinfeld, R. (eds.) APPROX/RANDOM -2008. LNCS, vol. 5171, pp. 165–178. Springer, Heidelberg (2008). Scholar
  17. 17.
    Hatami, B., Zarrabi-Zade, H.: A streaming algorithm for 2-center with outliers in high dimensions. Comput. Geom. 60, 26–36 (2017)MathSciNetCrossRefGoogle Scholar
  18. 18.
    Jain, A.K.: Data clustering: 50 years beyond \(k\)-Means. Patt. Recogn. Lett. 31(8), 651–666 (2010)CrossRefGoogle Scholar
  19. 19.
    Shirkhorshidi, A.S., Aghabozorgi, S., Wah, T.Y., Herawan, T.: Big data clustering: a review. In: Murgante, B., et al. (eds.) ICCSA 2014, Part V. LNCS, vol. 8583, pp. 707–720. Springer, Cham (2014). Scholar
  20. 20.
    Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, New York (2006)zbMATHGoogle Scholar
  21. 21.
    James, G., Witten, D., Hastie, T., Tibshirani, R.: An Introduction to Statistical Learning. Springer, New York (2013)CrossRefGoogle Scholar
  22. 22.
    Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning, 2nd edn. Springer, New York (2009)CrossRefGoogle Scholar
  23. 23.
    Aggarwal, C.C.: Data Mining: The Textbook. Springer, Switzerland (2015)zbMATHGoogle Scholar
  24. 24.
    Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning (Adaptive Computation and Machine Learning series). The MIT Press (2017)Google Scholar
  25. 25.
    Fu, T.-C.: A review on time series data mining. Eng. Appl. Artif. Intell. 24(1), 164–181 (2011)CrossRefGoogle Scholar
  26. 26.
    Kuenzer, C., Dech, S., Wagner, W.: Remote Sensing Time Series. Remote Sensing and Digital Image Processing, vol. 22. Springer, Switzerland (2015)Google Scholar
  27. 27.
    Liao, T.W.: Clustering of time series data – a survey. Patt. Recogn. 38(11), 1857–1874 (2005)CrossRefGoogle Scholar
  28. 28.
    de Waal, T., Pannekoek, J., Scholtus, S.: Handbook of Statistical Data Editing and Imputation. Wiley, New Jersey (2011)CrossRefGoogle Scholar
  29. 29.
    Osborne, J.W.: Best Practices in Data Cleaning: A Complete Guide to Everything You Need to Do Before and After Collecting Your Data, 1st edn. SAGE Publication, Inc., Los Angeles (2013)CrossRefGoogle Scholar
  30. 30.
    Farcomeni, A., Greco, L.: Robust Methods for Data Reduction. Chapman and Hall/CRC (2015)Google Scholar
  31. 31.
    Ageev, A.A., Kel’manov, A.V., Pyatkin, A.V., Khamidullin, S.A., Shenmaier, V.V.: Approximation polynomial algorithm for the data editing and data cleaning problem. Patt. Recogn. Image Anal. 27(3), 365–370 (2017)CrossRefGoogle Scholar
  32. 32.
    Ageev, A.A., Kel’manov, A.V., Pyatkin, A.V., Khamidullin, S.A., Shenmaier, V.V.: 1/2-Approximation polynomial-time algorithm for a problem of searching a subset. In: Proceedings of 2017 International MultiConference on Engineering, Computer and Information Sciences (SIBIRCON), 18–22 September 2017, Novosibirsk, Russia, pp. 8–12 (2017)Google Scholar
  33. 33.
    Kel’manov, A.V., Khamidullin, S.A., Khandeev, V.I., Pyatkin, A.V.: Exact algorithms for two quadratic Euclidean problems of searching for the largest subset and longest subsequence. In: Battiti, R., Brunato, M., Kotsireas, I., Pardalos, P.M. (eds.) LION 2018. LNCS, vol. 11353, pp. 308–318. Springer (2018). Scholar
  34. 34.
    Kel’manov, A.V., Khandeev, V.I., Panasenko A.V.: Exact algorithms for the special cases of two hard to solve problems of searching for the largest subset. In: van der Aalst, W.M.P., et al. (eds.) AIST 2018. LNCS, vol. 11179, pp. 294–304. Springer, Cham (2018). Scholar
  35. 35.
    Kel’manov, A.V., Pyatkin, A.V., Khamidullin, S.A., Khandeev, V.I., Shenmaier, V.V., Shamardin, Yu.V.: An approximation polynomial algorithm for a problem of searching for the longest subsequence in a finite sequence of points in Euclidean space. In: Eremeev, A., Khachay, M., Kochetov, Y., Pardalos, P. (eds.) OPTA 2018. CCIS, vol. 871, pp. 120–130. Springer, Cham (2018). Scholar
  36. 36.
    Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, San Francisco (1979)zbMATHGoogle Scholar

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. 1.Sobolev Institute of MathematicsNovosibirskRussia
  2. 2.Novosibirsk State UniversityNovosibirskRussia

Personalised recommendations