NP-hardness of Some Max-Min Clustering Problems

Kel’manov, Alexander; Khandeev, Vladimir; Pyatkin, Artem

doi:10.1007/978-3-030-10934-9_11

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 974))

Included in the following conference series:

International Conference on Optimization and Applications

585 Accesses
2 Citations

Abstract

We consider some consimilar problems of searching for disjoint clusters in the finite set of points in Euclidean space. The goal is to maximize the minimum subset size so that the value of each intracluster quadratic variation would not exceed a given constant. We prove that all considered problems are NP-hard even on a line.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

MacQueen, J.B.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297. University of California Press, Berkeley (1967)
Google Scholar
Rao, M.: Cluster analysis and mathematical programming. J. Amer. Stat. Assoc. 66, 622–626 (1971)
Article Google Scholar
Hansen, P., Jaumard, B., Mladenovich, N.: Minimum sum of squares clustering in a low dimensional space. J. Classif. 15, 37–55 (1998)
Article MathSciNet Google Scholar
Hansen, P., Jaumard, B.: Cluster analysis and mathematical programming. Math. Program. 79, 191–215 (1997)
MathSciNet MATH Google Scholar
Aloise, D., Deshpande, A., Hansen, P., Popat, P.: NP-hardness of Euclidean sum-of-squares clustering. Mach. Learn. 75(2), 245–248 (2009)
Article Google Scholar
Drineas, P., Frieze, A., Kannan, R., Vempala, S., Vinay, V.: Clustering large graphs via the singular value decomposition. Mach. Learn. 56, 9–33 (2004)
Article Google Scholar
Arora, S., Raghavan, P., Rao, S.: Approximation schemes for Euclidean k-medians and related problems. In: Proceedings of the 30th Annual ACM Symposium on Theory of Computing, pp. 106–113 (1998)
Google Scholar
Kariv, O., Hakimi, S.: An algorithmic approach to network location problems. Part 1: the \(p\)-centers. SIAM J. Appl. Math. 37, 513–538 (1979)
Article MathSciNet Google Scholar
Feder, T., Greene, D.: Optimal algorithms for approximate clustering. In: Proceedings of 20th Annual ACM Symposium on Theory of Computing, pp. 434–444 (1988)
Google Scholar
Hochbaum, D.S., Shmoys, D.B.: A best possible heuristic for the k-center problem. Math. Oper. Res. 10(2), 180–184 (1985)
Article MathSciNet Google Scholar
Kaufman, L., Rousseeuw, P.J.: Clustering by means of Medoids, in statistical data analysis based on the \(L_{1}\)-Norm and related methods. Edited by Dodge, Y., pp. 405–416. Dodge, North-Holland (1987)
Google Scholar
Krarup, J., Pruzan, P.: The simple plant location problem: survey and synthesis. Eur. J. Oper. Res. 12(1), 36–81 (1983)
Article MathSciNet Google Scholar
Mirchandani, P., Francis, R. (eds.): Discrete Location Theory. Wiley-Intersience, New York (1990)
MATH Google Scholar
Charikar, M., Khuller, S., Mount, D.M., Narasimhan, G.: Algorithms for facility location problems with outliers. In: Proceedings of 12th ACM-SIAM Symposium on Discrete Algorithms, pp. 642–651 (2001)
Google Scholar
Agarwal, P.K., Phillips, J.M.: An efficient algorithm for 2D Euclidean 2-center with outliers. In: Proceedings of 16th Annual European Symposium Algorithms, pp. 64–75 (2008)
Google Scholar
McCutchen, R.M., Khuller, S.: Streaming algorithms for k-center clustering with outliers and with anonymity. In: Goel, A., Jansen, K., Rolim, J.D.P., Rubinfeld, R. (eds.) APPROX/RANDOM -2008. LNCS, vol. 5171, pp. 165–178. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-85363-3_14
Chapter MATH Google Scholar
Hatami, B., Zarrabi-Zade, H.: A streaming algorithm for 2-center with outliers in high dimensions. Comput. Geom. 60, 26–36 (2017)
Article MathSciNet Google Scholar
Jain, A.K.: Data clustering: 50 years beyond \(k\)-Means. Patt. Recogn. Lett. 31(8), 651–666 (2010)
Article Google Scholar
Shirkhorshidi, A.S., Aghabozorgi, S., Wah, T.Y., Herawan, T.: Big data clustering: a review. In: Murgante, B., et al. (eds.) ICCSA 2014, Part V. LNCS, vol. 8583, pp. 707–720. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-09156-3_49
Chapter Google Scholar
Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, New York (2006)
MATH Google Scholar
James, G., Witten, D., Hastie, T., Tibshirani, R.: An Introduction to Statistical Learning. Springer, New York (2013)
Book Google Scholar
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning, 2nd edn. Springer, New York (2009)
Book Google Scholar
Aggarwal, C.C.: Data Mining: The Textbook. Springer, Switzerland (2015)
MATH Google Scholar
Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning (Adaptive Computation and Machine Learning series). The MIT Press (2017)
Google Scholar
Fu, T.-C.: A review on time series data mining. Eng. Appl. Artif. Intell. 24(1), 164–181 (2011)
Article Google Scholar
Kuenzer, C., Dech, S., Wagner, W.: Remote Sensing Time Series. Remote Sensing and Digital Image Processing, vol. 22. Springer, Switzerland (2015)
Google Scholar
Liao, T.W.: Clustering of time series data – a survey. Patt. Recogn. 38(11), 1857–1874 (2005)
Article Google Scholar
de Waal, T., Pannekoek, J., Scholtus, S.: Handbook of Statistical Data Editing and Imputation. Wiley, New Jersey (2011)
Book Google Scholar
Osborne, J.W.: Best Practices in Data Cleaning: A Complete Guide to Everything You Need to Do Before and After Collecting Your Data, 1st edn. SAGE Publication, Inc., Los Angeles (2013)
Book Google Scholar
Farcomeni, A., Greco, L.: Robust Methods for Data Reduction. Chapman and Hall/CRC (2015)
Google Scholar
Ageev, A.A., Kel’manov, A.V., Pyatkin, A.V., Khamidullin, S.A., Shenmaier, V.V.: Approximation polynomial algorithm for the data editing and data cleaning problem. Patt. Recogn. Image Anal. 27(3), 365–370 (2017)
Article Google Scholar
Ageev, A.A., Kel’manov, A.V., Pyatkin, A.V., Khamidullin, S.A., Shenmaier, V.V.: 1/2-Approximation polynomial-time algorithm for a problem of searching a subset. In: Proceedings of 2017 International MultiConference on Engineering, Computer and Information Sciences (SIBIRCON), 18–22 September 2017, Novosibirsk, Russia, pp. 8–12 (2017)
Google Scholar
Kel’manov, A.V., Khamidullin, S.A., Khandeev, V.I., Pyatkin, A.V.: Exact algorithms for two quadratic Euclidean problems of searching for the largest subset and longest subsequence. In: Battiti, R., Brunato, M., Kotsireas, I., Pardalos, P.M. (eds.) LION 2018. LNCS, vol. 11353, pp. 308–318. Springer (2018). https://doi.org/10.1007/978-3-030-05348-2_28
Google Scholar
Kel’manov, A.V., Khandeev, V.I., Panasenko A.V.: Exact algorithms for the special cases of two hard to solve problems of searching for the largest subset. In: van der Aalst, W.M.P., et al. (eds.) AIST 2018. LNCS, vol. 11179, pp. 294–304. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-11027-7_28
Google Scholar
Kel’manov, A.V., Pyatkin, A.V., Khamidullin, S.A., Khandeev, V.I., Shenmaier, V.V., Shamardin, Yu.V.: An approximation polynomial algorithm for a problem of searching for the longest subsequence in a finite sequence of points in Euclidean space. In: Eremeev, A., Khachay, M., Kochetov, Y., Pardalos, P. (eds.) OPTA 2018. CCIS, vol. 871, pp. 120–130. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-93800-4_10
Google Scholar
Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, San Francisco (1979)
MATH Google Scholar

Download references

Acknowledgments

The study presented was supported by the Russian Science Foundation, project 16-11-10041.

Author information

Authors and Affiliations

Sobolev Institute of Mathematics, 4 Koptyug Ave., 630090, Novosibirsk, Russia
Alexander Kel’manov, Vladimir Khandeev & Artem Pyatkin
Novosibirsk State University, 2 Pirogova St., 630090, Novosibirsk, Russia
Alexander Kel’manov, Vladimir Khandeev & Artem Pyatkin

Authors

Alexander Kel’manov
View author publications
You can also search for this author in PubMed Google Scholar
Vladimir Khandeev
View author publications
You can also search for this author in PubMed Google Scholar
Artem Pyatkin
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Vladimir Khandeev .

Editor information

Editors and Affiliations

FRC CSC RAS, Dorodnitsyn Computing Centre, Moscow, Russia
Yury Evtushenko
University of Montenegro, Podgorica, Montenegro
Milojica Jaćimović
Krasovsky Institute of Mathematics and Mechanics, Yekaterinburg, Russia
Michael Khachay
Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
Yury Kochetov
FRC CSC RAS, Dorodnitsyn Computing Centre, Moscow, Russia
Vlasta Malkova
FRC CSC RAS, Dorodnitsyn Computing Centre, Moscow, Russia
Mikhail Posypkin

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Kel’manov, A., Khandeev, V., Pyatkin, A. (2019). NP-hardness of Some Max-Min Clustering Problems. In: Evtushenko, Y., Jaćimović, M., Khachay, M., Kochetov, Y., Malkova, V., Posypkin, M. (eds) Optimization and Applications. OPTIMA 2018. Communications in Computer and Information Science, vol 974. Springer, Cham. https://doi.org/10.1007/978-3-030-10934-9_11

Download citation

DOI: https://doi.org/10.1007/978-3-030-10934-9_11
Published: 10 January 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-10933-2
Online ISBN: 978-3-030-10934-9
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics