K-Medoids Clustering Is Solvable in Polynomial Time for a 2d Pareto Front

  • Conference paper
Optimization of Complex Systems: Theory, Models, Algorithms and Applications (WCGO 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 991)

Abstract

The k-medoids problem is a discrete sum-of-squares clustering problem, which is known to be more robust to outliers than k-means clustering. As an optimization problem, k-medoids is NP-hard. This paper examines k-medoids clustering in the case of a two-dimensional Pareto front, as generated by bi-objective optimization approaches. A characterization of optimal clusters is provided in this case, which allows k-medoids to be solved to optimality in polynomial time using a dynamic programming algorithm. More precisely, for N points to cluster, the algorithm is proven to run in \(O(N^3)\) time and \(O(N^2)\) memory space. The algorithm can also be used to jointly minimize the number of clusters and the dissimilarity of clusters. This bi-objective extension is also solvable to optimality in \(O(N^3)\) time and \(O(N^2)\) memory space, which is useful for choosing an appropriate number of clusters in real-life applications. Parallelization issues are also discussed, to speed up the algorithm in practice.
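
The dynamic programming idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not spell out the characterization of optimal clusters, so the sketch assumes that, once the front is sorted by the first objective, optimal clusters are contiguous runs of points, and that the cost of a cluster is the sum of squared Euclidean distances to its medoid. The function name `kmedoids_pareto_2d` and all variable names are hypothetical.

```python
from itertools import accumulate


def kmedoids_pareto_2d(points, k):
    """Sketch: exact k-medoids on a 2D Pareto front by dynamic programming.

    ASSUMPTION (not stated in the abstract): optimal clusters are contiguous
    along the front after sorting, and the cost is squared Euclidean distance.
    Returns (optimal cost for k clusters, medoids, optimal costs for 1..k).
    """
    pts = sorted(points)  # on a Pareto front this also orders the 2nd objective
    n = len(pts)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= number of points")

    # prefix[c][t] = sum of squared distances from pts[c] to pts[0..t-1]
    prefix = []
    for cx, cy in pts:
        d2 = [(x - cx) ** 2 + (y - cy) ** 2 for x, y in pts]
        prefix.append([0.0] + list(accumulate(d2)))

    inf = float("inf")
    # cost[i][j] = best 1-medoid cost of the run pts[i..j]; med[i][j] = its medoid index
    cost = [[inf] * n for _ in range(n)]
    med = [[-1] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            for c in range(i, j + 1):  # O(N^3) overall
                v = prefix[c][j + 1] - prefix[c][i]
                if v < cost[i][j]:
                    cost[i][j], med[i][j] = v, c

    # best[q][j] = optimal cost of splitting the first j points into q runs
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for q in range(1, k + 1):
        for j in range(q, n + 1):
            for i in range(q - 1, j):
                v = best[q - 1][i] + cost[i][j - 1]
                if v < best[q][j]:
                    best[q][j], cut[q][j] = v, i

    # Backtrack through the stored cut positions to recover the medoids
    medoids, j = [], n
    for q in range(k, 0, -1):
        i = cut[q][j]
        medoids.append(pts[med[i][j - 1]])
        j = i
    medoids.reverse()

    costs_by_k = [best[q][n] for q in range(1, k + 1)]
    return best[k][n], medoids, costs_by_k
```

Under these assumptions the interval-cost table dominates the running time at \(O(N^3)\), and the tables take \(O(N^2)\) memory, which matches the bounds quoted in the abstract. Calling the function with `k = len(points)` returns in `costs_by_k` the optimal cost for every number of clusters, which is one way to read the trade-off between cluster count and dissimilarity. For the four points `(0, 10), (1, 9), (10, 1), (11, 0)` the costs for 1 to 4 clusters are 328, 4, 2 and 0.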


Author information

Correspondence to Nicolas Dupin.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this paper

Dupin, N., Nielsen, F., Talbi, EG. (2020). K-Medoids Clustering Is Solvable in Polynomial Time for a 2d Pareto Front. In: Le Thi, H., Le, H., Pham Dinh, T. (eds) Optimization of Complex Systems: Theory, Models, Algorithms and Applications. WCGO 2019. Advances in Intelligent Systems and Computing, vol 991. Springer, Cham. https://doi.org/10.1007/978-3-030-21803-4_79
