Robust convex clustering

  • Zhenzhen Quan
  • Songcan Chen (corresponding author)


Objective-based clustering is an important class of cluster analysis techniques. However, because their objective functions are non-convex, these methods are prone to local minima, which degrades the final clustering performance. Recently, convex clustering (CC) has attracted attention because it enjoys global optimality and independence from initialization. One of its drawbacks, however, is a lack of robustness to data contaminated with outliers, which causes the clustering results to deviate. To improve its robustness, this paper proposes an outlier-aware robust convex clustering algorithm, called RCC. Specifically, RCC extends CC by modeling the contaminated data as the sum of clean data and sparse outliers, and then adding a Lasso-type regularization term to the CC objective to reflect the sparsity of the outliers. In this way, RCC resists outliers to a great extent while retaining the advantages of CC, including the convexity of the objective. We further develop a block coordinate descent approach with a convergence guarantee and find that RCC usually converges within a few iterations. Finally, the effectiveness and robustness of RCC are corroborated empirically by numerical experiments on both synthetic and real datasets.
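The alternating scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the objective 0.5*||X - U - O||_F^2 + gamma * sum_{i<j} ||u_i - u_j||_2 + lam * ||O||_1, with uniform pair weights and an entrywise l1 penalty on the outlier matrix O; the abstract only says "Lasso-type", so that exact form is an assumption. The centroid block U is updated with a swapped-in solver (projected gradient ascent on the dual, in the spirit of the AMA scheme of Chi and Lange), and the outlier block O has a closed-form soft-thresholding update. All function and parameter names are hypothetical.

```python
import math
from itertools import combinations


def soft_threshold(value, lam):
    """Proximal operator of lam * |value| (scalar soft-thresholding)."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def convex_clustering(points, gamma, n_iter=100):
    """Approximately minimize 0.5*sum||y_i - u_i||^2 + gamma*sum_{i<j}||u_i - u_j||_2.

    Assumption: uniform weights on all pairs, solved by projected gradient
    ascent on the dual with step size 1/n (not the paper's solver).
    """
    n, dim = len(points), len(points[0])
    edges = list(combinations(range(n), 2))
    duals = [[0.0] * dim for _ in edges]
    step = 1.0 / n
    centroids = [row[:] for row in points]
    for _ in range(n_iter):
        # Primal step: u_i = y_i + (duals of edges leaving i) - (duals entering i).
        centroids = [row[:] for row in points]
        for (i, j), dual in zip(edges, duals):
            for k in range(dim):
                centroids[i][k] += dual[k]
                centroids[j][k] -= dual[k]
        # Dual step: gradient move, then projection onto the l2 ball of radius gamma.
        for idx, (i, j) in enumerate(edges):
            moved = [duals[idx][k] - step * (centroids[i][k] - centroids[j][k])
                     for k in range(dim)]
            norm = math.sqrt(sum(v * v for v in moved))
            scale = 1.0 if norm <= gamma else gamma / norm
            duals[idx] = [scale * v for v in moved]
    return centroids


def robust_convex_clustering(data, gamma, lam, n_outer=20):
    """Block coordinate descent over centroids U and sparse outliers O."""
    n, dim = len(data), len(data[0])
    outliers = [[0.0] * dim for _ in range(n)]
    centroids = [row[:] for row in data]
    for _ in range(n_outer):
        # Block 1: convex clustering on the outlier-corrected data X - O.
        cleaned = [[data[i][k] - outliers[i][k] for k in range(dim)]
                   for i in range(n)]
        centroids = convex_clustering(cleaned, gamma)
        # Block 2: closed-form update O = soft_threshold(X - U, lam), entrywise.
        outliers = [[soft_threshold(data[i][k] - centroids[i][k], lam)
                     for k in range(dim)] for i in range(n)]
    return centroids, outliers


# Toy data (made up for illustration): five inliers at 0 and one gross outlier at 10.
data = [[0.0] for _ in range(5)] + [[10.0]]
centroids, outliers = robust_convex_clustering(data, gamma=100.0, lam=1.0)
print([round(u[0], 4) for u in centroids])
print([round(o[0], 4) for o in outliers])
```

In this toy run gamma is large enough that all centroids fuse into a single cluster, so the effect of the outlier term is easy to see: the fused centroid settles near 0.2 instead of the contaminated mean of about 1.67, and only the last row of O is non-zero. Smaller values of gamma yield several clusters, and the choice of gamma and lam here is arbitrary.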


Convex clustering · Outliers · Robustness · Sparsity · Lasso



This work is supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 61732006 and 61672281, and by the Key Program of NSFC under Grant No. 61472186.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
