A Scalable Exemplar-Based Subspace Clustering Algorithm for Class-Imbalanced Data

  • Chong You
  • Chi Li
  • Daniel P. Robinson
  • René Vidal
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11213)


Subspace clustering methods that express each data point as a linear combination of a few other data points (e.g., sparse subspace clustering) have become a popular tool for unsupervised learning due to their empirical success and theoretical guarantees. However, their performance can degrade on imbalanced data distributions and large-scale datasets. This paper presents an exemplar-based subspace clustering method that addresses both issues. The proposed method searches for a subset of the data that best represents all data points, as measured by the \(\ell _1\) norm of the representation coefficients. To solve our model efficiently, we introduce a farthest-first search algorithm that iteratively selects the least well-represented point as an exemplar. When the data come from a union of subspaces, we prove that the computed subset contains enough exemplars from each subspace to express all data points, even if the data are imbalanced. Our experiments demonstrate that the proposed method outperforms state-of-the-art subspace clustering methods on two large-scale, imbalanced image datasets. We also demonstrate the effectiveness of our method for unsupervised data subset selection in a face image classification task.
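The farthest-first selection loop described above can be sketched as follows. This is an illustrative simplification, not the paper's algorithm: where the paper scores each point by the \(\ell _1\) cost of representing it with the current exemplars, this sketch substitutes a plain least-squares residual as a cheaper stand-in, and the function name and largest-norm seeding rule are hypothetical choices.

```python
import numpy as np

def farthest_first_exemplars(X, k):
    """Select k exemplar columns of X (features x points).

    Iteratively adds the point that is least well represented by the
    current exemplar set. Here "least well represented" is measured by
    the least-squares residual, a simplified proxy for the l1-based
    cost used in the paper.
    """
    # Seed with the largest-norm point (an arbitrary starting choice).
    exemplars = [int(np.argmax(np.linalg.norm(X, axis=0)))]
    for _ in range(k - 1):
        E = X[:, exemplars]                       # current exemplar dictionary
        # Least-squares coefficients representing every point by the exemplars.
        C, *_ = np.linalg.lstsq(E, X, rcond=None)
        residuals = np.linalg.norm(X - E @ C, axis=0)
        residuals[exemplars] = -np.inf            # never re-pick an exemplar
        exemplars.append(int(np.argmax(residuals)))
    return exemplars
```

On class-imbalanced data drawn from a union of subspaces, points in an already-covered subspace have near-zero residual, so each new exemplar tends to come from a subspace not yet represented, regardless of how few points that subspace contributes.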


Keywords: Subspace clustering · Imbalanced data · Large-scale data



C. You, D. P. Robinson and R. Vidal are supported by NSF under grant 1618637. C. Li is supported by IARPA under grant 127228.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Chong You (1)
  • Chi Li (1)
  • Daniel P. Robinson (1)
  • René Vidal (1)

  1. Johns Hopkins University, Baltimore, USA
