An Image Clustering Auto-Encoder Based on Predefined Evenly-Distributed Class Centroids and MMD Distance

  • 9 Accesses


In this paper, we propose a novel, effective and simpler end-to-end image clustering auto-encoder algorithm: ICAE. The algorithm uses predefined evenly-distributed class centroids (PEDCC) as the clustering centers, which ensures the inter-class distance of latent features is maximal, and adds data distribution constraint, data augmentation constraint, auto-encoder reconstruction constraint and Sobel smooth constraint to improve the clustering performance. Specifically, we perform one-to-one data augmentation to learn the more effective features. The data and the augmented data are simultaneously input into the autoencoder to obtain latent features and the augmented latent features whose similarity are constrained by an augmentation loss. Then, making use of the maximum mean discrepancy distance, we combine the latent features and augmented latent features to make their distribution close to the PEDCC distribution (uniform distribution between classes, Dirac distribution within the class) to further learn clustering-oriented features. At the same time, the MSE of the original input image and reconstructed image is used as reconstruction constraint, and the Sobel smooth loss to build generalization constraint to improve the generalization ability. Finally, extensive experiments on three common datasets MNIST, Fashion-MNIST, COIL20 are conducted. The experimental results show that the algorithm has achieved the best clustering results so far. In addition, we can use the predefined PEDCC class centers, and the decoder to clearly generate the samples of each class. The code can be downloaded at!

This is a preview of subscription content, log in to check access.

Access options

Buy single article

Instant unlimited access to the full article PDF.

US$ 39.95

Price includes VAT for USA

Subscribe to journal

Immediate online access to all issues from 2019. Subscription will auto renew annually.

US$ 99

This is the net price. Taxes to be calculated in checkout.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6


  1. 1.

    Cai D, He X, Wang X, Bao H, Han J (2009) Locality preserving nonnegative matrix factorization. In: Twenty-first international joint conference on artificial intelligence

  2. 2.

    Chang J, Wang L, Meng G, Xiang S, Pan C (2017) Deep adaptive image clustering. In: Proceedings of the IEEE international conference on computer vision, pp 5879–5887

  3. 3.

    Chen D, Lv J, Zhang Y (2017) Unsupervised multi-manifold clustering by learning deep representation. In: Workshops at the thirty-first AAAI conference on artificial intelligence

  4. 4.

    Chen X, Cai D (2011) Large scale spectral clustering with landmark-based representation. In: Twenty-fifth AAAI conference on artificial intelligence

  5. 5.

    Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the em algorithm. J R Stat Soc: Ser B (Methodological) 39(1):1–22

  6. 6.

    Deng L (2012) The mnist database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Process Mag 29(6):141–142

  7. 7.

    Ding C, He X (2004) K-means clustering via principal component analysis. In: Proceedings of the twenty-first international conference on machine learning, ICML ’04. ACM, New York, p. 29

  8. 8.

    Doersch C (2016) Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908

  9. 9.

    Gdalyahu Y, Weinshall D, Werman M (2001) Self-organization in vision: stochastic clustering for image segmentation, perceptual grouping, and image database organization. IEEE Trans Pattern Anal Mach Intell 23(10):1053–1074

  10. 10.

    Ghasedi Dizaji K, Herandi A, Deng C, Cai W, Huang H (2017) Deep clustering via joint convolutional autoencoder embedding and relative entropy minimization. In: Proceedings of the IEEE international conference on computer vision, pp 5736–5745

  11. 11.

    Gowda KC, Krishna G (1978) Agglomerative clustering using the concept of mutual nearest neighbourhood. Pattern Recognit 10(2):105–112

  12. 12.

    Guo X, Gao L, Liu X, and Yin J (2017) Improved deep embedded clustering with local structure preservation. In: IJCAI, pp 1753–1759

  13. 13.

    Guo X, Zhu E, Liu X, Yin J (2018) Deep embedded clustering with data augmentation. In: Asian conference on machine learning, pp 550–565

  14. 14.

    He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778

  15. 15.

    Hong C, Yu J, Wan J, Tao D, Wang M (2015) Multimodal deep autoencoder for human pose recovery. IEEE Trans Image Process 24(12):5659–5670

  16. 16.

    Hsu C-C, Lin C-W (2017) Cnn-based joint clustering and representation learning with feature drift compensation for large-scale image data. IEEE Trans Multimedia 20(2):421–429

  17. 17.

    Hu W, Miyato T, Tokui S, Matsumoto E, Sugiyama M (2017) Learning discrete representations via information maximizing self-augmented training. In: Proceedings of the 34th international conference on machine learning, vol. 70, pp 1558–1567. JMLR. org

  18. 18.

    Huang P, Huang Y, Wang W, Wang L (2014) Deep embedding network for clustering. In: 2014 22nd International conference on pattern recognition, pp 1532–1537. IEEE

  19. 19.

    Jiang Z, Zheng Y, Tan H, Tang B, Zhou H (2016) Variational deep embedding: an unsupervised and generative approach to clustering. arXiv preprint arXiv:1611.05148

  20. 20.

    Kingma DP, Welling M (2013) Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114

  21. 21.

    Kullback S, Leibler RA (1951) On information and sufficiency. Ann Math Stat 22(1):79–86

  22. 22.

    Kurita T (1991) An efficient agglomerative clustering algorithm using a heap. Pattern Recognit 24(3):205–209

  23. 23.

    Li F, Qiao H, Zhang B (2018) Discriminatively boosted image clustering with fully convolutional auto-encoders. Pattern Recognit 83:161–173

  24. 24.

    Liu X, Dou Y, Yin J, Wang L, Zhu E (2016) Multiple kernel k-means clustering with matrix-induced regularization. In: Thirtieth AAAI conference on artificial intelligence

  25. 25.

    MacQueen J et al. (1967) Some methods for classification and analysis of multivariate observations. In: Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol 1. Oakland, CA, USA, pp 281–297

  26. 26.

    McLachlan G, Peel D (2004) Finite mixture models. Wiley, London

  27. 27.

    Nene SA, Nayar SK, Murase H, et al (1996) Columbia object image library (coil-20)

  28. 28.

    Ng AY, Jordan MI, Weiss Y (2002) On spectral clustering: analysis and an algorithm. In: Advances in neural information processing systems, pp 849–856

  29. 29.

    Peng X, Feng J, Lu J, Yau W-Y, Yi Z (2017) Cascade subspace clustering. In: Thirty-first AAAI conference on artificial intelligence

  30. 30.

    Saito S, Tan RT (2017) Neural clustering: concatenating layers for better projections

  31. 31.

    Shah SA, Koltun V (2017) Robust continuous clustering. Proc Natl Acad Sci 114(37):9814–9819

  32. 32.

    Shi J, Malik J (2000) Normalized cuts and image segmentation. Departmental Papers (CIS), p 107

  33. 33.

    Song C, Liu F, Huang Y, Wang L, Tan T (2013) Auto-encoder based data clustering. In: Iberoamerican congress on pattern recognition. Springer, Berlin, pp 117–124

  34. 34.

    Strehl A, Ghosh J (2002) Cluster ensembles: a knowledge reuse framework for combining multiple partitions. J Mach Learn Res 3(Dec):583–617

  35. 35.

    Trigeorgis G, Bousmalis K, Zafeiriou S, Schuller BW. Supplementary material for a deep semi-NMF model for learning hidden representations

  36. 36.

    Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol P-A (2010) Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res 11(Dec):3371–3408

  37. 37.

    Wang Z, Chang S, Zhou J, Wang M, Huang TS (2016) Learning a task-specific deep architecture for clustering. In: Proceedings of the 2016 SIAM international conference on data mining. SIAM, pp 369–377

  38. 38.

    Xiao H, Rasul K, Vollgraf R (2017) Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747

  39. 39.

    Xie J, Girshick R, Farhadi A (2016) Unsupervised deep embedding for clustering analysis. In: International conference on machine learning, pp 478–487

  40. 40.

    Yang B, Fu X, Sidiropoulos ND, Hong M (2017) Towards k-means-friendly spaces: Simultaneous deep learning and clustering. In: Proceedings of the 34th international conference on machine learning, vol 70, pp 3861–3870. JMLR. org

  41. 41.

    Yang J, Parikh D, Batra D (2016) Joint unsupervised learning of deep representations and image clusters. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5147–5156

  42. 42.

    Zhang J, Li K, Liang Y, Li N (2017) Learning 3D faces from 2D images via stacked contractive autoencoder. Neurocomputing, S0925231217301431

  43. 43.

    Zhang J, Yu J, Tao D (2018) Local deep-feature alignment for unsupervised dimension reduction. IEEE Trans Image Process, pp 1–1

  44. 44.

    Zhang W, Wang X, Zhao D, Tang X (2012) Graph degree linkage: agglomerative clustering on a directed graph. In: European conference on computer vision. Springer, Berlin, pp 428–441

  45. 45.

    Zhao D, Tang X (2009) Cyclizing clusters via zeta function of a graph. In: Advances in neural information processing systems, pp 1953–1960

  46. 46.

    Zhu Q, Zhang R (2019) A classification supervised auto-encoder based on predefined evenly-distributed class centroids. arXiv preprint arXiv:1902.00220

Download references

Author information

Correspondence to Zhengyong Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Zhu, Q., Wang, Z. An Image Clustering Auto-Encoder Based on Predefined Evenly-Distributed Class Centroids and MMD Distance. Neural Process Lett (2020).

Download citation


  • Auto-encoder
  • Clustering
  • Predefined evenly-distributed class centroids (PEDCC)
  • Data augmentation
  • Maximum mean discrepancy (MMD)