Auto-encoder Based Data Clustering

  • Chunfeng Song
  • Feng Liu
  • Yongzhen Huang
  • Liang Wang
  • Tieniu Tan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8258)


Linear and non-linear data transformations are widely used as processing techniques in clustering, and they usually improve the data representation. However, when the data have a complex structure, these techniques can be unsatisfactory for clustering. In this paper, we propose a new clustering method based on the auto-encoder network, which can learn a highly non-linear mapping function. By simultaneously considering data reconstruction and compactness, our method obtains stable and effective clustering. Experiments on three databases show that the proposed clustering model achieves excellent performance in terms of both accuracy and normalized mutual information.
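The idea of jointly weighing reconstruction and compactness can be illustrated with a minimal sketch. It is not the paper's exact objective, which the abstract does not state. The function name `clustering_autoencoder_loss`, the weight `lam`, and the nearest-centre assignment rule are assumptions introduced only for illustration.

```python
import numpy as np

def clustering_autoencoder_loss(x, x_rec, z, centers, lam=0.1):
    """Hypothetical combined objective: reconstruction error plus a compactness term.

    x       : (n, d) input data
    x_rec   : (n, d) auto-encoder reconstructions of x
    z       : (n, h) hidden codes produced by the encoder
    centers : (k, h) cluster centres in the code space
    lam     : assumed trade-off weight between the two terms
    """
    # Reconstruction term: mean squared error between inputs and reconstructions.
    recon = np.mean(np.sum((x - x_rec) ** 2, axis=1))
    # Squared distance from every code to every centre, shape (n, k).
    d2 = np.sum((z[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    # Assumed assignment rule: each point belongs to its nearest centre.
    assign = np.argmin(d2, axis=1)
    # Compactness term: mean squared distance of each code to its assigned centre.
    compact = np.mean(d2[np.arange(len(z)), assign])
    return recon + lam * compact, assign
```

In a sketch like this, a training loop would alternate between updating the network weights to reduce the loss and recomputing the centres from the current codes. How the paper actually schedules these updates is not described in the abstract.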


Keywords: Clustering, Auto-encoder, Non-linear transformation



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Chunfeng Song (1)
  • Feng Liu (2)
  • Yongzhen Huang (1)
  • Liang Wang (1)
  • Tieniu Tan (1)
  1. National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing, China
  2. School of Automation, Southeast University, Nanjing, China
