Skip to main content

An Unsupervised Deep Learning Framework via Integrated Optimization of Representation Learning and GMM-Based Modeling

  • Conference paper
  • First Online:
Computer Vision – ACCV 2018 (ACCV 2018)

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 11361))

Included in the following conference series:

Abstract

While supervised deep learning has achieved great success in a range of applications, relatively little work has studied the discovery of knowledge from unlabeled data. In this paper, we propose an unsupervised deep learning framework to provide a potential solution for the problem that existing deep learning techniques require large labeled data sets for completing the training process. Our proposed introduces a new principle of joint learning on both deep representations and GMM (Gaussian Mixture Model)-based deep modeling, and thus an integrated objective function is proposed to facilitate the principle. In comparison with the existing work in similar areas, our objective function has two learning targets, which are created to be jointly optimized to achieve the best possible unsupervised learning and knowledge discovery from unlabeled data sets. While maximizing the first target enables the GMM to achieve the best possible modeling of the data representations and each Gaussian component corresponds to a compact cluster, maximizing the second term will enhance the separability of the Gaussian components and hence the inter-cluster distances. As a result, the compactness of clusters is significantly enhanced by reducing the intra-cluster distances, and the separability is improved by increasing the inter-cluster distances. Extensive experimental results show that the propose method can improve the clustering performance compared with benchmark methods.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    https://cs.nyu.edu/~roweis/data.html.

References

  1. Aggarwal, C.C., Reddy, C.K.: Data Clustering: Algorithms and Applications, 1st edn. Chapman & Hall/CRC, Boca Raton (2013)

    Book  Google Scholar 

  2. Bishop, C.M.: Pattern Recognition and Machine Learning. Information Science and Statistics. Springer, New York (2006)

    MATH  Google Scholar 

  3. Bruna, J., Mallat, S.: Invariant scattering convolution networks. TPAMI 35(8), 1872–1886 (2013)

    Article  Google Scholar 

  4. Cai, D., He, X., Han, J.: Document clustering using locality preserving indexing. TKDE 17(12), 1624–1637 (2005)

    Google Scholar 

  5. CaliåSki, T., Harabasz, J.: A dendrite method for cluster analysis. Commun. Stat. 3(1), 1–27 (1974)

    MathSciNet  MATH  Google Scholar 

  6. Chen, X., Cai, D.: Large scale spectral clustering with landmark-based representation. In: AAAI, pp. 313–318 (2011)

    Google Scholar 

  7. Coates, A., Ng, A., Lee, H.: An analysis of single-layer networks in unsupervised feature learning 15, 215–223 (2011)

    Google Scholar 

  8. Deng, L., Chen, J.: Sequence classification using the high-level features extracted from deep neural networks. In: ICASSP, pp. 6844–6848 (2014)

    Google Scholar 

  9. Ding, C., Li, T., Jordan, M.I.: Convex and semi-nonnegative matrix factorizations. TPAMI 32(1), 45–55 (2010)

    Article  Google Scholar 

  10. Dizaji, K.G., Herandi, A., Huang, H.: Deep clustering via joint convolutional autoencoder embedding and relative entropy minimization. In: ICCV, pp. 5747–5756 (2017)

    Google Scholar 

  11. Doersch, C., Singh, S., Gupta, A., Sivic, J., Efros, A.A.: What makes paris look like paris? ACM Trans. Graph. 31(4), 101:1–101:9 (2012)

    Article  Google Scholar 

  12. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)

    Google Scholar 

  13. Heigold, G., Ney, H., Lehnen, P., Gass, T., Schluter, R.: Equivalence of generative and log-linear models. IEEE Trans. Audio Speech Lang. Process. 19(5), 1138–1148 (2011)

    Article  Google Scholar 

  14. Heigold, G.: A log-linear discriminative modeling framework for speech recognition. Ph.D. dissertation, Rwth Aachen (2010)

    Google Scholar 

  15. Hinton, G., Salakhutdinov, R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)

    Article  MathSciNet  Google Scholar 

  16. Law, M.T., Urtasun, R., Zemel, R.S.: Deep spectral clustering learning. In: ICML, vol. 70, pp. 1985–1994 (2017)

    Google Scholar 

  17. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)

    Article  Google Scholar 

  18. Lewis, D.D., Yang, Y., Rose, T.G., Li, F.: RCV1: a new benchmark collection for text categorization research. J. Mach. Learn. Res. 5, 361–397 (2004)

    Google Scholar 

  19. Lloyd, S.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2), 129–137 (1982)

    Article  MathSciNet  Google Scholar 

  20. Maaten, L.: Learning a parametric embedding by preserving local structure. In: Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, pp. 384–391 (2009)

    Google Scholar 

  21. Nakayama, H., Harada, T., Kuniyoshi, Y.: Global Gaussian approach for scene categorization using information geometry, pp. 2336–2343 (2010)

    Google Scholar 

  22. Nene, S.A., Nayar, S.K., Murase, H.: Columbia university image library (coil-100) (1996)

    Google Scholar 

  23. Nene, S.A., Nayar, S.K., Murase, H.: Columbia university image library (coil-20) (1996)

    Google Scholar 

  24. Noh, H., Hong, S., Han, B.: Learning deconvolution network for semantic segmentation. In: CVPR, pp. 1520–1528 (2015)

    Google Scholar 

  25. Paulik, M.: Lattice-based training of bottleneck feature extraction neural networks. In: INTERSPEECH (2013)

    Google Scholar 

  26. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: NIPS, pp. 91–99 (2015)

    Google Scholar 

  27. Sainath, T.N., Kingsbury, B., Ramabhadran, B.: Auto-encoder bottleneck features using deep belief networks. In: ICASSP, pp. 4153–4156 (2012)

    Google Scholar 

  28. Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: a unified embedding for face recognition and clustering. In: CVPR, pp. 815–823 (2015)

    Google Scholar 

  29. Serra, G., Grana, C., Manfredi, M., Cucchiara, R.: Gold: Gaussians of local descriptors for image representation. Comput. Vis. Image Underst. 134, 22–32 (2015)

    Article  Google Scholar 

  30. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. JMLR 15, 1929–1958 (2014)

    MathSciNet  MATH  Google Scholar 

  31. Stuhlsatz, A., Lippel, J., Zielke, T.: Feature extraction with deep neural networks by a generalized discriminant analysis. IEEE Trans. Neural Netw. Learn. Syst. 23, 596–608 (2012)

    Article  Google Scholar 

  32. Trigeorgis, G., Bousmalis, K., Zafeiriou, S., Schuller, B.W.: A deep semi-NMF model for learning hidden representations. In: ICML, pp. II-1692–II-1700 (2014)

    Google Scholar 

  33. Tüske, Z., Tahir, M.A., Schlüter, R., Ney, H.: Integrating Gaussian mixtures into deep neural networks: softmax layer with hidden variables. In: ICASSP, pp. 4285–4289 (2015)

    Google Scholar 

  34. Variani, E., Mcdermott, E., Heigold, G.: A Gaussian mixture model layer jointly optimized with discriminative features within a deep neural network architecture. In: ICASSP, pp. 4270–4274 (2015)

    Google Scholar 

  35. Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P.A.: Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. JMLR 11, 3371–3408 (2010)

    MathSciNet  MATH  Google Scholar 

  36. Wang, J., Wang, G.: Hierarchical spatial sum-product networks for action recognition in still images. IEEE Trans. Circuits Syst. Video Technol. 28(1), 90–100 (2018)

    Article  Google Scholar 

  37. Wang, J., Wang, Z., Tao, D., See, S., Wang, G.: Learning common and specific features for RGB-D semantic segmentation with deconvolutional networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9909, pp. 664–679. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46454-1_40

    Chapter  Google Scholar 

  38. Wang, Q., Li, P., Zhang, L.: G\(^2\)DeNet: global gaussian distribution embedding network and its application to visual recognition. In: CVPR (2017)

    Google Scholar 

  39. Wang, Q., Li, P., Zuo, W., Zhang, L.: RAID-G: robust estimation of approximate infinite dimensional Gaussian with application to material recognition. In: CVPR, pp. 4433–4441 (2016)

    Google Scholar 

  40. Xie, J., Girshick, R., Farhadi, A.: Unsupervised deep embedding for clustering analysis. In: ICML, pp. 478–487

    Google Scholar 

  41. Xu, W., Liu, X., Gong, Y.: Document clustering based on non-negative matrix factorization. In: Proceedings of the ACM SIGIR 2003, pp. 267–273 (2003)

    Google Scholar 

  42. Yang, B., Fu, X., Sidiropoulos, N.D., Hong, M.: Towards k-means-friendly spaces: simultaneous deep learning and clustering. ICML 70, 3861–3870 (2017)

    Google Scholar 

  43. Yang, J., Parikh, D., Batra, D.: Joint unsupervised learning of deep representations and image clusters. In: CVPR, pp. 5147–5156 (2016)

    Google Scholar 

  44. You, C., Robinson, D.P., Vidal, R.: Scalable sparse subspace clustering by orthogonal matching pursuit. In: CVPR, pp. 3918–3927, June 2016

    Google Scholar 

  45. Zelnik-Manor, L.: Self-tuning spectral clustering. NIPS 17, 1601–1608 (2004)

    Google Scholar 

  46. Zhang, W., Wang, X., Zhao, D., Tang, X.: Graph degree linkage: agglomerative clustering on a directed graph. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7572, pp. 428–441. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33718-5_31

    Chapter  Google Scholar 

  47. Zhang, W., Zhao, D., Wang, X.: Agglomerative clustering via maximum incremental path integral. Pattern Recognit. 46(11), 3056–3065 (2013)

    Article  Google Scholar 

Download references

Acknowledgment

The authors wish to acknowledge the financial support from: (i) Natural Science Foundation China (NSFC) under the Grant No. 61620106008; (ii) Natural Science Foundation China (NSFC) under the Grant No. 61802266; and (iii) Shenzhen Commission for Scientific Research & Innovations under the Grant No. JCYJ20160226191842793.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Jianmin Jiang .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Wang, J., Jiang, J. (2019). An Unsupervised Deep Learning Framework via Integrated Optimization of Representation Learning and GMM-Based Modeling. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science(), vol 11361. Springer, Cham. https://doi.org/10.1007/978-3-030-20887-5_16

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-20887-5_16

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20886-8

  • Online ISBN: 978-3-030-20887-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics