An Unsupervised Deep Learning Framework via Integrated Optimization of Representation Learning and GMM-Based Modeling

Wang, Jinghua; Jiang, Jianmin

doi:10.1007/978-3-030-20887-5_16

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11361)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

While supervised deep learning has achieved great success in a range of applications, relatively little work has studied the discovery of knowledge from unlabeled data. In this paper, we propose an unsupervised deep learning framework as a potential solution to the problem that existing deep learning techniques require large labeled data sets to complete the training process. The proposed framework introduces a new principle of joint learning on both deep representations and GMM (Gaussian Mixture Model)-based deep modeling, and an integrated objective function is proposed to implement this principle. In comparison with existing work in similar areas, our objective function has two learning targets, which are jointly optimized to achieve the best possible unsupervised learning and knowledge discovery from unlabeled data sets. Maximizing the first target enables the GMM to achieve the best possible modeling of the data representations, with each Gaussian component corresponding to a compact cluster, while maximizing the second target enhances the separability of the Gaussian components and hence the inter-cluster distances. As a result, the compactness of clusters is significantly enhanced by reducing the intra-cluster distances, and the separability is improved by increasing the inter-cluster distances. Extensive experimental results show that the proposed method improves clustering performance compared with benchmark methods.
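
The idea of an objective with two learning targets can be illustrated with a minimal sketch. This is not the authors' objective function, which the abstract does not specify. It assumes a diagonal-covariance GMM, uses the average log-likelihood as the first target, and uses the mean squared distance between component means as a stand-in for the second (separability) target. The function names, the weighting parameter `lam`, and the toy data are all hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(points, weights, means, variances):
    """Average log-likelihood of the points under the mixture (first target)."""
    total = 0.0
    for x in points:
        logs = [
            math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        top = max(logs)  # log-sum-exp trick for numerical stability
        total += top + math.log(sum(math.exp(l - top) for l in logs))
    return total / len(points)


def separability(means):
    """Mean squared distance between component means.

    Assumed stand-in for the second target; requires at least two components.
    """
    k = len(means)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    return sum(
        sum((a - b) ** 2 for a, b in zip(means[i], means[j])) for i, j in pairs
    ) / len(pairs)


def joint_objective(points, weights, means, variances, lam=0.1):
    """Weighted sum of the two targets; lam is a hypothetical trade-off weight."""
    fit = gmm_log_likelihood(points, weights, means, variances)
    return fit + lam * separability(means)


# Toy 2-D "representations" forming two tight groups.
points = [(0.0, 0.1), (0.1, -0.1), (-0.1, 0.0),
          (5.0, 5.1), (5.1, 4.9), (4.9, 5.0)]
weights = [0.5, 0.5]
variances = [(1.0, 1.0), (1.0, 1.0)]
good_means = [(0.0, 0.0), (5.0, 5.0)]  # one component per group
bad_means = [(2.5, 2.5), (2.5, 2.5)]   # both components collapsed together

good = joint_objective(points, weights, good_means, variances)
bad = joint_objective(points, weights, bad_means, variances)
print(good > bad)
```

In this toy setting, placing one component on each group scores higher on both terms than collapsing the components together. In the framework described above, the representations themselves would also be learned by a deep network, which this sketch leaves out.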

Notes

1. https://cs.nyu.edu/~roweis/data.html

Acknowledgment

The authors wish to acknowledge the financial support from: (i) Natural Science Foundation China (NSFC) under the Grant No. 61620106008; (ii) Natural Science Foundation China (NSFC) under the Grant No. 61802266; and (iii) Shenzhen Commission for Scientific Research & Innovations under the Grant No. JCYJ20160226191842793.

Author information

Authors and Affiliations

Research Institute for Future Media Computing, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Jinghua Wang & Jianmin Jiang

Corresponding author

Correspondence to Jianmin Jiang.

Editor information

Editors and Affiliations

IIIT Hyderabad, Hyderabad, India
C. V. Jawahar
ANU, Canberra, ACT, Australia
Hongdong Li
Simon Fraser University, Burnaby, BC, Canada
Greg Mori
ETH Zurich, Zurich, Zürich, Switzerland
Konrad Schindler

Cite this paper

Wang, J., Jiang, J. (2019). An Unsupervised Deep Learning Framework via Integrated Optimization of Representation Learning and GMM-Based Modeling. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol 11361. Springer, Cham. https://doi.org/10.1007/978-3-030-20887-5_16

DOI: https://doi.org/10.1007/978-3-030-20887-5_16
Published: 28 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20886-8
Online ISBN: 978-3-030-20887-5
eBook Packages: Computer Science, Computer Science (R0)
