Abstract
While supervised deep learning has achieved great success in a range of applications, relatively little work has studied the discovery of knowledge from unlabeled data. In this paper, we propose an unsupervised deep learning framework to provide a potential solution for the problem that existing deep learning techniques require large labeled data sets for completing the training process. Our proposed introduces a new principle of joint learning on both deep representations and GMM (Gaussian Mixture Model)-based deep modeling, and thus an integrated objective function is proposed to facilitate the principle. In comparison with the existing work in similar areas, our objective function has two learning targets, which are created to be jointly optimized to achieve the best possible unsupervised learning and knowledge discovery from unlabeled data sets. While maximizing the first target enables the GMM to achieve the best possible modeling of the data representations and each Gaussian component corresponds to a compact cluster, maximizing the second term will enhance the separability of the Gaussian components and hence the inter-cluster distances. As a result, the compactness of clusters is significantly enhanced by reducing the intra-cluster distances, and the separability is improved by increasing the inter-cluster distances. Extensive experimental results show that the propose method can improve the clustering performance compared with benchmark methods.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Aggarwal, C.C., Reddy, C.K.: Data Clustering: Algorithms and Applications, 1st edn. Chapman & Hall/CRC, Boca Raton (2013)
Bishop, C.M.: Pattern Recognition and Machine Learning. Information Science and Statistics. Springer, New York (2006)
Bruna, J., Mallat, S.: Invariant scattering convolution networks. TPAMI 35(8), 1872–1886 (2013)
Cai, D., He, X., Han, J.: Document clustering using locality preserving indexing. TKDE 17(12), 1624–1637 (2005)
CaliåSki, T., Harabasz, J.: A dendrite method for cluster analysis. Commun. Stat. 3(1), 1–27 (1974)
Chen, X., Cai, D.: Large scale spectral clustering with landmark-based representation. In: AAAI, pp. 313–318 (2011)
Coates, A., Ng, A., Lee, H.: An analysis of single-layer networks in unsupervised feature learning 15, 215–223 (2011)
Deng, L., Chen, J.: Sequence classification using the high-level features extracted from deep neural networks. In: ICASSP, pp. 6844–6848 (2014)
Ding, C., Li, T., Jordan, M.I.: Convex and semi-nonnegative matrix factorizations. TPAMI 32(1), 45–55 (2010)
Dizaji, K.G., Herandi, A., Huang, H.: Deep clustering via joint convolutional autoencoder embedding and relative entropy minimization. In: ICCV, pp. 5747–5756 (2017)
Doersch, C., Singh, S., Gupta, A., Sivic, J., Efros, A.A.: What makes paris look like paris? ACM Trans. Graph. 31(4), 101:1–101:9 (2012)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Heigold, G., Ney, H., Lehnen, P., Gass, T., Schluter, R.: Equivalence of generative and log-linear models. IEEE Trans. Audio Speech Lang. Process. 19(5), 1138–1148 (2011)
Heigold, G.: A log-linear discriminative modeling framework for speech recognition. Ph.D. dissertation, Rwth Aachen (2010)
Hinton, G., Salakhutdinov, R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
Law, M.T., Urtasun, R., Zemel, R.S.: Deep spectral clustering learning. In: ICML, vol. 70, pp. 1985–1994 (2017)
Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Lewis, D.D., Yang, Y., Rose, T.G., Li, F.: RCV1: a new benchmark collection for text categorization research. J. Mach. Learn. Res. 5, 361–397 (2004)
Lloyd, S.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2), 129–137 (1982)
Maaten, L.: Learning a parametric embedding by preserving local structure. In: Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, pp. 384–391 (2009)
Nakayama, H., Harada, T., Kuniyoshi, Y.: Global Gaussian approach for scene categorization using information geometry, pp. 2336–2343 (2010)
Nene, S.A., Nayar, S.K., Murase, H.: Columbia university image library (coil-100) (1996)
Nene, S.A., Nayar, S.K., Murase, H.: Columbia university image library (coil-20) (1996)
Noh, H., Hong, S., Han, B.: Learning deconvolution network for semantic segmentation. In: CVPR, pp. 1520–1528 (2015)
Paulik, M.: Lattice-based training of bottleneck feature extraction neural networks. In: INTERSPEECH (2013)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: NIPS, pp. 91–99 (2015)
Sainath, T.N., Kingsbury, B., Ramabhadran, B.: Auto-encoder bottleneck features using deep belief networks. In: ICASSP, pp. 4153–4156 (2012)
Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: a unified embedding for face recognition and clustering. In: CVPR, pp. 815–823 (2015)
Serra, G., Grana, C., Manfredi, M., Cucchiara, R.: Gold: Gaussians of local descriptors for image representation. Comput. Vis. Image Underst. 134, 22–32 (2015)
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. JMLR 15, 1929–1958 (2014)
Stuhlsatz, A., Lippel, J., Zielke, T.: Feature extraction with deep neural networks by a generalized discriminant analysis. IEEE Trans. Neural Netw. Learn. Syst. 23, 596–608 (2012)
Trigeorgis, G., Bousmalis, K., Zafeiriou, S., Schuller, B.W.: A deep semi-NMF model for learning hidden representations. In: ICML, pp. II-1692–II-1700 (2014)
Tüske, Z., Tahir, M.A., Schlüter, R., Ney, H.: Integrating Gaussian mixtures into deep neural networks: softmax layer with hidden variables. In: ICASSP, pp. 4285–4289 (2015)
Variani, E., Mcdermott, E., Heigold, G.: A Gaussian mixture model layer jointly optimized with discriminative features within a deep neural network architecture. In: ICASSP, pp. 4270–4274 (2015)
Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P.A.: Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. JMLR 11, 3371–3408 (2010)
Wang, J., Wang, G.: Hierarchical spatial sum-product networks for action recognition in still images. IEEE Trans. Circuits Syst. Video Technol. 28(1), 90–100 (2018)
Wang, J., Wang, Z., Tao, D., See, S., Wang, G.: Learning common and specific features for RGB-D semantic segmentation with deconvolutional networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9909, pp. 664–679. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46454-1_40
Wang, Q., Li, P., Zhang, L.: G\(^2\)DeNet: global gaussian distribution embedding network and its application to visual recognition. In: CVPR (2017)
Wang, Q., Li, P., Zuo, W., Zhang, L.: RAID-G: robust estimation of approximate infinite dimensional Gaussian with application to material recognition. In: CVPR, pp. 4433–4441 (2016)
Xie, J., Girshick, R., Farhadi, A.: Unsupervised deep embedding for clustering analysis. In: ICML, pp. 478–487
Xu, W., Liu, X., Gong, Y.: Document clustering based on non-negative matrix factorization. In: Proceedings of the ACM SIGIR 2003, pp. 267–273 (2003)
Yang, B., Fu, X., Sidiropoulos, N.D., Hong, M.: Towards k-means-friendly spaces: simultaneous deep learning and clustering. ICML 70, 3861–3870 (2017)
Yang, J., Parikh, D., Batra, D.: Joint unsupervised learning of deep representations and image clusters. In: CVPR, pp. 5147–5156 (2016)
You, C., Robinson, D.P., Vidal, R.: Scalable sparse subspace clustering by orthogonal matching pursuit. In: CVPR, pp. 3918–3927, June 2016
Zelnik-Manor, L.: Self-tuning spectral clustering. NIPS 17, 1601–1608 (2004)
Zhang, W., Wang, X., Zhao, D., Tang, X.: Graph degree linkage: agglomerative clustering on a directed graph. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7572, pp. 428–441. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33718-5_31
Zhang, W., Zhao, D., Wang, X.: Agglomerative clustering via maximum incremental path integral. Pattern Recognit. 46(11), 3056–3065 (2013)
Acknowledgment
The authors wish to acknowledge the financial support from: (i) Natural Science Foundation China (NSFC) under the Grant No. 61620106008; (ii) Natural Science Foundation China (NSFC) under the Grant No. 61802266; and (iii) Shenzhen Commission for Scientific Research & Innovations under the Grant No. JCYJ20160226191842793.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, J., Jiang, J. (2019). An Unsupervised Deep Learning Framework via Integrated Optimization of Representation Learning and GMM-Based Modeling. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science(), vol 11361. Springer, Cham. https://doi.org/10.1007/978-3-030-20887-5_16
Download citation
DOI: https://doi.org/10.1007/978-3-030-20887-5_16
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20886-8
Online ISBN: 978-3-030-20887-5
eBook Packages: Computer ScienceComputer Science (R0)