Abstract
We propose a novel semi-supervised learning method for convolutional neural networks (CNNs). CNNs are among the most popular models in deep learning, with successes spanning applications from image and speech recognition to image captioning and the game of Go. However, supervised learning in CNNs requires a vast amount of labeled data, which is a serious limitation. Unsupervised learning, which exploits the information in unlabeled data, may be key to addressing this problem, although it has not been investigated sufficiently for CNNs. The proposed method performs both supervised and unsupervised learning in an identical feedforward network and enables seamless switching between the two. We validated the method on an image recognition task. The results show that learning with unlabeled data dramatically improves the efficiency of supervised learning.
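The sketch below is a rough illustration of this kind of setup, not the authors' implementation (whose network architecture and unsupervised objective are described in the full paper). It assumes a single convolutional trunk shared by a supervised classification head and a hypothetical reconstruction head, and switches between the supervised and unsupervised objectives depending on whether a batch carries labels.

```python
# Minimal sketch (assumption, not the paper's method): one feedforward CNN
# trained with a supervised cross-entropy loss on labeled batches and an
# unsupervised reconstruction loss on unlabeled batches.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Shared convolutional trunk (hypothetical sizes for 28x28 grayscale input).
        self.conv1 = nn.Conv2d(1, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.fc = nn.Linear(32 * 7 * 7, num_classes)   # supervised head
        self.decoder = nn.Linear(32 * 7 * 7, 28 * 28)  # unsupervised head

    def features(self, x):
        h = F.max_pool2d(F.relu(self.conv1(x)), 2)
        h = F.max_pool2d(F.relu(self.conv2(h)), 2)
        return h.flatten(1)

    def forward(self, x):
        return self.fc(self.features(x))

def training_step(model, optimizer, x, y=None):
    """One step: supervised if labels are given, unsupervised otherwise."""
    optimizer.zero_grad()
    h = model.features(x)
    if y is not None:
        loss = F.cross_entropy(model.fc(h), y)   # supervised branch
    else:
        recon = model.decoder(h)                 # unsupervised branch
        loss = F.mse_loss(recon, x.flatten(1))
    loss.backward()
    optimizer.step()
    return loss.item()
```

In such a scheme, labeled and unlabeled mini-batches pass through the same feedforward network, so switching between the two modes amounts to selecting which loss is backpropagated.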
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Shinozaki, T. (2016). Semi-supervised Learning for Convolutional Neural Networks Using Mild Supervisory Signals. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds) Neural Information Processing. ICONIP 2016. Lecture Notes in Computer Science, vol 9950. Springer, Cham. https://doi.org/10.1007/978-3-319-46681-1_46
DOI: https://doi.org/10.1007/978-3-319-46681-1_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46680-4
Online ISBN: 978-3-319-46681-1
eBook Packages: Computer Science (R0)