Semi-supervised Learning for Convolutional Neural Networks Using Mild Supervisory Signals

  • Takashi Shinozaki
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9950)


We propose a novel semi-supervised learning method for convolutional neural networks (CNNs). The CNN is one of the most popular deep learning models, with successful applications including image and speech recognition, image captioning, and the game of Go. However, supervised learning in CNNs requires a vast amount of labeled data, which is a serious problem. Unsupervised learning, which exploits the information in unlabeled data, might be key to addressing this problem, although it has not been investigated sufficiently for CNNs. The proposed method performs both supervised and unsupervised learning in identical feedforward networks and enables seamless switching between them. We validated the method on an image recognition task. The results showed that learning with unlabeled data dramatically improves the efficiency of supervised learning.


Unsupervised learning · Convolutional neural network · Deep learning



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. CiNet, National Institute of Information and Communications Technology, Koganei, Japan
  2. Graduate School of Information Science and Technology, Osaka University, Suita, Osaka, Japan
