Semi-supervised Deep Learning with Memory

  • Yanbei Chen
  • Xiatian Zhu
  • Shaogang Gong
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11205)

Abstract

We consider the semi-supervised multi-class classification problem of learning from sparse labelled and abundant unlabelled training data. To address this problem, existing semi-supervised deep learning methods often rely on the up-to-date “network-in-training” to formulate the semi-supervised learning objective. This ignores both the discriminative feature representation and the model inference uncertainty revealed by the network in the preceding learning iterations, referred to as the memory of model learning. In this work, we propose a novel Memory-Assisted Deep Neural Network (MA-DNN) capable of exploiting the memory of model learning to enable semi-supervised learning. Specifically, we introduce a memory mechanism into the network training process as an assimilation-accommodation interaction between the network and an external memory module. Experiments demonstrate the advantages of the proposed MA-DNN model over the state-of-the-art semi-supervised deep learning methods on three image classification benchmark datasets: SVHN, CIFAR10, and CIFAR100.
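To make the assimilation-accommodation idea concrete, here is a minimal illustrative sketch, not the paper's exact design: it assumes the external memory stores one feature prototype per class, that "accommodation" updates a prototype as a running average of labelled features, and that "assimilation" derives a soft class assignment for an unlabelled feature from its similarity to the prototypes. The class name `ClassMemory`, the momentum update, and the distance-based softmax are all assumptions introduced for illustration.

```python
import numpy as np

class ClassMemory:
    """Hypothetical per-class memory module (illustrative only).

    Stores one feature prototype per class. "Accommodation" blends
    labelled features into the prototypes; "assimilation" reads out
    a soft class assignment for an unlabelled feature.
    """

    def __init__(self, num_classes, feat_dim, momentum=0.9):
        self.prototypes = np.zeros((num_classes, feat_dim))
        self.momentum = momentum

    def accommodate(self, feature, label):
        # Running-average update of the prototype for this class.
        p = self.prototypes[label]
        self.prototypes[label] = self.momentum * p + (1 - self.momentum) * feature

    def assimilate(self, feature, temperature=1.0):
        # Soft assignment over classes from negative squared distance
        # to each prototype, normalised with a softmax.
        d = -np.sum((self.prototypes - feature) ** 2, axis=1) / temperature
        e = np.exp(d - d.max())
        return e / e.sum()

# Toy usage: two labelled features update the memory, then an
# unlabelled feature is softly assigned to the nearest class.
mem = ClassMemory(num_classes=3, feat_dim=4)
mem.accommodate(np.array([1.0, 0.0, 0.0, 0.0]), label=0)
mem.accommodate(np.array([0.0, 1.0, 0.0, 0.0]), label=1)
probs = mem.assimilate(np.array([0.9, 0.1, 0.0, 0.0]))
print(probs.argmax())  # class 0 has the nearest prototype
```

In a semi-supervised training loop, such soft assignments could serve as an additional target distribution for unlabelled samples, which is the general role the abstract attributes to the memory module.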

Keywords

Semi-supervised learning · Neural network with memory

Acknowledgements

This work was partly supported by the China Scholarship Council, Vision Semantics Limited, the Royal Society Newton Advanced Fellowship Programme (NA150459) and Innovate UK Industrial Challenge Project on Developing and Commercialising Intelligent Video Analytics Solutions for Public Safety (98111-571149).

Supplementary material

Supplementary material 1 (PDF, 175 KB)


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Queen Mary University of London, London, UK
  2. Vision Semantics Ltd., London, UK
