Generative Models for Deep Learning with Very Scarce Data

  • Juan Maroñas
  • Roberto Paredes
  • Daniel Ramos
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11401)


This paper addresses a data scarcity scenario in which deep learning techniques tend to fail. We compare two well-established techniques, Restricted Boltzmann Machines (RBMs) and Variational Auto-encoders (VAEs), as generative models used to enlarge the training set in a classification framework. Essentially, we rely on Markov Chain Monte Carlo (MCMC) algorithms to generate new samples. We show that this methodology improves generalization compared with other state-of-the-art techniques, e.g. semi-supervised learning with ladder networks. Furthermore, we show that the RBM is better than the VAE at generating new samples for training a classifier with good generalization capabilities.
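As an illustration of the MCMC-based generation mentioned above, the following is a minimal sketch of block Gibbs sampling in a binary RBM, started from an existing training sample. It is not the authors' implementation: the function names, the layer sizes, the number of Gibbs steps and the randomly initialised (untrained) weights are all assumptions made for the example. In practice the weights would come from an RBM trained on the scarce data set.

```python
import math
import random


def sigmoid(x):
    """Logistic function used for the RBM conditional probabilities."""
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, c, rng):
    """Sample h ~ p(h | v), with p(h_j = 1 | v) = sigmoid(c_j + sum_i v_i W[i][j])."""
    return [
        1 if rng.random() < sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v)))) else 0
        for j in range(len(c))
    ]


def sample_visible(h, W, b, rng):
    """Sample v ~ p(v | h), with p(v_i = 1 | h) = sigmoid(b_i + sum_j W[i][j] h_j)."""
    return [
        1 if rng.random() < sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h)))) else 0
        for i in range(len(b))
    ]


def gibbs_augment(seed_sample, W, b, c, n_steps, rng):
    """Run n_steps of block Gibbs sampling starting at a training sample."""
    v = list(seed_sample)
    for _ in range(n_steps):
        h = sample_hidden(v, W, c, rng)
        v = sample_visible(h, W, b, rng)
    return v


# Hypothetical toy setup: 6 binary visible units, 3 hidden units.
rng = random.Random(0)
n_visible, n_hidden = 6, 3
# Assumption: small random weights stand in for a trained RBM.
W = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)] for _ in range(n_visible)]
b = [0.0] * n_visible  # visible biases
c = [0.0] * n_hidden   # hidden biases

seed = [1, 0, 1, 1, 0, 0]  # one (made-up) binary training sample
augmented = [gibbs_augment(seed, W, b, c, n_steps=5, rng=rng) for _ in range(4)]
```

Each entry of `augmented` is a new binary vector obtained by running a short Markov chain from the seed sample. How many steps to run, and how generated samples are labelled before being added to the training set, are design choices that this sketch leaves open.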


Keywords: Data scarcity; Generative models; Data augmentation; Markov Chain Monte Carlo algorithms



We gratefully acknowledge the support of NVIDIA Corporation with the donation of two Titan Xp GPUs used for this research. The work of Daniel Ramos has been supported by the Spanish Ministry of Education through project TEC2015-68172-C2-1-P. Juan Maroñas is supported by grant FPI-UPV.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Pattern Recognition and Human Language Technology, Universitat Politècnica de València, Valencia, Spain
  2. AUDIAS, Universidad Autónoma de Madrid, Madrid, Spain
