
An Overview of Restricted Boltzmann Machines

  • Review Article
  • Published in: Journal of the Indian Institute of Science

Abstract

The restricted Boltzmann machine (RBM) is a two-layered network of stochastic units with undirected connections between pairs of units in the two layers. The two layers of nodes are called visible and hidden nodes. In an RBM, there are no visible-to-visible or hidden-to-hidden connections. RBMs are used mainly as generative models, but they can also be modified to perform classification tasks. They are among the basic building blocks of other deep learning models such as deep Boltzmann machines and deep belief networks. The aim of this article is to give a tutorial introduction to restricted Boltzmann machines and to review the evolution of this model.
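The bipartite structure described above can be made concrete with a minimal sketch. The code below (in NumPy, with illustrative layer sizes chosen here, not taken from the article) implements the standard binary RBM energy function and one step of block Gibbs sampling; blockwise sampling is possible precisely because there are no intra-layer connections, so each layer is conditionally independent given the other.

```python
import numpy as np

rng = np.random.default_rng(0)

n_visible, n_hidden = 6, 4  # illustrative sizes
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # visible-hidden weights
b = np.zeros(n_visible)  # visible biases
c = np.zeros(n_hidden)   # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def energy(v, h):
    # Standard binary RBM energy: E(v, h) = -b'v - c'h - v'Wh
    return -(b @ v + c @ h + v @ W @ h)

def sample_h_given_v(v):
    # Hidden units are conditionally independent given the visible layer,
    # so the whole layer can be sampled in one block.
    p = sigmoid(c + v @ W)
    return (rng.random(n_hidden) < p).astype(float), p

def sample_v_given_h(h):
    # Symmetrically, visible units are independent given the hidden layer.
    p = sigmoid(b + W @ h)
    return (rng.random(n_visible) < p).astype(float), p

# One step of block Gibbs sampling from a random visible configuration.
v = rng.integers(0, 2, n_visible).astype(float)
h, _ = sample_h_given_v(v)
v_new, _ = sample_v_given_h(h)
```

Running the alternating updates above many times yields samples from the model's joint distribution, which is the basis of the sampling-based learning algorithms (such as contrastive divergence) that the article reviews.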


Figures 1–4 (thumbnails not included in this preview)



Author information

Corresponding author

Correspondence to P. S. Sastry.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Special issue—Recent Advances in Machine Learning.


Cite this article

Upadhya, V., Sastry, P.S. An Overview of Restricted Boltzmann Machines. J Indian Inst Sci 99, 225–236 (2019). https://doi.org/10.1007/s41745-019-0102-z
