An Overview of Restricted Boltzmann Machines

Upadhya, Vidyadhar; Sastry, P. S.

doi:10.1007/s41745-019-0102-z

Review Article
Published: 18 February 2019

Volume 99, pages 225–236, (2019)

Journal of the Indian Institute of Science

Abstract

The restricted Boltzmann machine (RBM) is a two-layer network of stochastic units with undirected connections between pairs of units in the two layers. The two layers are called the visible and hidden layers. An RBM has no visible-to-visible or hidden-to-hidden connections. RBMs are used mainly as generative models, but they can also be suitably modified to perform classification tasks. They are among the basic building blocks of other deep learning models such as deep Boltzmann machines and deep belief networks. The aim of this article is to give a tutorial introduction to restricted Boltzmann machines and to review the evolution of this model.
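
The structure described above can be illustrated with a minimal sketch; it is not taken from the article. It assumes binary (0/1) units and the usual energy function for that case, E(v, h) = -b·v - c·h - vᵀWh. The class name `BinaryRBM`, its method names, and the weight initialisation scale are hypothetical choices made only for this example.

```python
import math
import random


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


class BinaryRBM:
    """Toy RBM with binary visible and hidden units (an assumption of this sketch)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.n_visible = n_visible
        self.n_hidden = n_hidden
        # W[i][j] couples visible unit i to hidden unit j. There are no
        # visible-visible or hidden-hidden weights: that is the restriction.
        self.W = [[self.rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible biases
        self.c = [0.0] * n_hidden   # hidden biases

    def energy(self, v, h):
        """E(v, h) = -sum_i b_i v_i - sum_j c_j h_j - sum_ij v_i W_ij h_j."""
        e = -sum(bi * vi for bi, vi in zip(self.b, v))
        e -= sum(cj * hj for cj, hj in zip(self.c, h))
        e -= sum(v[i] * self.W[i][j] * h[j]
                 for i in range(self.n_visible)
                 for j in range(self.n_hidden))
        return e

    def p_hidden(self, v):
        """P(h_j = 1 | v) for each hidden unit; units are independent given v."""
        return [sigmoid(self.c[j] + sum(v[i] * self.W[i][j]
                                        for i in range(self.n_visible)))
                for j in range(self.n_hidden)]

    def p_visible(self, h):
        """P(v_i = 1 | h) for each visible unit; units are independent given h."""
        return [sigmoid(self.b[i] + sum(self.W[i][j] * h[j]
                                        for j in range(self.n_hidden)))
                for i in range(self.n_visible)]

    def sample(self, probs):
        """Draw one independent 0/1 value per probability."""
        return [1 if self.rng.random() < p else 0 for p in probs]

    def gibbs_step(self, v):
        """One block Gibbs step: sample h given v, then v given h."""
        h = self.sample(self.p_hidden(v))
        v_new = self.sample(self.p_visible(h))
        return v_new, h


# Usage: one Gibbs step from an arbitrary visible configuration.
rbm = BinaryRBM(n_visible=4, n_hidden=3)
v1, h0 = rbm.gibbs_step([1, 0, 1, 0])
```

Because there are no connections within a layer, each layer can be sampled in one block given the other, which is why `p_hidden` and `p_visible` factorise over units in this sketch.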

Author information

Authors and Affiliations

Department of Electrical Engineering, Indian Institute of Science, Bangalore, India
Vidyadhar Upadhya & P. S. Sastry

Corresponding author

Correspondence to P. S. Sastry.

Additional information

This article belongs to the Special issue—Recent Advances in Machine Learning.

About this article

Cite this article

Upadhya, V., Sastry, P.S. An Overview of Restricted Boltzmann Machines. J Indian Inst Sci 99, 225–236 (2019). https://doi.org/10.1007/s41745-019-0102-z

Received: 26 December 2018
Accepted: 30 January 2019
Published: 18 February 2019
Issue Date: 01 June 2019
DOI: https://doi.org/10.1007/s41745-019-0102-z
