Abstract
Restricted Boltzmann machines (RBMs) are probabilistic graphical models that can be interpreted as stochastic neural networks. The increase in computational power and the development of faster learning algorithms have made them applicable to relevant machine learning problems. They have attracted much attention recently after being proposed as building blocks of multi-layer learning systems called deep belief networks. This tutorial introduces RBMs as undirected graphical models. The basic concepts of graphical models are introduced first; however, basic knowledge of statistics is presumed. Different learning algorithms for RBMs are discussed. As most of them are based on Markov chain Monte Carlo (MCMC) methods, an introduction to Markov chains and the required MCMC techniques is provided.
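Since the learning algorithms surveyed in the tutorial rest on Gibbs sampling in a binary RBM, a minimal sketch may help fix ideas. The following code (not from the paper; the layer sizes, learning rate, and toy data are invented for illustration) samples from the factorized conditionals p(h | v) and p(v | h) and applies the one-step contrastive divergence (CD-1) update of Hinton (2002):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Small binary RBM: 6 visible, 4 hidden units (illustrative sizes).
n_vis, n_hid = 6, 4
W = 0.01 * rng.standard_normal((n_vis, n_hid))  # weight matrix
b = np.zeros(n_vis)                             # visible biases
c = np.zeros(n_hid)                             # hidden biases

def sample_h(v):
    """p(h_j = 1 | v) = sigmoid(c_j + sum_i v_i W_ij); sample all h_j in parallel."""
    p = sigmoid(c + v @ W)
    return p, (rng.random(n_hid) < p).astype(float)

def sample_v(h):
    """p(v_i = 1 | h) = sigmoid(b_i + sum_j W_ij h_j); sample all v_i in parallel."""
    p = sigmoid(b + W @ h)
    return p, (rng.random(n_vis) < p).astype(float)

def cd1_update(v0, lr=0.1):
    """One CD-1 step: positive phase, one block-Gibbs step, negative phase."""
    global W, b, c
    ph0, h0 = sample_h(v0)   # positive phase statistics
    pv1, v1 = sample_v(h0)   # reconstruct the visible layer
    ph1, _ = sample_h(v1)    # hidden probabilities at the reconstruction
    W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
    b += lr * (v0 - v1)
    c += lr * (ph0 - ph1)

# Toy training set: a single repeated binary pattern.
pattern = np.array([1., 1., 1., 0., 0., 0.])
for _ in range(500):
    cd1_update(pattern)

# After training, one Gibbs step from the pattern should reconstruct it closely.
_, h = sample_h(pattern)
pv, _ = sample_v(h)
print(np.round(pv, 2))
```

Because the RBM's bipartite structure makes the hidden units conditionally independent given the visibles (and vice versa), each whole layer can be sampled in one vectorized step, which is what makes block Gibbs sampling cheap here.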
Keywords
- Markov Chain
- Markov Chain Monte Carlo
- Gibbs Sampling
- Markov Chain Monte Carlo Method
- Restricted Boltzmann Machine
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Fischer, A., Igel, C. (2012). An Introduction to Restricted Boltzmann Machines. In: Alvarez, L., Mejail, M., Gomez, L., Jacobo, J. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2012. Lecture Notes in Computer Science, vol 7441. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33275-3_2
Print ISBN: 978-3-642-33274-6
Online ISBN: 978-3-642-33275-3