Abstract
Since its invention in 1985, the Boltzmann machine had long been treated as a model of merely historical significance by the machine learning community. In 2006, the model began to regain popularity when Hinton and collaborators achieved a breakthrough in deep learning, in which the restricted Boltzmann machine serves as the prime building block of the deep neural network. In this chapter, we introduce the Boltzmann machine and its reduced form, known as the restricted Boltzmann machine, together with their learning algorithms. Related topics are also treated.
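Since the abstract highlights the learning algorithms covered in the chapter, the following is a minimal sketch of one-step contrastive divergence (CD-1) training for a binary restricted Boltzmann machine (Hinton, 2002). The class layout, hyperparameters, and toy data are illustrative assumptions, not the chapter's own notation or implementation.

```python
# Minimal illustrative sketch of CD-1 for a binary RBM (not the chapter's code).
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    def __init__(self, n_visible, n_hidden, lr=0.1):
        # Small random weights; separate biases for visible and hidden layers.
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible biases
        self.c = np.zeros(n_hidden)    # hidden biases
        self.lr = lr

    def sample_h(self, v):
        # P(h=1 | v) and a binary sample of the hidden units.
        p = sigmoid(v @ self.W + self.c)
        return p, (rng.random(p.shape) < p).astype(float)

    def sample_v(self, h):
        # P(v=1 | h) and a binary sample of the visible units.
        p = sigmoid(h @ self.W.T + self.b)
        return p, (rng.random(p.shape) < p).astype(float)

    def cd1_update(self, v0):
        # Positive phase: clamp the data, sample the hidden units.
        ph0, h0 = self.sample_h(v0)
        # Negative phase: one Gibbs step back to a reconstruction.
        pv1, v1 = self.sample_v(h0)
        ph1, _ = self.sample_h(v1)
        # Approximate likelihood gradient: data statistics minus model statistics.
        n = v0.shape[0]
        self.W += self.lr * (v0.T @ ph0 - v1.T @ ph1) / n
        self.b += self.lr * (v0 - v1).mean(axis=0)
        self.c += self.lr * (ph0 - ph1).mean(axis=0)

# Toy usage on random binary data (purely for illustration).
data = (rng.random((100, 6)) < 0.5).astype(float)
rbm = RBM(n_visible=6, n_hidden=4)
for epoch in range(10):
    rbm.cd1_update(data)
```

The point of CD-1 is that it replaces the intractable model expectation in the log-likelihood gradient with statistics from a single Gibbs step started at the data, which is what makes restricted Boltzmann machine training practical.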
References
Ackley, D. H., Hinton, G. E., & Sejnowski, T. J. (1985). A learning algorithm for Boltzmann machines. Cognitive Science, 9, 147–169.
Akiyama, Y., Yamashita, A., Kajiura, M., & Aiso, H. (1989). Combinatorial optimization with Gaussian machines. In Proceedings of International Joint Conference on Neural Networks (pp. 533–540). Washington, DC.
Attias, H. (1999). Inferring parameters and structure of latent variable models by variational Bayes. In Proceedings of the 15th Annual Conference on Uncertainty in Artificial Intelligence (pp. 21–30).
Azencott, R., Doutriaux, A., & Younes, L. (1993). Synchronous Boltzmann machines and curve identification tasks. Network, 4, 461–480.
Baldi, P., & Pineda, F. (1991). Contrastive learning and neural oscillations. Neural Computation, 3(4), 526–545.
Baldi, P., & Sadowski, P. (2014). The dropout learning algorithm. Artificial Intelligence, 210, 78–122.
Barra, A., Bernacchia, A., Santucci, E., & Contucci, P. (2012). On the equivalence of Hopfield networks and Boltzmann machines. Neural Networks, 34, 1–9.
Bengio, Y. (2009). Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1), 1–127.
Bengio, Y., & Delalleau, O. (2009). Justifying and generalizing contrastive divergence. Neural Computation, 21(6), 1601–1621.
Brugge, K., Fischer, A., & Igel, C. (2013). The flip-the-state transition operator for restricted Boltzmann machines. Machine Learning, 93(1), 53–69.
Carreira-Perpinan, M. A., & Hinton, G. E. (2005). On contrastive divergence learning. In Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics (pp. 59–66).
Cho, K. H., Raiko, T., & Ilin, A. (2013). Gaussian–Bernoulli deep Boltzmann machine. In Proceedings of International Joint Conference on Neural Networks (IJCNN) (pp. 1–7).
Cote, M. A., & Larochelle, H. (2016). An infinite restricted Boltzmann machine. Neural Computation, 28, 1265–1289.
Del Genio, C. I., Gross, T., & Bassler, K. E. (2011). All scale-free networks are sparse. Physical Review Letters, 107, Paper No. 178701.
Desjardins, G., Courville, A., Bengio, Y., Vincent, P., & Delalleau, O. (2010). Parallel tempering for training of restricted Boltzmann machines. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS’10) (pp. 145–152).
Detorakis, G., Bartley, T., & Neftci, E. (2019). Contrastive Hebbian learning with random feedback weights. Neural Networks, 114, 1–14.
Elfwing, S., Uchibe, E., & Doya, K. (2015). Expected energy-based restricted Boltzmann machine for classification. Neural Networks, 64, 29–38.
Fischer, A., & Igel, C. (2011). Bounding the bias of contrastive divergence learning. Neural Computation, 23(3), 664–673.
Gabrie, M., Tramel, E. W., & Krzakala, F. (2015). Training restricted Boltzmann machines via the Thouless–Anderson–Palmer free energy. In Advances in neural information processing systems (pp. 640–648).
Galland, C. C. (1993). The limitations of deterministic Boltzmann machine learning. Network, 4, 355–380.
Glauber, R. J. (1963). Time-dependent statistics of the Ising model. Journal of Mathematical Physics, 4, 294–307.
Hartman, E. (1991). A high storage capacity neural network content-addressable memory. Network, 2, 315–334.
Haykin, S. (1999). Neural networks: A comprehensive foundation (2nd ed.). Upper Saddle River, NJ: Prentice Hall.
Hinton, G. E. (1989). Deterministic Boltzmann learning performs steepest descent in weight-space. Neural Computation, 1, 143–150.
Hinton, G. E. (2002). Training products of experts by minimizing contrastive divergence. Neural Computation, 14(8), 1771–1800.
Hinton, G. E., Osindero, S., & Teh, Y.-W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18(7), 1527–1554.
Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786), 504–507.
Hinton, G. E., & Sejnowski, T. J. (1986). Learning and relearning in Boltzmann machines. In D. E. Rumelhart & J. L. McClelland (Eds.), Parallel distributed processing: Explorations in microstructure of cognition (Vol. 1, pp. 282–317). Cambridge, MA: MIT Press.
Igel, C., Glasmachers, T., & Heidrich-Meisner, V. (2008). Shark. Journal of Machine Learning Research, 9, 993–996.
Kam, M., & Cheng, R. (1989). Convergence and pattern stabilization in the Boltzmann machine. In D. S. Touretzky (Ed.), Advances in neural information processing systems (Vol. 1, pp. 511–518). San Mateo, CA: Morgan Kaufmann.
Kappen, H. J., & Rodriguez, F. B. (1998). Efficient learning in Boltzmann machine using linear response theory. Neural Computation, 10, 1137–1156.
Kurita, N., & Funahashi, K. I. (1996). On the Hopfield neural networks and mean field theory. Neural Networks, 9, 1531–1540.
Larochelle, H., & Bengio, Y. (2008). Classification using discriminative restricted Boltzmann machines. In Proceedings of the 25th International Conference on Machine Learning (pp. 536–543). Helsinki, Finland.
Le Roux, N., & Bengio, Y. (2008). Representational power of restricted Boltzmann machines and deep belief networks. Neural Computation, 20(6), 1631–1649.
Levy, B. C., & Adams, M. B. (1987). Global optimization with stochastic neural networks. In Proceedings of the 1st IEEE Conference on Neural Networks (Vol. 3, pp. 681–689). San Diego, CA.
Lillicrap, T. P., Cownden, D., Tweed, D. B., & Akerman, C. J. (2016). Random synaptic feedback weights support error backpropagation for deep learning. Nature Communications, 7, Paper No. 13276.
Lin, C. T., & Lee, C. S. G. (1995). A multi-valued Boltzmann machine. IEEE Transactions on Systems, Man, and Cybernetics, 25(4), 660–669.
Mocanu, D. C., Mocanu, E., Nguyen, P. H., Gibescu, M., & Liotta, A. (2016). A topological insight into restricted Boltzmann machines. Machine Learning, 104(2), 243–270.
Montufar, G., Ay, N., & Ghazi-Zahedi, K. (2015). Geometry and expressive power of conditional restricted Boltzmann machines. Journal of Machine Learning Research, 16, 2405–2436.
Nair, V., & Hinton, G. E. (2010). Rectified linear units improve restricted Boltzmann machines. In Proceedings of the International Conference on Machine Learning (ICML) (pp. 807–814).
Neftci, E., Das, S., Pedroni, B., Kreutz-Delgado, K., & Cauwenberghs, G. (2014). Event-driven contrastive divergence for spiking neuromorphic systems. Frontiers in Neuroscience, 8, 1–14.
Odense, S., & Edwards, R. (2016). Universal approximation results for the temporal restricted Boltzmann machine and the recurrent temporal restricted Boltzmann machine. Journal of Machine Learning Research, 17, 1–21.
Peng, X., Gao, X., & Li, X. (2018). On better training the infinite restricted Boltzmann machines. Machine Learning, 107(6), 943–968.
Peterson, C., & Anderson, J. R. (1987). A mean field theory learning algorithm for neural networks. Complex Systems, 1(5), 995–1019.
Ranzato, M. A., Krizhevsky, A., & Hinton, G. E. (2010). Factored 3-way restricted Boltzmann machines for modeling natural images. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS) (pp. 621–628). Sardinia, Italy.
Romero, E., Mazzanti, F., Delgado, J., & Buchaca, D. (2019). Weighted contrastive divergence. Neural Networks, 114, 147–156.
Salakhutdinov, R., & Hinton, G. (2009). Replicated softmax: An undirected topic model. In Advances in neural information processing systems (Vol. 22, pp. 1607–1614). Vancouver, Canada.
Sankar, A. R., & Balasubramanian, V. N. (2015). Similarity-based contrastive divergence methods for energy-based deep learning models. In JMLR Workshop and Conference Proceedings (Vol. 45, pp. 391–406).
Smolensky, P. (1986). Information processing in dynamical systems: Foundations of harmony theory. In D. E. Rumelhart, J. L. McClelland, & the PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition (Vol. 1, pp. 194–281). Cambridge, MA: MIT Press.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15, 1929–1958.
Szu, H. H., & Hartley, R. L. (1987). Nonconvex optimization by fast simulated annealing. Proceedings of the IEEE, 75, 1538–1540.
Taylor, G. W., Hinton, G. E., & Roweis, S. T. (2011). Two distributed-state models for generating high-dimensional time series. Journal of Machine Learning Research, 12, 1025–1068.
Thouless, D. J., Anderson, P. W., & Palmer, R. G. (1977). Solution of “solvable model of a spin glass”. Philosophical Magazine, 35(3), 593–601.
Tieleman, T. (2008). Training restricted Boltzmann machines using approximations to the likelihood gradient. In W. W. Cohen, A. McCallum, & S. T. Roweis (Eds.), Proceedings of the 25th International Conference on Machine Learning (pp. 1064–1071). New York: ACM.
Tieleman, T., & Hinton, G. E. (2009). Using fast weights to improve persistent contrastive divergence. In A. P. Danyluk, L. Bottou, & M. L. Littman (Eds.), Proceedings of the 26th Annual International Conference on Machine Learning (pp. 1033–1040). New York: ACM.
Welling, M., Rosen-Zvi, M., & Hinton, G. (2004). Exponential family harmoniums with an application to information retrieval. In Advances in neural information processing systems (Vol. 17, pp. 1481–1488).
Wu, J. M. (2004). Annealing by two sets of interactive dynamics. IEEE Transactions on Systems, Man, and Cybernetics Part B, 34(3), 1519–1525.
Xie, X., & Seung, H. S. (2003). Equivalence of backpropagation and contrastive Hebbian learning in a layered network. Neural Computation, 15(2), 441–454.
Yasuda, M., & Tanaka, K. (2009). Approximate learning algorithm in Boltzmann machines. Neural Computation, 21, 3130–3178.
Younes, L. (1996). Synchronous Boltzmann machines can be universal approximators. Applied Mathematics Letters, 9(3), 109–113.
Copyright information
© 2019 Springer-Verlag London Ltd., part of Springer Nature
Cite this chapter
Du, K.-L., & Swamy, M. N. S. (2019). Boltzmann machines. In Neural networks and statistical learning. London: Springer. https://doi.org/10.1007/978-1-4471-7452-3_23
Publisher Name: Springer, London
Print ISBN: 978-1-4471-7451-6
Online ISBN: 978-1-4471-7452-3