Boltzmann Machines

Abstract

Since its invention in 1985, the Boltzmann machine had long been treated by the machine learning community as a model of merely historical significance. The model regained popularity in 2006, when Hinton and collaborators achieved a breakthrough in deep learning in which the restricted Boltzmann machine is the prime component of the deep neural network. In this chapter, we introduce the Boltzmann machine and its reduced form, known as the restricted Boltzmann machine, as well as their learning algorithms. Related topics are also treated.
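
As a concrete illustration of the models named in the abstract: a restricted Boltzmann machine is a two-layer undirected network with binary visible units v and hidden units h, joint energy E(v, h) = -b'v - c'h - v'Wh, and no connections within a layer, so the conditionals P(h|v) and P(v|h) factorize over units. The sketch below trains a Bernoulli-Bernoulli RBM with one-step contrastive divergence (CD-1). It is a minimal illustrative example; the class name, hyperparameters, and toy data are assumptions of this sketch, not code from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)


class RBM:
    """Bernoulli-Bernoulli restricted Boltzmann machine (illustrative sketch)."""

    def __init__(self, n_visible, n_hidden):
        # Small random weights; one bias per visible and per hidden unit.
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def hidden_probs(self, v):
        # P(h_j = 1 | v): factorizes because there are no hidden-hidden links.
        return self._sigmoid(self.c + v @ self.W)

    def visible_probs(self, h):
        # P(v_i = 1 | h): factorizes because there are no visible-visible links.
        return self._sigmoid(self.b + h @ self.W.T)

    def cd1_update(self, v0, lr=0.1):
        """One CD-1 step on a batch of binary visible vectors (rows of v0)."""
        # Positive phase: clamp the data, sample the hidden layer.
        ph0 = self.hidden_probs(v0)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: one Gibbs half-step down and back up.
        pv1 = self.visible_probs(h0)
        ph1 = self.hidden_probs(pv1)
        # Gradient estimate: <v h>_data minus <v h>_reconstruction.
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b += lr * (v0 - pv1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)


# Toy usage: learn a single repeated 6-bit pattern.
data = np.tile([1.0, 0.0, 1.0, 0.0, 1.0, 0.0], (100, 1))
rbm = RBM(n_visible=6, n_hidden=3)
for _ in range(200):
    rbm.cd1_update(data)
# The reconstruction of the training pattern should be close to the data.
print(np.round(rbm.visible_probs(rbm.hidden_probs(data[:1])), 2))
```

CD-1 replaces the intractable model expectation in the log-likelihood gradient, <v h>_data - <v h>_model, with a single Gibbs step from the data; variants such as persistent contrastive divergence refine this negative-phase estimate.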

Copyright information

© 2019 Springer-Verlag London Ltd., part of Springer Nature

About this chapter

Cite this chapter

Du, KL., Swamy, M.N.S. (2019). Boltzmann Machines. In: Neural Networks and Statistical Learning. Springer, London. https://doi.org/10.1007/978-1-4471-7452-3_23
