Abstract
Since its invention in 1985, the Boltzmann machine had long been treated as a model of merely historical significance by the machine learning community. In 2006, the model began to regain popularity when Hinton and collaborators achieved a breakthrough in deep learning, in which the restricted Boltzmann machine serves as the prime building block of the deep neural network. In this chapter, we introduce the Boltzmann machine and its reduced form, known as the restricted Boltzmann machine, together with their learning algorithms. Related topics are also treated.
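Since the abstract highlights the learning algorithms covered in the chapter, the following is a minimal sketch of one-step contrastive divergence (CD-1) training for a binary restricted Boltzmann machine (Hinton, 2002). The class layout, hyperparameters, and toy data are illustrative assumptions, not the chapter's own notation or implementation.

```python
# Minimal illustrative sketch of CD-1 for a binary RBM (not the chapter's code).
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    def __init__(self, n_visible, n_hidden, lr=0.1):
        # Small random weights; separate biases for visible and hidden layers.
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible biases
        self.c = np.zeros(n_hidden)    # hidden biases
        self.lr = lr

    def sample_h(self, v):
        # P(h=1 | v) and a binary sample of the hidden units.
        p = sigmoid(v @ self.W + self.c)
        return p, (rng.random(p.shape) < p).astype(float)

    def sample_v(self, h):
        # P(v=1 | h) and a binary sample of the visible units.
        p = sigmoid(h @ self.W.T + self.b)
        return p, (rng.random(p.shape) < p).astype(float)

    def cd1_update(self, v0):
        # Positive phase: clamp the data, sample the hidden units.
        ph0, h0 = self.sample_h(v0)
        # Negative phase: one Gibbs step back to a reconstruction.
        pv1, v1 = self.sample_v(h0)
        ph1, _ = self.sample_h(v1)
        # Approximate likelihood gradient: data statistics minus model statistics.
        n = v0.shape[0]
        self.W += self.lr * (v0.T @ ph0 - v1.T @ ph1) / n
        self.b += self.lr * (v0 - v1).mean(axis=0)
        self.c += self.lr * (ph0 - ph1).mean(axis=0)

# Toy usage on random binary data (purely for illustration).
data = (rng.random((100, 6)) < 0.5).astype(float)
rbm = RBM(n_visible=6, n_hidden=4)
for epoch in range(10):
    rbm.cd1_update(data)
```

The point of CD-1 is that it replaces the intractable model expectation in the log-likelihood gradient with statistics from a single Gibbs step started at the data, which is what makes restricted Boltzmann machine training practical.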
References
Ackley, D. H., Hinton, G. E., & Sejnowski, T. J. (1985). A learning algorithm for Boltzmann machines. Cognitive Science, 9, 147–169.
Akiyama, Y., Yamashita, A., Kajiura, M., & Aiso, H. (1989). Combinatorial optimization with Gaussian machines. In Proceedings of International Joint Conference on Neural Networks (pp. 533–540). Washington, DC.
Attias, H. (1999). Inferring parameters and structure of latent variable models by variational Bayes. In Proceedings of the 15th Annual Conference on Uncertainty in Artificial Intelligence (pp. 21–30).
Azencott, R., Doutriaux, A., & Younes, L. (1993). Synchronous Boltzmann machines and curve identification tasks. Network, 4, 461–480.
Baldi, P., & Pineda, F. (1991). Contrastive learning and neural oscillations. Neural Computation, 3(4), 526–545.
Baldi, P., & Sadowski, P. (2014). The dropout learning algorithm. Artificial Intelligence, 210, 78–122.
Barra, A., Bernacchia, A., Santucci, E., & Contucci, P. (2012). On the equivalence of Hopfield networks and Boltzmann machines. Neural Networks, 34, 1–9.
Bengio, Y. (2009). Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1), 1–127.
Bengio, Y., & Delalleau, O. (2009). Justifying and generalizing contrastive divergence. Neural Computation, 21(6), 1601–1621.
Brugge, K., Fischer, A., & Igel, C. (2013). The flip-the-state transition operator for restricted Boltzmann machines. Machine Learning, 93(1), 53–69.
Carreira-Perpinan, M. A., & Hinton, G. E. (2005). On contrastive divergence learning. In Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics (pp. 59–66).
Cho, K. H., Raiko, T., & Ilin, A. (2013). Gaussian–Bernoulli deep Boltzmann machine. In Proceedings of International Joint Conference on Neural Networks (IJCNN) (pp. 1–7).
Cote, M. A., & Larochelle, H. (2016). An infinite restricted Boltzmann machine. Neural Computation, 28, 1265–1289.
Del Genio, C. I., Gross, T., & Bassler, K. E. (2011). All scale-free networks are sparse. Physical Review Letters, 107, Paper No. 178701.
Desjardins, G., Courville, A., Bengio, Y., Vincent, P., & Delalleau, O. (2010). Parallel tempering for training of restricted Boltzmann machines. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS’10) (pp. 145–152).
Detorakis, G., Bartley, T., & Neftci, E. (2019). Contrastive Hebbian learning with random feedback weights. Neural Networks, 114, 1–14.
Elfwing, S., Uchibe, E., & Doya, K. (2015). Expected energy-based restricted Boltzmann machine for classification. Neural Networks, 64, 29–38.
Fischer, A., & Igel, C. (2011). Bounding the bias of contrastive divergence learning. Neural Computation, 23(3), 664–673.
Gabrie, M., Tramel, E. W., & Krzakala, F. (2015). Training restricted Boltzmann machines via the Thouless–Anderson–Palmer free energy. In Advances in neural information processing systems (pp. 640–648).
Galland, C. C. (1993). The limitations of deterministic Boltzmann machine learning. Network, 4, 355–380.
Glauber, R. J. (1963). Time-dependent statistics of the Ising model. Journal of Mathematical Physics, 4, 294–307.
Hartman, E. (1991). A high storage capacity neural network content-addressable memory. Network, 2, 315–334.
Haykin, S. (1999). Neural networks: A comprehensive foundation (2nd ed.). Upper Saddle River, NJ: Prentice Hall.
Hinton, G. E. (1989). Deterministic Boltzmann learning performs steepest descent in weight-space. Neural Computation, 1, 143–150.
Hinton, G. E. (2002). Training products of experts by minimizing contrastive divergence. Neural Computation, 14(8), 1771–1800.
Hinton, G. E., Osindero, S., & Teh, Y.-W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18(7), 1527–1554.
Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786), 504–507.
Hinton, G. E., & Sejnowski, T. J. (1986). Learning and relearning in Boltzmann machines. In D. E. Rumelhart & J. L. McClelland (Eds.), Parallel distributed processing: Explorations in microstructure of cognition (Vol. 1, pp. 282–317). Cambridge, MA: MIT Press.
Igel, C., Glasmachers, T., & Heidrich-Meisner, V. (2008). Shark. Journal of Machine Learning Research, 9, 993–996.
Kam, M., & Cheng, R. (1989). Convergence and pattern stabilization in the Boltzmann machine. In D. S. Touretzky (Ed.), Advances in neural information processing systems (Vol. 1, pp. 511–518). San Mateo, CA: Morgan Kaufmann.
Kappen, H. J., & Rodriguez, F. B. (1998). Efficient learning in Boltzmann machine using linear response theory. Neural Computation, 10, 1137–1156.
Kurita, N., & Funahashi, K. I. (1996). On the Hopfield neural networks and mean field theory. Neural Networks, 9, 1531–1540.
Larochelle, H., & Bengio, Y. (2008). Classification using discriminative restricted Boltzmann machines. In Proceedings of the 25th International Conference on Machine Learning (pp. 536–543). Helsinki, Finland.
Le Roux, N., & Bengio, Y. (2008). Representational power of restricted Boltzmann machines and deep belief networks. Neural Computation, 20(6), 1631–1649.
Levy, B. C., & Adams, M. B. (1987). Global optimization with stochastic neural networks. In Proceedings of the 1st IEEE Conference on Neural Networks (Vol. 3, pp. 681–689). San Diego, CA.
Lillicrap, T. P., Cownden, D., Tweed, D. B., & Akerman, C. J. (2016). Random synaptic feedback weights support error backpropagation for deep learning. Nature Communications, 7, Paper No. 13276.
Lin, C. T., & Lee, C. S. G. (1995). A multi-valued Boltzmann machine. IEEE Transactions on Systems, Man, and Cybernetics, 25(4), 660–669.
Mocanu, D. C., Mocanu, E., Nguyen, P. H., Gibescu, M., & Liotta, A. (2016). A topological insight into restricted Boltzmann machines. Machine Learning, 104(2), 243–270.
Montufar, G., Ay, N., & Ghazi-Zahedi, K. (2015). Geometry and expressive power of conditional restricted Boltzmann machines. Journal of Machine Learning Research, 16, 2405–2436.
Nair, V., & Hinton, G. E. (2010). Rectified linear units improve restricted Boltzmann machines. In Proceedings of the International Conference on Machine Learning (ICML) (pp. 807–814).
Neftci, E., Das, S., Pedroni, B., Kreutz-Delgado, K., & Cauwenberghs, G. (2014). Event-driven contrastive divergence for spiking neuromorphic systems. Frontiers in Neuroscience, 8, 1–14.
Odense, S., & Edwards, R. (2016). Universal approximation results for the temporal restricted Boltzmann machine and the recurrent temporal restricted Boltzmann machine. Journal of Machine Learning Research, 17, 1–21.
Peng, X., Gao, X., & Li, X. (2018). On better training the infinite restricted Boltzmann machines. Machine Learning, 107(6), 943–968.
Peterson, C., & Anderson, J. R. (1987). A mean field theory learning algorithm for neural networks. Complex Systems, 1(5), 995–1019.
Ranzato, M. A., Krizhevsky, A., & Hinton, G. E. (2010). Factored 3-way restricted Boltzmann machines for modeling natural images. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS) (pp. 621–628). Sardinia, Italy.
Romero, E., Mazzanti, F., Delgado, J., & Buchaca, D. (2019). Weighted contrastive divergence. Neural Networks, 114, 147–156.
Salakhutdinov, R., & Hinton, G. (2009). Replicated softmax: An undirected topic model. In Advances in neural information processing systems (Vol. 22, pp. 1607–1614). Vancouver, Canada.
Sankar, A. R., & Balasubramanian, V. N. (2015). Similarity-based contrastive divergence methods for energy-based deep learning models. In JMLR Workshop and Conference Proceedings (Vol. 45, pp. 391–406).
Smolensky, P. (1986). Information processing in dynamical systems: Foundations of harmony theory. In D. E. Rumelhart, J. L. McClelland, & the PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition (Vol. 1, pp. 194–281). Cambridge, MA: MIT Press.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15, 1929–1958.
Szu, H. H., & Hartley, R. L. (1987). Nonconvex optimization by fast simulated annealing. Proceedings of the IEEE, 75, 1538–1540.
Taylor, G. W., Hinton, G. E., & Roweis, S. T. (2011). Two distributed-state models for generating high-dimensional time series. Journal of Machine Learning Research, 12, 1025–1068.
Thouless, D. J., Anderson, P. W., & Palmer, R. G. (1977). Solution of “solvable model of a spin glass”. Philosophical Magazine, 35(3), 593–601.
Tieleman, T. (2008). Training restricted Boltzmann machines using approximations to the likelihood gradient. In W. W. Cohen, A. McCallum, & S. T. Roweis (Eds.), Proceedings of the 25th International Conference on Machine Learning (pp. 1064–1071). New York: ACM.
Tieleman, T., & Hinton, G. E. (2009). Using fast weights to improve persistent contrastive divergence. In A. P. Danyluk, L. Bottou, & M. L. Littman (Eds.), Proceedings of the 26th Annual International Conference on Machine Learning (pp. 1033–1040). New York: ACM.
Welling, M., Rosen-Zvi, M., & Hinton, G. (2004). Exponential family harmoniums with an application to information retrieval. In Advances in neural information processing systems (Vol. 17, pp. 1481–1488).
Wu, J. M. (2004). Annealing by two sets of interactive dynamics. IEEE Transactions on Systems, Man, and Cybernetics Part B, 34(3), 1519–1525.
Xie, X., & Seung, H. S. (2003). Equivalence of backpropagation and contrastive Hebbian learning in a layered network. Neural Computation, 15(2), 441–454.
Yasuda, M., & Tanaka, K. (2009). Approximate learning algorithm in Boltzmann machines. Neural Computation, 21, 3130–3178.
Younes, L. (1996). Synchronous Boltzmann machines can be universal approximators. Applied Mathematics Letters, 9(3), 109–113.
Copyright information
© 2019 Springer-Verlag London Ltd., part of Springer Nature
Cite this chapter
Du, K.-L., & Swamy, M. N. S. (2019). Boltzmann machines. In Neural networks and statistical learning. London: Springer. https://doi.org/10.1007/978-1-4471-7452-3_23
Publisher Name: Springer, London
Print ISBN: 978-1-4471-7451-6
Online ISBN: 978-1-4471-7452-3