Restricted Boltzmann Machines

  • Charu C. Aggarwal
Chapter

Abstract

The restricted Boltzmann machine (RBM) is a fundamentally different model from the feed-forward network. Conventional neural networks are input-output mapping networks where a set of inputs is mapped to a set of outputs. On the other hand, RBMs are networks in which the probabilistic states of a network are learned for a set of inputs, which is useful for unsupervised modeling.
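
The contrast drawn above can be illustrated with a small sketch. It assumes a binary RBM with sigmoid conditionals, and the weights `W`, the biases `b` and `c`, and all helper names are hypothetical toy values chosen for illustration, not taken from the chapter. A feed-forward network would return one fixed output for the input `v`. The RBM instead assigns each hidden unit a probability of being on, and a state is sampled from those probabilities.

```python
import math
import random


def sigmoid(x):
    """Logistic function used for the conditional probabilities."""
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, c):
    """P(h_j = 1 | v) for each hidden unit j (binary RBM assumed)."""
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    """P(v_i = 1 | h) for each visible unit i (binary RBM assumed)."""
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]


def sample(probs, rng):
    """Draw one binary state per unit from its Bernoulli probability."""
    return [1 if rng.random() < p else 0 for p in probs]


# Hypothetical toy parameters: 3 visible units, 2 hidden units.
W = [[2.0, -1.0],
     [2.0, -1.0],
     [-1.0, 2.0]]
b = [0.0, 0.0, 0.0]   # visible biases (assumed)
c = [-1.0, -1.0]      # hidden biases (assumed)

rng = random.Random(0)        # fixed seed so the run is repeatable
v = [1, 1, 0]                 # one binary input vector
p_h = hidden_probs(v, W, c)   # probabilistic hidden states for this input
h = sample(p_h, rng)          # one sampled hidden state
p_v = visible_probs(h, W, b)  # probabilities of reconstructing each input bit
```

With these toy weights the first hidden unit has activation 3 and the second has activation -3, so the first is very likely to be on and the second very likely to be off. The sketch only evaluates a fixed model; learning the weights from data is a separate step that it does not attempt.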

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Charu C. Aggarwal
    • 1
  1. IBM T. J. Watson Research Center, International Business Machines, Yorktown Heights, USA