A new variant of restricted Boltzmann machine with horizontal connections

  • Guang Shi
  • Jiangshe Zhang
  • NanNan Ji
  • ChangPeng Wang
Original Article

Abstract

Restricted Boltzmann machines (RBMs) are successfully employed to construct deep architectures because of their expressive power and because inference in them is tractable and easy. In this paper, we propose a model named self-connected restricted Boltzmann machine (SCRBM), which adds horizontal connections to the hidden layer to enable direct information transfer between hidden units. We present a simple and effective method, based on the greedy layer-wise procedure of deep learning algorithms, to train the model. Under this algorithm, SCRBM has a three-layer architecture: the first hidden layer extracts features from the data, and the second hidden layer is used to simulate various interactions between units in the layer. Specifically, to simulate the lateral inhibition that exists in sensory systems, a log sparse term is introduced into the second hidden layer of SCRBM. Our experiments show that the features learned by our algorithm are more vivid and cleaner than those learned by the basic RBM and SparseRBM. Further experiments show that SCRBM outperforms the basic RBM and SparseRBM in accuracy on several widely used datasets.
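The greedy layer-wise idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' SCRBM algorithm: it stacks two plain binary RBMs trained with one-step contrastive divergence (CD-1), feeding the hidden activations of the first into the second, and it omits the log sparse term entirely. The class name `RBM`, the helper `train_greedy`, the layer sizes, the learning rate, and the random toy data are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Binary-binary RBM trained with CD-1 (illustrative, not SCRBM)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden probabilities and a binary sample given data.
        ph0 = self.hidden_probs(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: one reconstruction step.
        pv1 = self.visible_probs(h0)
        ph1 = self.hidden_probs(pv1)
        # Approximate gradient: data statistics minus reconstruction statistics.
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b += lr * (v0 - pv1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)


def train_greedy(data, layer_sizes, epochs=20, seed=0):
    """Train one RBM per entry in layer_sizes, each on the previous layer's output."""
    rng = np.random.default_rng(seed)
    layers, x = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(x.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(x)
        layers.append(rbm)
        x = rbm.hidden_probs(x)  # activations become the next layer's input
    return layers


# Toy binary data (hypothetical): 64 samples, 12 visible units.
toy = (np.random.default_rng(1).random((64, 12)) < 0.3).astype(float)
layers = train_greedy(toy, layer_sizes=[8, 8])
features = layers[1].hidden_probs(layers[0].hidden_probs(toy))
```

Here the second RBM sees only the first layer's activations, which is the sense in which the training is greedy: each layer is fitted in turn and then held fixed. How SCRBM ties the second hidden layer back to horizontal connections, and how its sparsity term enters the update, is described in the paper itself and is not reproduced here.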

Keywords

Neural networks, RBM, Horizontal connections, Greedy layer-wise learning

Notes

Acknowledgements

This work is supported by the National Basic Research Program of China (973 Program, No. 2013CB329404), the National Natural Science Foundation of China (Nos. 61572393, 11501049, 11671317, 11131006) and the Special Program for Applied Research on Super Computation of the NSFC-Guangdong Joint Fund (the second phase).

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Human and animal rights

This article does not contain any studies with human participants or animals performed by any of the authors.

Copyright information

© The Natural Computing Applications Forum 2018

Authors and Affiliations

  • Guang Shi (1)
  • Jiangshe Zhang (1)
  • NanNan Ji (2)
  • ChangPeng Wang (3)

  1. School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an, People’s Republic of China
  2. School of Science, Chang’an University, Xi’an, People’s Republic of China
  3. School of Mathematics and Information Science, Chang’an University, Xi’an, People’s Republic of China
