
Measuring the Usefulness of Hidden Units in Boltzmann Machines with Mutual Information

  • Conference paper
Neural Information Processing (ICONIP 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8226)

Abstract

Restricted Boltzmann machines (RBMs) and deep Boltzmann machines (DBMs) are important models in deep learning, but it is often difficult to measure their performance in general, or the importance of individual hidden units in particular. We propose to use mutual information to measure the usefulness of individual hidden units in Boltzmann machines. The measure serves as an upper bound on the information the neuron can pass on, enabling detection of a particular kind of poor training result. We confirm experimentally that the proposed measure indicates how much the performance of the model drops when some of the units of an RBM are pruned away. Our experiments on DBMs highlight differences among different pretraining options.
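The full text is not available here, but one plausible reading of the proposed measure for an RBM with binary hidden units is the mutual information I(h_j; v) = H(h_j) − E_v[H(h_j | v)], estimated over training samples via the conditional P(h_j = 1 | v) = sigmoid(vW + b). The sketch below follows that reading; the function name and the argument names `V`, `W`, `b` are illustrative, not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def binary_entropy(p):
    """Entropy of a Bernoulli(p) variable, in bits."""
    p = np.clip(p, 1e-12, 1.0 - 1e-12)  # avoid log(0)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))

def hidden_unit_mutual_information(V, W, b):
    """Estimate I(h_j; v) = H(h_j) - E_v[H(h_j | v)] for each hidden unit j.

    V : (n_samples, n_visible) data matrix
    W : (n_visible, n_hidden) RBM weights
    b : (n_hidden,) hidden biases
    Returns an (n_hidden,) array of mutual-information estimates in bits.
    """
    p_cond = sigmoid(V @ W + b)              # P(h_j = 1 | v) per sample
    p_marg = p_cond.mean(axis=0)             # marginal P(h_j = 1) over the data
    # By concavity of entropy this is nonnegative; it is at most 1 bit,
    # the capacity of a binary unit.
    return binary_entropy(p_marg) - binary_entropy(p_cond).mean(axis=0)
```

A unit whose conditional activation never varies with the input (e.g. all-zero weights) scores zero mutual information, matching the paper's motivation: such a unit cannot pass any information on, so pruning it should cost nothing.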




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Berglund, M., Raiko, T., Cho, K. (2013). Measuring the Usefulness of Hidden Units in Boltzmann Machines with Mutual Information. In: Lee, M., Hirose, A., Hou, ZG., Kil, R.M. (eds) Neural Information Processing. ICONIP 2013. Lecture Notes in Computer Science, vol 8226. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-42054-2_60


  • DOI: https://doi.org/10.1007/978-3-642-42054-2_60

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-42053-5

  • Online ISBN: 978-3-642-42054-2

  • eBook Packages: Computer Science; Computer Science (R0)
