Abstract
Restricted Boltzmann machines (RBMs) and deep Boltzmann machines (DBMs) are important models in deep learning, but it is often difficult to measure their performance in general, or the importance of individual hidden units in particular. We propose to use mutual information to measure the usefulness of individual hidden units in Boltzmann machines. The measure serves as an upper bound on the information a unit can pass on, enabling detection of a particular kind of poor training result. We confirm experimentally that the proposed measure predicts how much the performance of the model drops when some of the units of an RBM are pruned away. Our experiments on DBMs highlight differences among different pretraining options.
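As an illustration of the kind of quantity the abstract describes, the snippet below estimates the mutual information between one binary hidden unit and the data using the standard decomposition I(h; v) = H(h) - H(h | v). This is a sketch based on elementary information theory, not the paper's exact estimator; the function name and interface are our own.

```python
import numpy as np

def hidden_unit_mutual_information(p):
    """Estimate I(h; v) in bits for one binary hidden unit.

    p: array of shape (n_samples,) holding the conditional activation
       probabilities P(h = 1 | v) of the unit, one per data case.

    Uses I(h; v) = H(h) - H(h | v), where H(h) is the entropy of the
    marginal activation probability (the mean of p over the data) and
    H(h | v) is the average conditional entropy. This upper-bounds the
    information the unit can pass on: a unit whose activation ignores
    the input scores 0 bits. (Illustrative sketch only.)
    """
    eps = 1e-12  # avoid log(0) for saturated probabilities

    def bernoulli_entropy(q):
        q = np.clip(q, eps, 1.0 - eps)
        return -(q * np.log2(q) + (1.0 - q) * np.log2(1.0 - q))

    marginal = p.mean()                          # P(h = 1) over the data
    h_marginal = bernoulli_entropy(np.asarray(marginal))
    h_conditional = bernoulli_entropy(np.asarray(p)).mean()
    return float(h_marginal - h_conditional)
```

A "dead" unit that outputs 0.5 for every input yields 0 bits, while a unit that is deterministically off for half the data and on for the other half yields the maximum of 1 bit, matching the intuition that the measure flags units a pruning procedure could remove cheaply.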
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Berglund, M., Raiko, T., Cho, K. (2013). Measuring the Usefulness of Hidden Units in Boltzmann Machines with Mutual Information. In: Lee, M., Hirose, A., Hou, ZG., Kil, R.M. (eds) Neural Information Processing. ICONIP 2013. Lecture Notes in Computer Science, vol 8226. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-42054-2_60
Print ISBN: 978-3-642-42053-5
Online ISBN: 978-3-642-42054-2