Abstract
Restricted Boltzmann machines (RBMs) and deep Boltzmann machines (DBMs) are important models in deep learning, but it is often difficult to measure their performance in general, or the importance of individual hidden units in particular. We propose to use mutual information to measure the usefulness of individual hidden units in Boltzmann machines. The measure serves as an upper bound on the information a unit can pass on, enabling detection of a particular kind of poor training result. We confirm experimentally that the proposed measure predicts how much the performance of the model drops when some of the units of an RBM are pruned away. Our experiments on DBMs highlight differences among different pretraining options.
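As an illustration of the kind of quantity the abstract describes, the snippet below estimates the mutual information between one binary hidden unit and the data using the standard decomposition I(h; v) = H(h) - H(h | v). This is a sketch based on elementary information theory, not the paper's exact estimator; the function name and interface are our own.

```python
import numpy as np

def hidden_unit_mutual_information(p):
    """Estimate I(h; v) in bits for one binary hidden unit.

    p: array of shape (n_samples,) holding the conditional activation
       probabilities P(h = 1 | v) of the unit, one per data case.

    Uses I(h; v) = H(h) - H(h | v), where H(h) is the entropy of the
    marginal activation probability (the mean of p over the data) and
    H(h | v) is the average conditional entropy. This upper-bounds the
    information the unit can pass on: a unit whose activation ignores
    the input scores 0 bits. (Illustrative sketch only.)
    """
    eps = 1e-12  # avoid log(0) for saturated probabilities

    def bernoulli_entropy(q):
        q = np.clip(q, eps, 1.0 - eps)
        return -(q * np.log2(q) + (1.0 - q) * np.log2(1.0 - q))

    marginal = p.mean()                          # P(h = 1) over the data
    h_marginal = bernoulli_entropy(np.asarray(marginal))
    h_conditional = bernoulli_entropy(np.asarray(p)).mean()
    return float(h_marginal - h_conditional)
```

A "dead" unit that outputs 0.5 for every input yields 0 bits, while a unit that is deterministically off for half the data and on for the other half yields the maximum of 1 bit, matching the intuition that the measure flags units a pruning procedure could remove cheaply.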
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Berglund, M., Raiko, T., Cho, K. (2013). Measuring the Usefulness of Hidden Units in Boltzmann Machines with Mutual Information. In: Lee, M., Hirose, A., Hou, ZG., Kil, R.M. (eds) Neural Information Processing. ICONIP 2013. Lecture Notes in Computer Science, vol 8226. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-42054-2_60
Print ISBN: 978-3-642-42053-5
Online ISBN: 978-3-642-42054-2