Variational EM Learning of DSBNs with Conditional Deep Boltzmann Machines

  • Xing Zhang
  • Siwei Lyu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8681)


Variational EM (VEM) is an efficient parameter learning scheme for deep sigmoid belief networks (DSBNs), i.e., sigmoid belief networks with many layers of latent variables. The choice of the inference model that forms the variational lower bound of the log-likelihood is critical in VEM learning. The mean-field approximation and the wake-sleep algorithm use simple inference models that are computationally efficient, but they may approximate the true posterior densities poorly when the latent variables have strong mutual dependencies. In this paper, we describe a VEM learning method for DSBNs with a new inference model, the conditional deep Boltzmann machine (cDBM), an undirected graphical model capable of representing complex dependencies among latent variables. We show that this algorithm does not require computing the intractable partition function of the undirected cDBM model, and that it can be accelerated with contrastive learning. The performance of the proposed method is evaluated and compared on handwritten digit data.
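To make the role of the inference model concrete, the following is a minimal sketch (not the authors' cDBM method) of the variational lower bound for a single-layer sigmoid belief network with the simplest choice of inference model, a factorial (mean-field) Bernoulli distribution q(h|v). All dimensions, weight names (W, R, b, c, d), and initializations are hypothetical, chosen only for illustration; the bound E_q[log p(v,h) - log q(h|v)] is estimated by Monte Carlo sampling from q.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bernoulli_logpmf(x, p):
    # log probability of binary vector x under independent Bernoulli(p)
    eps = 1e-9
    return np.sum(x * np.log(p + eps) + (1 - x) * np.log(1 - p + eps), axis=-1)

# Hypothetical model sizes and parameters (for illustration only)
n_vis, n_hid = 8, 4
W = rng.normal(0.0, 0.1, (n_vis, n_hid))  # generative weights, p(v|h)
b = np.zeros(n_vis)                       # visible biases
c = np.zeros(n_hid)                       # hidden prior biases, p(h)
R = rng.normal(0.0, 0.1, (n_hid, n_vis))  # recognition weights, q(h|v)
d = np.zeros(n_hid)                       # recognition biases

def elbo(v, n_samples=200):
    """Monte Carlo estimate of the variational lower bound
    E_q[log p(h) + log p(v|h) - log q(h|v)] with a factorial Bernoulli q."""
    q = sigmoid(R @ v + d)                              # mean-field posterior
    h = (rng.random((n_samples, n_hid)) < q).astype(float)
    log_prior = bernoulli_logpmf(h, sigmoid(c))         # log p(h)
    log_lik = bernoulli_logpmf(v, sigmoid(h @ W.T + b)) # log p(v|h)
    log_q = bernoulli_logpmf(h, q)                      # log q(h|v)
    return np.mean(log_prior + log_lik - log_q)

v = (rng.random(n_vis) < 0.5).astype(float)
print(elbo(v))
```

Because q here factorizes over the hidden units, it cannot capture the strong mutual dependencies among latent variables that motivate the cDBM inference model in this paper; the sketch only illustrates the bound that the inference model is plugged into.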






References

  1. Bengio, Y., LeCun, Y.: Scaling learning algorithms towards AI. In: Bottou, L., Chapelle, O., DeCoste, D., Weston, J. (eds.) Large-Scale Kernel Machines. MIT Press (2007)
  2. Cover, T., Thomas, J.: Elements of Information Theory, 2nd edn. Wiley-Interscience (2006)
  3. Dayan, P., Hinton, G.E.: Varieties of Helmholtz machines. Neural Networks 9, 1385–1403 (1996)
  4. Dempster, A., Laird, N., Rubin, D.: Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B 39, 1–38 (1977)
  5. Hinton, G.E.: Training products of experts by minimizing contrastive divergence. Neural Computation 14, 1771–1800 (2002)
  6. Hinton, G.E., Dayan, P., Frey, B.J., Neal, R.: The wake-sleep algorithm for unsupervised neural networks. Science 268, 1158–1161 (1995)
  7. Hinton, G.E., Osindero, S., Teh, Y.: A fast learning algorithm for deep belief nets. Neural Computation 18(10), 1527–1554 (2006)
  8. Jaakkola, T., Jordan, M.: Improving the mean field approximation via the use of mixture distributions. In: Jordan, M.I. (ed.) Learning in Graphical Models. MIT Press (1998)
  9. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998)
  10. Mnih, A., Gregor, K.: Neural variational inference and learning in belief networks. arXiv:1402.0030v1 (cs.LG) (January 2014)
  11. Neal, R.M.: Connectionist learning of belief networks. Artificial Intelligence 56, 71–113 (1992)
  12. Neal, R.M., Hinton, G.E.: A view of the EM algorithm that justifies incremental, sparse, and other variants. In: Learning in Graphical Models, pp. 355–368. Kluwer Academic Publishers (1998)
  13. Pearl, J.: Probabilistic Reasoning in Intelligent Systems. Morgan-Kaufmann (1988)
  14. Salakhutdinov, R., Hinton, G.E.: Deep Boltzmann machines. In: AISTATS (2009)
  15. Saul, L.K., Jaakkola, T., Jordan, M.I.: Mean field theory for sigmoid belief networks. Journal of Artificial Intelligence Research 4, 61–76 (1996)

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Xing Zhang (1)
  • Siwei Lyu (1)

  1. Computer Science Department, State University of New York, USA
