An Empirical Investigation of Minimum Probability Flow Learning Under Different Connectivity Patterns

Im, Daniel Jiwoong; Buchman, Ethan; Taylor, Graham W.

doi:10.1007/978-3-319-23528-8_30

Conference paper
First Online: 01 January 2015

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9284)

Abstract

Energy-based models are popular in machine learning due to the elegance of their formulation and their relationship to statistical physics. Among these, the Restricted Boltzmann Machine (RBM) and its staple training algorithm, contrastive divergence (CD), have been the prototype for some recent advancements in the unsupervised training of deep neural networks. However, CD has limited theoretical motivation and can in some cases produce undesirable behaviour. Here, we investigate the performance of Minimum Probability Flow (MPF) learning for training RBMs. Unlike CD, with its focus on approximating an intractable partition function via Gibbs sampling, MPF proposes a tractable, consistent objective function defined in terms of a Taylor expansion of the KL divergence with respect to sampling dynamics. We propose a more general form for the sampling dynamics in MPF and explore the consequences of different choices of these dynamics for training RBMs. Experimental results show MPF outperforming CD for various RBM configurations.
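
To make the idea of a tractable MPF objective concrete, the following is a minimal sketch, not the authors' implementation. It assumes a binary RBM and the simplest connectivity pattern, in which each data vector is connected only to the states that differ from it by a single bit flip. Under that assumption the objective from the original MPF formulation reduces to K = (eps/N) * sum over data x and neighbours x' of exp(0.5 * (F(x) - F(x'))), where F is the RBM free energy. The function names, the parameter names (W, b, c, eps) and the choice to ignore whether a neighbour happens to be a data point are illustrative assumptions.

```python
import numpy as np

def free_energy(v, W, b, c):
    """Free energy of a binary RBM: F(v) = -v.b - sum_j softplus(c_j + (v W)_j)."""
    # np.logaddexp(0, x) is a numerically stable softplus log(1 + exp(x)).
    return -v @ b - np.logaddexp(0.0, v @ W + c).sum(axis=-1)

def mpf_objective_1flip(data, W, b, c, eps=1.0):
    """MPF objective under single-bit-flip connectivity (an assumed, simple choice).

    data: (N, D) array of 0/1 visible vectors.
    W: (D, H) weights, b: (D,) visible biases, c: (H,) hidden biases.
    Simplification: neighbours that coincide with data points are not excluded.
    """
    n, d = data.shape
    f_data = free_energy(data, W, b, c)  # shape (N,)
    total = 0.0
    for i in range(d):
        # Build the neighbour of every data vector obtained by flipping bit i.
        flipped = data.copy()
        flipped[:, i] = 1.0 - flipped[:, i]
        f_flip = free_energy(flipped, W, b, c)
        # Probability flow from each data state to its neighbour.
        total += np.exp(0.5 * (f_data - f_flip)).sum()
    return eps * total / n
```

Because the sum runs over only D neighbours per data vector, no partition function and no sampling are needed to evaluate it; minimising K with respect to W, b and c with any gradient-based optimiser lowers the free energy of the data relative to its neighbours. Other connectivity patterns would change only the set of neighbours that is enumerated.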

Author information

Authors and Affiliations

School of Engineering, University of Guelph, Guelph, ON, Canada
Daniel Jiwoong Im, Ethan Buchman & Graham W. Taylor

Corresponding author

Correspondence to Daniel Jiwoong Im.

Editor information

Editors and Affiliations

University of Bari Aldo Moro, Bari, Italy
Annalisa Appice
University of Porto, Porto, Portugal
Pedro Pereira Rodrigues
University of Porto - CRACS/INESC TEC, Porto, Portugal
Vítor Santos Costa
University of Porto - INESC TEC, Porto, Portugal
Carlos Soares
University of Porto - INESC TEC, Porto, Portugal
João Gama
University of Porto - INESC TEC, Porto, Portugal
Alípio Jorge

About this paper

Cite this paper

Im, D.J., Buchman, E., Taylor, G.W. (2015). An Empirical Investigation of Minimum Probability Flow Learning Under Different Connectivity Patterns. In: Appice, A., Rodrigues, P., Santos Costa, V., Soares, C., Gama, J., Jorge, A. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2015. Lecture Notes in Computer Science, vol 9284. Springer, Cham. https://doi.org/10.1007/978-3-319-23528-8_30

DOI: https://doi.org/10.1007/978-3-319-23528-8_30
Published: 29 August 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23527-1
Online ISBN: 978-3-319-23528-8
eBook Packages: Computer Science, Computer Science (R0)
