Statistical Mechanics of On-line Learning

Biehl, Michael; Caticha, Nestor; Riegler, Peter

doi:10.1007/978-3-642-01805-3_1

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5400)

Abstract

We introduce and discuss the application of statistical physics concepts in the context of on-line machine learning processes. Considering the typical properties of very large systems makes it possible to perform averages over the randomness contained in the sequence of training data. This yields an exact mathematical description of the training dynamics in model scenarios. We present the basic concepts and results of the approach in terms of several examples, including the learning of linearly separable rules, the training of multilayer neural networks, and Learning Vector Quantization.
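
To make the setting of the abstract concrete, the following is a minimal sketch of on-line learning of a linearly separable rule in a teacher-student scenario. It is an illustration under stated assumptions, not the chapter's own algorithm: the Hebbian update, the Gaussian inputs, and all names (`online_hebb`, `generalization_error`, `n_dim`, `alpha_max`) are hypothetical choices made here. Each example is presented once and then discarded, and for isotropic inputs the generalization error follows from the normalized student-teacher overlap rho as arccos(rho)/pi.

```python
import math
import random


def generalization_error(student, teacher):
    """Return arccos(rho)/pi, where rho is the normalized overlap.

    Valid for isotropic inputs and a sign-output (perceptron) rule.
    """
    dot = sum(j * b for j, b in zip(student, teacher))
    norm_j = math.sqrt(sum(j * j for j in student))
    norm_b = math.sqrt(sum(b * b for b in teacher))
    rho = dot / (norm_j * norm_b)
    rho = max(-1.0, min(1.0, rho))  # guard against rounding outside [-1, 1]
    return math.acos(rho) / math.pi


def online_hebb(n_dim=200, alpha_max=10.0, seed=0):
    """Simulate on-line Hebbian learning; return a list of (alpha, eps_g).

    alpha = (number of examples) / n_dim is the rescaled learning time.
    The Hebbian rule is an assumption chosen for simplicity.
    """
    rng = random.Random(seed)
    # The teacher vector defines the linearly separable rule sign(B . xi).
    teacher = [rng.gauss(0.0, 1.0) for _ in range(n_dim)]
    student = [0.0] * n_dim
    history = []
    n_steps = int(alpha_max * n_dim)
    for step in range(1, n_steps + 1):
        # Draw a fresh random example; it is used once and never revisited.
        xi = [rng.gauss(0.0, 1.0) for _ in range(n_dim)]
        field = sum(b * x for b, x in zip(teacher, xi))
        label = 1.0 if field >= 0.0 else -1.0
        # Hebbian update: move the student along label * xi, scaled by 1/N.
        student = [j + label * x / n_dim for j, x in zip(student, xi)]
        if step % n_dim == 0:
            history.append((step / n_dim,
                            generalization_error(student, teacher)))
    return history


if __name__ == "__main__":
    for alpha, eps in online_hebb():
        print(f"alpha = {alpha:4.1f}   eps_g = {eps:.3f}")
```

A single run at finite `n_dim` fluctuates from seed to seed. The statistical physics description concerns the limit of very large `n_dim`, where such fluctuations are expected to vanish and the learning curve becomes deterministic.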

Author information

Authors and Affiliations

Institute of Mathematics and Computing Science, University of Groningen, P.O. Box 407, 9700 AK, Groningen, The Netherlands
Michael Biehl
Instituto de Fisica, Universidade de São Paulo, CP66318, CEP 05315-970, São Paulo, SP, Brazil
Nestor Caticha
Fachhochschule Braunschweig/Wolfenbüttel, Fachbereich Informatik, Salzdahlumer Str. 46/48, 38302, Wolfenbüttel, Germany
Peter Riegler

Editor information

Editors and Affiliations

Mathematics and Computing Science, Intelligent Systems Group, University Groningen, P.O. Box 407, 9700 AK, Groningen, Netherlands
Michael Biehl
Department of Computer Science, Clausthal University of Technology, 38679, Clausthal-Zellerfeld, Germany
Barbara Hammer
Machine Learning Group, DICE, Place du Levant, Université catholique de Louvain, 3-B-1348, Louvain-la-Neuve, Belgium
Michel Verleysen
Dep. of Mathematics/Physics/Computer Sciences, University of Applied Sciences Mittweida, Technikumplatz 17, 09648, Mittweida, Germany
Thomas Villmann

About this chapter

Cite this chapter

Biehl, M., Caticha, N., Riegler, P. (2009). Statistical Mechanics of On-line Learning. In: Biehl, M., Hammer, B., Verleysen, M., Villmann, T. (eds) Similarity-Based Clustering. Lecture Notes in Computer Science, vol 5400. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01805-3_1

DOI: https://doi.org/10.1007/978-3-642-01805-3_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01804-6
Online ISBN: 978-3-642-01805-3
eBook Packages: Computer Science, Computer Science (R0)
