Abstract
We introduce and discuss the application of statistical physics concepts to on-line machine learning processes. Considering the typical properties of very large systems makes it possible to perform averages over the randomness contained in the sequence of training data. This yields an exact mathematical description of the training dynamics in model scenarios. We present the basic concepts and results of the approach by means of several examples, including the learning of linearly separable rules, the training of multilayer neural networks, and Learning Vector Quantization.
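As a rough illustration of the kind of model scenario the abstract refers to, the following sketch simulates on-line learning of a linearly separable rule in a teacher-student setting. It is not taken from the chapter: the Hebbian update, the Gaussian inputs, the function name `simulate_hebbian_perceptron`, and all parameter values are assumptions chosen for brevity. The quantities `r` (student-teacher overlap) and `q` (squared student norm) play the role of the few macroscopic variables whose evolution the statistical physics approach describes in the limit of large dimension.

```python
import math
import random


def simulate_hebbian_perceptron(n=200, alpha_max=10.0, seed=0):
    """Simulate on-line Hebbian learning of a rule defined by a teacher vector.

    Returns a list of (alpha, r, q, eps) tuples, where alpha is the number of
    examples per dimension, r = J.B, q = J.J, and eps is the generalization
    error for isotropic Gaussian inputs.
    """
    rng = random.Random(seed)

    # Teacher vector B defines the linearly separable rule (unit length).
    b = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm_b = math.sqrt(sum(x * x for x in b))
    b = [x / norm_b for x in b]

    # Student vector J starts from a small random initialization.
    j = [1e-3 * rng.gauss(0.0, 1.0) for _ in range(n)]

    history = []
    steps = int(alpha_max * n)
    for t in range(1, steps + 1):
        # Each example is drawn fresh and used exactly once (on-line learning).
        xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
        label = 1.0 if sum(bi * x for bi, x in zip(b, xi)) >= 0.0 else -1.0

        # Hebbian update with step size 1/n (an assumed, simple choice).
        j = [ji + label * x / n for ji, x in zip(j, xi)]

        # Record the order parameters once per n examples.
        if t % n == 0:
            r = sum(ji * bi for ji, bi in zip(j, b))
            q = sum(ji * ji for ji in j)
            cos_angle = max(-1.0, min(1.0, r / math.sqrt(q)))
            eps = math.acos(cos_angle) / math.pi
            history.append((t / n, r, q, eps))
    return history


if __name__ == "__main__":
    for alpha, r, q, eps in simulate_hebbian_perceptron():
        print(f"alpha={alpha:5.1f}  R={r:7.3f}  Q={q:8.3f}  eps={eps:.3f}")
```

For increasing `n`, the recorded curves should fluctuate less from run to run, which is the self-averaging property that allows the average over the random training sequence to be carried out analytically.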
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Biehl, M., Caticha, N., Riegler, P. (2009). Statistical Mechanics of On-line Learning. In: Biehl, M., Hammer, B., Verleysen, M., Villmann, T. (eds) Similarity-Based Clustering. Lecture Notes in Computer Science, vol 5400. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01805-3_1
DOI: https://doi.org/10.1007/978-3-642-01805-3_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01804-6
Online ISBN: 978-3-642-01805-3
eBook Packages: Computer Science (R0)