Abstract
In this paper we review recent theoretical approaches to analysing the dynamics of on-line learning in multilayer neural networks, using methods adopted from statistical physics. The analysis is based on monitoring a set of macroscopic variables from which the generalisation error can be calculated. A closed set of dynamical equations for these macroscopic variables is derived analytically and solved numerically. The theoretical framework is then used to define optimal learning parameters and to analyse the incorporation of second-order information into the learning process through natural gradient descent and matrix-momentum based methods. We also briefly describe an extension of the original framework to the case where training examples are sampled with repetition.
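The idea of monitoring macroscopic variables can be illustrated with a small simulation. The sketch below is not taken from the chapter; it assumes a student-teacher scenario with a soft committee machine, activation g(x) = erf(x/sqrt(2)), independent Gaussian inputs of unit variance, and plain on-line gradient descent with learning rate eta scaled by 1/N. Under those assumptions the generalisation error has a well-known closed form in terms of the overlaps Q (student-student), R (student-teacher) and T (teacher-teacher). All function names and parameter values are hypothetical choices made for illustration.

```python
import math
import random

def g(x):
    # Assumed activation function: erf(x / sqrt(2)).
    return math.erf(x / math.sqrt(2.0))

def g_prime(x):
    # Derivative of the assumed activation.
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * x * x)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def overlaps(A, B):
    # Matrix of dot products between the rows of A and the rows of B.
    return [[dot(a, b) for b in B] for a in A]

def generalisation_error(Q, R, T):
    # Closed-form error for erf units and Gaussian inputs, written purely
    # in terms of the macroscopic overlaps Q, R and T.
    K, M = len(Q), len(T)

    def term(c, a, b):
        return math.asin(c / math.sqrt((1.0 + a) * (1.0 + b)))

    ss = sum(term(Q[i][k], Q[i][i], Q[k][k]) for i in range(K) for k in range(K))
    tt = sum(term(T[n][m], T[n][n], T[m][m]) for n in range(M) for m in range(M))
    st = sum(term(R[i][n], Q[i][i], T[n][n]) for i in range(K) for n in range(M))
    return (ss + tt - 2.0 * st) / math.pi

def simulate(N=50, K=1, M=1, eta=1.0, alpha_max=50, seed=0):
    # Returns the generalisation error recorded once per N examples,
    # i.e. at normalised times alpha = 0, 1, ..., alpha_max.
    rng = random.Random(seed)
    # Teacher: M orthonormal weight vectors, so T is the identity matrix.
    B = [[1.0 if j == n else 0.0 for j in range(N)] for n in range(M)]
    # Student: small random initial weights (an arbitrary choice).
    J = [[rng.gauss(0.0, 0.5 / math.sqrt(N)) for _ in range(N)] for _ in range(K)]
    T = overlaps(B, B)
    history = [generalisation_error(overlaps(J, J), overlaps(J, B), T)]
    for step in range(alpha_max * N):
        xi = [rng.gauss(0.0, 1.0) for _ in range(N)]  # fresh example each step
        x = [dot(Ji, xi) for Ji in J]                 # student hidden fields
        y = [dot(Bn, xi) for Bn in B]                 # teacher hidden fields
        err = sum(g(v) for v in y) - sum(g(v) for v in x)
        for i in range(K):
            scale = eta / N * g_prime(x[i]) * err     # on-line gradient step
            J[i] = [w + scale * s for w, s in zip(J[i], xi)]
        if (step + 1) % N == 0:
            history.append(
                generalisation_error(overlaps(J, J), overlaps(J, B), T))
    return history

if __name__ == "__main__":
    for alpha, e in enumerate(simulate()):
        if alpha % 10 == 0:
            print(alpha, e)
```

Here the error is never estimated on test data; it is computed from the overlaps alone, which is what makes a closed description in terms of a few macroscopic variables possible. The sketch draws a fresh example at every step, so it does not cover the case of examples sampled with repetition.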
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Saad, D. (2003). The Theory of On-line Learning — A Statistical Physics Approach. In: Schwaiger, M., Opitz, O. (eds) Exploratory Data Analysis in Empirical Research. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55721-7_31
DOI: https://doi.org/10.1007/978-3-642-55721-7_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44183-0
Online ISBN: 978-3-642-55721-7