The Theory of On-line Learning — A Statistical Physics Approach

Saad, D.

doi:10.1007/978-3-642-55721-7_31

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

In this paper we review recent theoretical approaches to analysing the dynamics of on-line learning in multilayer neural networks using methods adopted from statistical physics. The analysis is based on monitoring a set of macroscopic variables from which the generalisation error can be calculated. A closed set of dynamical equations for the macroscopic variables is derived analytically and solved numerically. The theoretical framework is then employed for defining optimal learning parameters and for analysing the incorporation of second-order information into the learning process using natural gradient descent and matrix-momentum-based methods. We also briefly explain an extension of the original framework for analysing the case where training examples are sampled with repetition.
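To make the idea of monitoring macroscopic variables concrete, the following is a minimal sketch under stated assumptions and is not taken from the chapter. It assumes a teacher-student scenario with a soft committee machine (hidden units with activation g(x) = erf(x/√2) and unit hidden-to-output weights), a setting commonly used in this literature. The overlaps Q (student-student), R (student-teacher) and T (teacher-teacher) play the role of the macroscopic variables, and the generalisation error is evaluated from them with the standard arcsine expression for this activation. All function names and parameter values are hypothetical. The code simulates the on-line gradient descent update directly, with each example used once; it does not integrate the closed equations of motion mentioned in the abstract.

```python
import math
import random


def g(x):
    """Hidden-unit activation (assumed): erf(x / sqrt(2))."""
    return math.erf(x / math.sqrt(2.0))


def g_prime(x):
    """Derivative of g."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * x * x)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def overlaps(J, B):
    """Macroscopic variables: Q = J.J, R = J.B, T = B.B."""
    Q = [[dot(a, b) for b in J] for a in J]
    R = [[dot(a, b) for b in B] for a in J]
    T = [[dot(a, b) for b in B] for a in B]
    return Q, R, T


def generalisation_error(Q, R, T):
    """Average of 0.5 * (student - teacher)^2 over Gaussian inputs,
    written purely in terms of the overlaps (erf activation assumed)."""
    K, M = len(Q), len(T)

    def term(c, a, b):
        return math.asin(c / math.sqrt((1.0 + a) * (1.0 + b)))

    e = sum(term(Q[i][k], Q[i][i], Q[k][k]) for i in range(K) for k in range(K))
    e += sum(term(T[n][m], T[n][n], T[m][m]) for n in range(M) for m in range(M))
    e -= 2.0 * sum(term(R[i][n], Q[i][i], T[n][n]) for i in range(K) for n in range(M))
    return e / math.pi


def simulate(N=50, K=1, M=1, eta=1.0, alpha_max=20.0, seed=0):
    """On-line gradient descent; returns [(alpha, error), ...] with alpha = examples / N."""
    rng = random.Random(seed)
    # Teacher vectors, normalised to unit length so that T_nn = 1.
    B = []
    for _ in range(M):
        v = [rng.gauss(0.0, 1.0) for _ in range(N)]
        norm = math.sqrt(dot(v, v))
        B.append([c / norm for c in v])
    # Student vectors, small random initial weights (an arbitrary choice).
    J = [[0.5 * rng.gauss(0.0, 1.0) / math.sqrt(N) for _ in range(N)] for _ in range(K)]
    history = []
    steps = int(alpha_max * N)
    for step in range(steps + 1):
        if step % N == 0:
            Q, R, T = overlaps(J, B)
            history.append((step / N, generalisation_error(Q, R, T)))
        # A fresh example at every step: no example is ever reused.
        xi = [rng.gauss(0.0, 1.0) for _ in range(N)]
        x = [dot(Ji, xi) for Ji in J]
        y = [dot(Bn, xi) for Bn in B]
        error = sum(g(v) for v in y) - sum(g(v) for v in x)
        for i in range(K):
            scale = eta / N * g_prime(x[i]) * error
            J[i] = [w + scale * c for w, c in zip(J[i], xi)]
    return history


if __name__ == "__main__":
    for alpha, eg in simulate():
        print(f"alpha = {alpha:5.1f}   generalisation error = {eg:.5f}")
```

Because the error depends on the weights only through Q, R and T, a handful of numbers summarises the state of a network with N × K weights. This is the property that makes a closed macroscopic description possible in the limit of large N. The learning rate eta and the initial weight scale are illustrative values, not recommendations.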

Author information

Authors and Affiliations

The Neural Computing Research Group, University of Aston, Birmingham, B4 7ET, UK
D. Saad

Editor information

Editors and Affiliations

Munich School of Management Institute of Corporate Development and Organization, University of Munich, Kaulbachstraße 45/1, 80539, Munich, Germany
Manfred Schwaiger
Department of Mathematical Methods in Economics, University of Augsburg, Universitätsstraße 16, 86159, Augsburg, Germany
Otto Opitz

Cite this paper

Saad, D. (2003). The Theory of On-line Learning — A Statistical Physics Approach. In: Schwaiger, M., Opitz, O. (eds) Exploratory Data Analysis in Empirical Research. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55721-7_31

DOI: https://doi.org/10.1007/978-3-642-55721-7_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44183-0
Online ISBN: 978-3-642-55721-7
eBook Packages: Springer Book Archive
