Learning by conjugate gradients

Møller, Martin F.

doi:10.1007/3-540-53414-8_41

Part III Communications
Conference paper
First Online: 01 January 2005

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 464)

Abstract

A learning algorithm (CG) with a superlinear convergence rate is introduced. The algorithm is based on a class of optimization techniques well known in numerical analysis as the conjugate gradient methods. CG uses second-order information from the neural network but requires only O(N) memory, where N is the number of minimization variables, in our case the number of weights in the network. The performance of CG is benchmarked against that of the ordinary backpropagation algorithm (BP). We find that CG is considerably faster than BP and that CG is able to perform the learning task with fewer hidden units.
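The abstract does not state the update rules, so the following is only a generic sketch of a nonlinear conjugate gradient loop and should not be read as the author's exact algorithm. It assumes a Polak-Ribière direction update, a restart every N steps, and a finite-difference estimate of the curvature along the search direction; the names `conjugate_gradient`, `quadratic_grad`, `eps` and `tol` are illustrative. The point it tries to show is the O(N) memory claim: only a handful of length-N vectors are kept, and second-order information enters through one extra gradient evaluation rather than a stored N×N Hessian.

```python
import math
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def axpy(a: float, x: Vector, y: Vector) -> Vector:
    """Return a * x + y, element by element."""
    return [a * xi + yi for xi, yi in zip(x, y)]


def conjugate_gradient(grad: Callable[[Vector], Vector], w: Vector,
                       max_iter: int = 100, tol: float = 1e-8,
                       eps: float = 1e-6) -> Vector:
    """Minimize a smooth function given only its gradient (illustrative sketch).

    Only the vectors w, g, d and a few temporaries are stored, so memory
    grows linearly with the number of variables.
    """
    n = len(w)
    g = grad(w)
    d = [-gi for gi in g]  # first direction: steepest descent
    for k in range(max_iter):
        d_norm = math.sqrt(dot(d, d))
        if math.sqrt(dot(g, g)) <= tol or d_norm == 0.0:
            break
        # Second-order information without a Hessian matrix:
        # H d is approximated by a difference of two gradients.
        sigma = eps / d_norm
        g_shift = grad(axpy(sigma, d, w))
        hd = [(a - b) / sigma for a, b in zip(g_shift, g)]
        curvature = dot(d, hd)
        if curvature <= 0.0:
            break  # assumption: this sketch simply stops on non-positive curvature
        alpha = -dot(g, d) / curvature  # step size from the local quadratic model
        w = axpy(alpha, d, w)
        g_new = grad(w)
        if (k + 1) % n == 0:
            beta = 0.0  # periodic restart with the steepest descent direction
        else:
            diff = [a - b for a, b in zip(g_new, g)]
            beta = max(0.0, dot(g_new, diff) / dot(g, g))  # Polak-Ribiere
        d = axpy(beta, d, [-gi for gi in g_new])
        g = g_new
    return w


def quadratic_grad(w: Vector) -> Vector:
    """Gradient of 0.5 * w^T A w - b^T w with A = [[4, 1], [1, 3]], b = [1, 2]."""
    return [4.0 * w[0] + 1.0 * w[1] - 1.0,
            1.0 * w[0] + 3.0 * w[1] - 2.0]


if __name__ == "__main__":
    print(conjugate_gradient(quadratic_grad, [2.0, 1.0]))
```

On this two-variable quadratic the minimizer is w = (1/11, 7/11). For network training, `grad` would be replaced by the gradient of the error with respect to all weights, as computed by backpropagation.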

Author information

Authors and Affiliations

Computer Science Department, Mathematical Institute, University of Aarhus, Denmark
Martin F. Møller

Editor information

Jürgen Dassow, Jozef Kelemen

About this paper

Cite this paper

Møller, M.F. (1990). Learning by conjugate gradients. In: Dassow, J., Kelemen, J. (eds) Aspects and Prospects of Theoretical Computer Science. IMYCS 1990. Lecture Notes in Computer Science, vol 464. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-53414-8_41

DOI: https://doi.org/10.1007/3-540-53414-8_41
Published: 08 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-53414-3
Online ISBN: 978-3-540-46869-1
eBook Packages: Springer Book Archive
