Learning by conjugate gradients
A learning algorithm (CG) with a superlinear convergence rate is introduced. The algorithm is based on a class of optimization techniques well known in numerical analysis as conjugate gradient methods. CG uses second-order information from the neural network but requires only O(N) memory, where N is the number of minimization variables, in our case all the weights in the network. The performance of CG is benchmarked against that of the ordinary backpropagation algorithm (BP). We find that CG is considerably faster than BP and that CG is able to perform the learning task with fewer hidden units.
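To make the approach concrete, the following is a minimal sketch of a nonlinear conjugate gradient training loop in Python/NumPy. The Polak-Ribière update with restarts and the crude backtracking line search are illustrative assumptions, not the paper's exact procedure, and the names `cg_train` and `loss_and_grad` are hypothetical. The point of the sketch is that the loop stores only a few length-N vectors (the weights `w`, the gradient `g`, and the search direction `d`), which is where the O(N) memory claim comes from.

```python
import numpy as np

def cg_train(loss_and_grad, w, n_iters=100, tol=1e-6):
    """Illustrative nonlinear conjugate gradient minimizer (assumed details).

    loss_and_grad(w) -> (loss, gradient) for a flattened weight vector w.
    Only w, the gradient g, and the direction d are stored: O(N) memory.
    """
    f, g = loss_and_grad(w)
    d = -g  # first search direction: steepest descent
    for _ in range(n_iters):
        if np.linalg.norm(g) < tol:
            break
        # Crude backtracking line search along d (a stand-in for the
        # more careful line searches used in practice).
        alpha = 1.0
        while loss_and_grad(w + alpha * d)[0] > f and alpha > 1e-12:
            alpha *= 0.5
        w = w + alpha * d
        f, g_new = loss_and_grad(w)
        # Polak-Ribiere coefficient; clipping at 0 restarts the method
        # with a steepest-descent step when conjugacy degrades.
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))
        d = -g_new + beta * d
        g = g_new
    return w

# Toy usage: a convex quadratic stands in for a network's error surface.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
quad = lambda w: (0.5 * w @ A @ w, A @ w)
print(cg_train(quad, np.array([1.0, 1.0])))  # converges toward [0, 0]
```

Unlike quasi-Newton methods, no N-by-N matrix is ever formed; the second-order information enters only implicitly through the conjugacy of successive search directions.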
Keywords: Conjugate Gradient Method, Memory Usage, Hidden Unit, Order Information, Conjugate System