
Learning by conjugate gradients

  • Part III Communications
  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 464)

Abstract

A learning algorithm (CG) with a superlinear convergence rate is introduced. The algorithm is based on a class of optimization techniques well known in numerical analysis as the Conjugate Gradient Methods. CG uses second-order information from the neural network but requires only O(N) memory, where N is the number of minimization variables — in our case, the total number of weights in the network. The performance of CG is benchmarked against that of the ordinary backpropagation algorithm (BP). We find that CG is considerably faster than BP and that CG is able to perform the learning task with fewer hidden units.
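The abstract's key point — conjugate gradient methods exploit curvature information while storing only O(N) vectors — can be illustrated with a generic nonlinear CG minimizer. This is a sketch of the standard Polak-Ribière variant with periodic restarts, not the paper's exact algorithm; the function names (`cg_minimize`, `f`, `grad`) and the toy quadratic objective are assumptions for illustration.

```python
import numpy as np

def cg_minimize(f, grad, w0, max_iter=500, tol=1e-6):
    """Nonlinear conjugate gradient (Polak-Ribiere+) with restarts.

    Only a handful of length-N vectors are kept, so memory stays O(N)
    even though the conjugate directions implicitly exploit curvature.
    """
    w = w0.astype(float).copy()
    g = grad(w)
    d = -g                                   # start with steepest descent
    n = w.size
    for k in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        # Backtracking line search along d (Armijo sufficient-decrease test).
        alpha, fw, slope = 1.0, f(w), g.dot(d)
        while f(w + alpha * d) > fw + 1e-4 * alpha * slope and alpha > 1e-12:
            alpha *= 0.5
        w = w + alpha * d
        g_new = grad(w)
        # Polak-Ribiere+ coefficient; restart every n steps (cf. Powell's
        # restart procedures for conjugate gradient methods).
        beta = max(0.0, g_new.dot(g_new - g) / g.dot(g))
        if (k + 1) % n == 0:
            beta = 0.0
        d = -g_new + beta * d
        if d.dot(g_new) >= 0.0:              # not a descent direction: restart
            d = -g_new
        g = g_new
    return w

# Toy problem: minimize the convex quadratic f(w) = 0.5 w^T A w - b^T w,
# whose unique minimizer solves A w = b (here w* = [0.2, 0.4]).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
f = lambda w: 0.5 * w @ A @ w - b @ w
grad = lambda w: A @ w - b
w_star = cg_minimize(f, grad, np.zeros(2))
```

In a neural-network setting, `w` would be the flattened weight vector and `grad` the gradient computed by backpropagation; only the line search and direction updates differ from plain BP, which is why the memory cost stays linear in the number of weights.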




Editor information

Jürgen Dassow, Jozef Kelemen


Copyright information

© 1990 Springer-Verlag Berlin Heidelberg

Cite this paper

Møller, M.F. (1990). Learning by conjugate gradients. In: Dassow, J., Kelemen, J. (eds) Aspects and Prospects of Theoretical Computer Science. IMYCS 1990. Lecture Notes in Computer Science, vol 464. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-53414-8_41


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-53414-3

  • Online ISBN: 978-3-540-46869-1

