A double gradient algorithm to optimize regularization

  • Thomas Czernichow
Part II: Cortical Maps and Receptive Fields
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1327)


We present in this article a new technique for optimising the regularization parameter of a cost function. On the one hand, the derivatives of the cost function with respect to the weights allow the network to be optimised; on the other hand, the derivative of the cost function with respect to the regularization parameter allows the smoothness of the function realised by the network to be optimised. We show that by alternating between these two gradient descent optimisations we regulate the smoothness of a neural network. We present the results of this algorithm on a task designed to clearly expose the network's level of smoothness.


Keywords: Cost Function, Mean Square Error, Regularization Parameter, Optimal Weight, Ridge Regression




Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • Thomas Czernichow
  1. Instituto de Investigación Tecnológica, Universidad Pontificia Comillas, Madrid, Spain
