A double gradient algorithm to optimize regularization
We present a new technique for optimizing the regularization parameter of a cost function. On the one hand, the derivatives of the cost function with respect to the weights make it possible to optimize the network. On the other hand, the derivatives of the cost function with respect to the regularization parameter make it possible to optimize the smoothness of the function realized by the network. We show that by alternating between these two gradient descent optimizations we can regulate the smoothness of a neural network. We present the results of this algorithm on a task designed to clearly expose the network's level of smoothness.
Keywords: Cost Function, Mean Square Error, Regularization Parameter, Optimal Weight, Ridge Regression
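The alternation described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say how the derivative with respect to the regularization parameter is computed, so this sketch assumes a linear model with a ridge penalty, a held-out validation set, and an implicit-differentiation formula for the validation error. All function names, step sizes, and iteration counts (`alternate`, `lr_w`, `lr_lam`, `inner`, `outer`) are hypothetical.

```python
import numpy as np

def train_cost_grad_w(w, lam, X, y):
    """Gradient w.r.t. w of J(w, lam) = mean((Xw - y)^2) + lam * ||w||^2."""
    n = len(y)
    return 2.0 / n * X.T @ (X @ w - y) + 2.0 * lam * w

def val_cost(w, Xv, yv):
    """Mean square error on the validation set (assumed criterion for lam)."""
    return np.mean((Xv @ w - yv) ** 2)

def val_cost_grad_lam(w, lam, X, Xv, yv):
    """Derivative of the validation error w.r.t. lam.

    Assumes w minimizes J(., lam); then dw/dlam = -A^{-1} w with
    A = X^T X / n + lam * I, and the chain rule gives the result.
    Inside the loop below w is only close to the minimizer, so the
    value is an approximation there.
    """
    n, d = X.shape
    A = X.T @ X / n + lam * np.eye(d)
    g_v = 2.0 / len(yv) * Xv.T @ (Xv @ w - yv)  # d val_cost / d w
    return -g_v @ np.linalg.solve(A, w)

def alternate(X, y, Xv, yv, lam=1.0, lr_w=0.05, lr_lam=0.05,
              outer=200, inner=20):
    """Alternate gradient steps on the weights and on lam."""
    w = np.zeros(X.shape[1])
    for _ in range(outer):
        # Phase 1: gradient descent on the weights, lam held fixed.
        for _ in range(inner):
            w -= lr_w * train_cost_grad_w(w, lam, X, y)
        # Phase 2: one gradient step on lam, weights held fixed.
        # lam is clipped at zero to keep the penalty non-negative.
        step = lr_lam * val_cost_grad_lam(w, lam, X, Xv, yv)
        lam = max(lam - step, 0.0)
    return w, lam
```

A validation set is used for the second phase because the derivative of the regularized training cost with respect to the parameter is simply the squared weight norm, which is never negative and would only push the parameter toward zero. Whether the paper uses the same criterion is not stated in the abstract.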