ICANN ’94 pp 693-696

Regularizing Stochastic Pott Neural Networks by Penalizing Mutual Information

  • G. Deco
  • T. Martinetz


In this paper we present a method for eliminating overtraining during learning on small and noisy data sets. The key idea is to reduce the complexity of the neural network by increasing the stochasticity of the information transmission from the input layer to the hidden layer. The architecture of the network is a stochastic multilayer perceptron whose hidden layer behaves like a Pott-Spin. The stochasticity is increased by penalizing the mutual information between the input and its internal representation in the hidden layer. Theoretical and empirical studies validate the usefulness of this novel approach to the problem of overtraining.
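The penalty described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the hidden layer selects exactly one of K states with softmax probabilities p(h | x), that the training inputs are equiprobable, and that the penalty is added to the data error with a weight lam. The names softmax, beta, mutual_information, penalized_loss and lam are all hypothetical. Under these assumptions the mutual information is I(X; H) = H(mean over x of p(h | x)) - mean over x of H(p(h | x)).

```python
import math


def softmax(z, beta=1.0):
    """Probabilities of a K-state hidden unit (assumed Potts-like: one state active).

    beta is a hypothetical inverse temperature; smaller beta means more noise.
    """
    m = max(z)  # subtract the maximum for numerical stability
    e = [math.exp(beta * (v - m)) for v in z]
    s = sum(e)
    return [v / s for v in e]


def entropy(p):
    """Shannon entropy in nats; zero-probability states contribute nothing."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def mutual_information(cond_probs):
    """I(X; H) for cond_probs[i][k] = p(h = k | x_i), inputs assumed equiprobable."""
    n = len(cond_probs)
    k = len(cond_probs[0])
    # Marginal distribution of the hidden state, averaged over the inputs.
    marginal = [sum(row[j] for row in cond_probs) / n for j in range(k)]
    # Average entropy of the hidden state given each input.
    cond_entropy = sum(entropy(row) for row in cond_probs) / n
    return entropy(marginal) - cond_entropy


def penalized_loss(data_error, cond_probs, lam=0.1):
    """Data error plus lam times the mutual-information penalty (assumed form)."""
    return data_error + lam * mutual_information(cond_probs)
```

In this sketch, a hidden layer that maps each input deterministically to its own state carries the most information about the input, while a hidden layer whose state distribution is the same for every input carries none. With beta = 0 every row returned by softmax is uniform, so the penalty is zero.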







Copyright information

© Springer-Verlag London Limited 1994

Authors and Affiliations

  • G. Deco
    • 1
  • T. Martinetz
    • 1
  1. Corporate Research and Development, ZFE ST SN 41, Siemens AG, Munich, Germany
