Regularizing Stochastic Pott Neural Networks by Penalizing Mutual Information
In this paper we present a method for eliminating overtraining when learning from small and noisy data sets. The key idea is to reduce the complexity of the neural network by increasing the stochasticity of the information transmission from the input layer to the hidden layer. The network is a stochastic multilayer perceptron whose hidden layer behaves like a Pott spin. The stochasticity is increased by penalizing the mutual information between the input and its internal representation in the hidden layer. Theoretical and empirical studies validate the usefulness of this novel approach to the problem of overtraining.
Keywords: Hidden Layer, Mutual Information, Input Layer, Penalty Term, Hidden Unit
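The penalty described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the Pott-spin hidden layer is modeled as a single K-state unit whose state probabilities are a softmax over linear scores, and that the mutual information is estimated over a finite set of equally weighted inputs as I(X; H) = H(mean P(h|x)) - mean H(P(h|x)). All function names and the penalty weight `lam` are hypothetical.

```python
import math


def softmax(scores):
    """Turn real-valued scores into a probability vector (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(p):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def hidden_distribution(x, weights, biases):
    """P(h = k | x) for a K-state hidden unit (assumed softmax form)."""
    scores = [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    return softmax(scores)


def mutual_information(inputs, weights, biases):
    """Empirical I(X; H) over equally weighted inputs, in nats."""
    conditionals = [hidden_distribution(x, weights, biases) for x in inputs]
    n = len(conditionals)
    k = len(biases)
    # Marginal P(h = j): average of the conditionals over the inputs.
    marginal = [sum(p[j] for p in conditionals) / n for j in range(k)]
    conditional_entropy = sum(entropy(p) for p in conditionals) / n
    return entropy(marginal) - conditional_entropy


def penalized_cost(data_error, inputs, weights, biases, lam):
    """Training cost: data error plus lam times the mutual information."""
    return data_error + lam * mutual_information(inputs, weights, biases)


# Two toy inputs and a 2-state hidden unit.
inputs = [[1.0, 0.0], [0.0, 1.0]]
sharp_weights = [[10.0, -10.0], [-10.0, 10.0]]  # nearly deterministic mapping
flat_weights = [[0.0, 0.0], [0.0, 0.0]]         # fully stochastic mapping
biases = [0.0, 0.0]

mi_sharp = mutual_information(inputs, sharp_weights, biases)
mi_flat = mutual_information(inputs, flat_weights, biases)
```

In this toy setting the nearly deterministic mapping carries close to log 2 nats about the input, while the fully stochastic one carries none. A positive `lam` therefore pushes training toward noisier, lower-complexity hidden representations.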