Weight discretization due to optical constraints and its influence on the generalization abilities of a simple perceptron
Motivated by the optical implementation of neural networks, which can be realized by storing the weights in holograms with a limited number of gray values, we investigate how the generalization and training errors of a simple perceptron with discrete weights depend on the number of allowed discrete values (2^p allowed values for a bit precision of p) and on the training set size. Our starting point is the teacher-pupil paradigm. The teacher is defined by fixing its continuous weights to random values. The pupil network, which is restricted to discrete weight values, was trained by simulated annealing to learn the rule produced by the teacher. For α < α_s, where α encodes the training set size, weight configurations exist such that the training set can be reproduced without error, whereas the generalization error remains nonzero. For α > α_s there is no weight configuration of the pupil that reproduces the training set without error, and for α → ∞ both training and generalization errors converge asymptotically to a minimum value ε_min. We found no remarkable improvement in the generalization ability of the pupil perceptron between precisions of 5 bit and 8 bit. This result is very useful for the optical implementation, since optical constraints for storing weights in holograms restrict the precision to a maximum of 6 bit.
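The setup described above can be sketched in a few lines of code. The following is a minimal illustration, not the authors' implementation: a teacher perceptron with continuous random weights labels a random training set, and a pupil whose weights are restricted to 2^p equidistant levels is annealed on the training error with a Metropolis acceptance rule. All parameter values (N, p, the cooling schedule, the weight range) are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 20    # input dimension (illustrative choice)
p = 3     # bit precision -> 2**p allowed weight values
P = 100   # number of training examples; alpha = P / N

# Teacher: continuous random weights define the rule to be learned.
w_teacher = rng.standard_normal(N)

# Training set: random binary inputs labeled by the teacher.
X = rng.choice([-1.0, 1.0], size=(P, N))
y = np.sign(X @ w_teacher)

# Discrete weight levels, here 2**p equidistant values in [-1, 1].
levels = np.linspace(-1.0, 1.0, 2 ** p)

def training_error(w):
    """Fraction of training examples the pupil misclassifies."""
    return np.mean(np.sign(X @ w) != y)

# Pupil: start from random discrete weights, anneal the training error.
w = rng.choice(levels, size=N)
T = 1.0
for step in range(20000):
    trial = w.copy()
    trial[rng.integers(N)] = rng.choice(levels)  # propose one discrete change
    dE = training_error(trial) - training_error(w)
    if dE <= 0 or rng.random() < np.exp(-dE / T):
        w = trial                                # Metropolis acceptance rule
    T *= 0.9995                                  # geometric cooling schedule

# Generalization error estimated on fresh examples from the teacher.
X_test = rng.choice([-1.0, 1.0], size=(2000, N))
eps_g = np.mean(np.sign(X_test @ w) != np.sign(X_test @ w_teacher))
print(f"training error: {training_error(w):.3f}, "
      f"generalization error: {eps_g:.3f}")
```

Sweeping p and P in such a sketch reproduces the qualitative picture in the abstract: for small α the annealer typically reaches zero training error while ε_g stays finite, and increasing p beyond a few bits yields diminishing returns.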
Keywords: Simulated Annealing Algorithm, Generalization Ability, Generalization Error, Continuous Weight, Optical Implementation