Abstract
We first review, in pedagogical fashion, previous results which gave lower and upper bounds on the number of examples needed for training feedforward neural networks when valid generalization is desired. Experimental tests of generalization versus number of examples are then presented for random target networks and examples drawn from a uniform distribution. The experimental results are roughly consistent with the following heuristic: if a database of M examples is loaded onto a W-weight net (for M ≫ W), one expects to make a fraction ɛ = W/M of errors in classifying future examples drawn from the same distribution. This is consistent with our previous bounds but, if reliable, strengthens them in that: (1) the bounds had large numerical constants and log factors, all of which are set equal to one in the heuristic; (2) previous lower bounds on the number of examples needed were valid only in a distribution-independent context, whereas the experiments were conducted for a uniform distribution; and (3) the previous lower bound was valid only for nets with one hidden layer. These experiments also seem to indicate that networks with two hidden layers have Vapnik-Chervonenkis dimension roughly equal to their total number of weights.
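As a back-of-the-envelope illustration (not part of the paper's formal results), the heuristic ɛ ≈ W/M can be turned around to estimate how many examples a given net needs for a target error rate; the function names below are illustrative, not from the paper:

```python
# Sketch of the abstract's heuristic: a net with W weights trained on
# M >> W examples is expected to misclassify roughly a fraction
# eps = W / M of future examples drawn from the same distribution.

def expected_error(num_weights, num_examples):
    """Heuristic error rate eps = W / M (meaningful only for M >> W)."""
    return num_weights / num_examples

def examples_needed(num_weights, target_error):
    """Invert the heuristic: M = W / eps examples for target error eps."""
    return int(num_weights / target_error)

# Example: a 1,000-weight net targeting 5% error needs about
# 20,000 training examples under this heuristic.
print(examples_needed(1000, 0.05))  # -> 20000
```

Note that the heuristic drops the constants and log factors appearing in the rigorous bounds, so these figures are order-of-magnitude estimates only.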
We then consider the convergence of the k-nearest neighbor algorithm to a classifier making a fraction ɛ of errors when examples are drawn from the uniform distribution on Sⁿ, the unit sphere in n dimensions, and classified according to a simple target function. We prove that if the target function is a single half space, then for k appropriately chosen (k ≈ (n/ɛ²) ln(ɛ⁻¹)), k-nearest neighbor yields an ɛ-accurate classifier using a database of M = O((n/ɛ²) ln(ɛ⁻¹)) classified examples. However, when the target function is a union of two half spaces, k-nearest neighbor requires a number of examples exponential in n to achieve high accuracy.
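The single-half-space setting can be simulated directly. The sketch below is an illustration of the setup, not the paper's proof: it draws points uniformly on the unit sphere (normalized Gaussians), labels them by a half space through the origin, and measures the empirical error of a k-NN majority vote. The dimension, sample sizes, and k are arbitrary demo choices.

```python
import random

def random_sphere_point(n, rng):
    """Uniform point on the unit sphere in n dimensions (normalized Gaussian)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v]

def half_space_label(x):
    """Target function: a single half space through the origin."""
    return 1 if x[0] >= 0 else 0

def knn_predict(train, x, k):
    """Majority vote of the k nearest neighbors of x.

    On the unit sphere, Euclidean distance is monotone in the negative
    dot product, so sorting by -dot gives nearest-first order.
    """
    nearest = sorted(train,
                     key=lambda p: -sum(a * b for a, b in zip(p[0], x)))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes >= k else 0

rng = random.Random(0)
n, M, k = 5, 2000, 25
train = [(p, half_space_label(p))
         for p in (random_sphere_point(n, rng) for _ in range(M))]
test = [(p, half_space_label(p))
        for p in (random_sphere_point(n, rng) for _ in range(400))]
err = sum(knn_predict(train, x, k) != y for x, y in test) / len(test)
print(f"empirical k-NN error: {err:.3f}")
```

With a database of this size the empirical error is small, consistent with the half-space case being learnable from feasibly many examples; the exponential lower bound for a union of two half spaces is not captured by so small a simulation.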
© 1990 Springer-Verlag Berlin Heidelberg
Cite this paper
Baum, E.B. (1990). When are k-nearest neighbor and back propagation accurate for feasible sized sets of examples?. In: Almeida, L.B., Wellekens, C.J. (eds) Neural Networks. EURASIP 1990. Lecture Notes in Computer Science, vol 412. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-52255-7_24
Print ISBN: 978-3-540-52255-3
Online ISBN: 978-3-540-46939-1