Abstract
We consider the problem of efficient approximate learning by multi-layered feedforward circuits with respect to two objective functions.
First, we consider the objective of maximizing the ratio of correctly classified points to the training set size (e.g., see [3],[5]). We show that for single hidden layer threshold circuits with n hidden nodes and varying input dimension, approximating this ratio within a relative error c/n³, for some positive constant c, is NP-hard even if the number of examples is limited with respect to n. For architectures with two hidden nodes (e.g., as in [6]), approximating the objective within some fixed factor is NP-hard even if any sigmoid-like activation function in the hidden layer and ε-separation of the output [19] are considered, or if the semilinear activation function is substituted for the threshold function.
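To make the first objective concrete, the following minimal sketch (our illustration; all identifiers are hypothetical and not from the paper) shows a single hidden layer threshold circuit and the success ratio by which it is scored:

    import numpy as np

    def threshold_circuit(x, W, b, v, c0):
        # One hidden layer of n Heaviside (threshold) gates feeding a
        # single threshold output gate; W, b are the hidden weights and
        # biases, v, c0 the output weights and bias.
        hidden = (W @ x + b > 0).astype(float)
        return float(v @ hidden + c0 > 0)

    def success_ratio(samples, W, b, v, c0):
        # Fraction of (x, y) pairs classified correctly: the quantity
        # whose maximization the paper shows is hard to approximate
        # within relative error c/n^3.
        correct = sum(threshold_circuit(x, W, b, v, c0) == y
                      for x, y in samples)
        return correct / len(samples)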
Next, we consider the objective of minimizing the failure ratio [2]. We show that it is NP-hard to approximate the failure ratio within every constant larger than 1 for a multi-layered threshold circuit, provided the input biases are zero. Furthermore, even weak approximation of this objective is almost NP-hard.
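For reference, the failure ratio of [2] can be stated as follows (our paraphrase; see [2] for the precise definition): on a training set S that is not exactly loadable, a hypothesis h from the class H realized by the architecture has

    \mathrm{fr}(h) \;=\; \frac{|\{(x,y)\in S : h(x)\neq y\}|}{\min_{h'\in H}\,|\{(x,y)\in S : h'(x)\neq y\}|},

so approximating this objective within a constant α > 1 means guaranteeing at most α times the minimum possible number of misclassified points.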
Research supported by NSF grant CCR-9800086.
References
[1] E. Amaldi and V. Kann, The complexity and approximability of finding maximum feasible subsystems of linear relations, Theoretical Computer Science 147(1–2), pp. 181–210, 1995.
[2] S. Arora, L. Babai, J. Stern, and Z. Sweedyk, The hardness of approximate optima in lattices, codes and systems of linear equations, Journal of Computer and System Sciences 54, pp. 317–331, 1997.
[3] P. Bartlett and S. Ben-David, Hardness results for neural network approximation problems, to appear in Theoretical Computer Science (conference version in Fischer, P. and Simon, H. U. (eds.), Computational Learning Theory, Lecture Notes in Artificial Intelligence 1572, Springer, pp. 639–644, 1999).
[4] M. Bellare, S. Goldwasser, C. Lund, and A. Russell, Efficient multi-prover interactive proofs with applications to approximation problems, in Proceedings of the 25th ACM Symposium on the Theory of Computing, pp. 113–131, 1993.
[5] S. Ben-David, N. Eiron, and P. M. Long, On the difficulty of approximately maximizing agreements, in Proceedings of the 13th Annual ACM Conference on Computational Learning Theory (COLT), 2000.
[6] A. Blum and R. L. Rivest, Training a 3-node neural network is NP-complete, Neural Networks 5, pp. 117–127, 1992.
[7] J. Brown, M. Garber, and S. Vanable, Artificial neural network on a SIMD architecture, in Proceedings of the 2nd Symposium on the Frontier of Massively Parallel Computation, Fairfax, VA, pp. 43–47, 1988.
[8] B. DasGupta, H. T. Siegelmann, and E. D. Sontag, On the intractability of loading neural networks, in Roychowdhury, V. P., Siu, K. Y., and Orlitsky, A. (eds.), Theoretical Advances in Neural Computation and Learning, Kluwer Academic Publishers, pp. 357–389, 1994.
[9] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, San Francisco, 1979.
[10] B. Hammer, Some complexity results for perceptron networks, in Niklasson, L., Bodén, M., and Ziemke, T. (eds.), ICANN'98, Springer, pp. 639–644, 1998.
[11] B. Hammer, Training a sigmoidal network is difficult, in Verleysen, M. (ed.), European Symposium on Artificial Neural Networks, D-Facto Publications, pp. 255–260, 1998.
[12] K.-U. Höffgen, Computational limitations on training sigmoid neural networks, Information Processing Letters 46(6), pp. 269–274, 1993.
[13] K.-U. Höffgen, H.-U. Simon, and K. S. Van Horn, Robust trainability of single neurons, Journal of Computer and System Sciences 50(1), pp. 114–125, 1995.
[14] L. K. Jones, The computational intractability of training sigmoidal neural networks, IEEE Transactions on Information Theory 43(1), pp. 167–173, 1997.
[15] J. S. Judd, On the complexity of loading shallow networks, Journal of Complexity 4(3), pp. 177–192, 1988.
[16] J. S. Judd, Neural Network Design and the Complexity of Learning, MIT Press, Cambridge, MA, 1990.
[17] V. Kann, S. Khanna, J. Lagergren, and A. Panconesi, On the hardness of approximating max-k-cut and its dual, Technical Report CJTCS-1997-2, Chicago Journal of Theoretical Computer Science, 1997.
[18] C. Lund and M. Yannakakis, On the hardness of approximating minimization problems, Journal of the ACM 41(5), pp. 960–981, 1994.
[19] W. Maass, G. Schnitger, and E. D. Sontag, A comparison of the computational power of sigmoid versus Boolean threshold circuits, in Roychowdhury, V. P., Siu, K. Y., and Orlitsky, A. (eds.), Theoretical Advances in Neural Computation and Learning, Kluwer Academic Publishers, pp. 127–151, 1994.
[20] N. Megiddo, On the complexity of polyhedral separability, Discrete & Computational Geometry 3, pp. 325–337, 1988.
[21] C. H. Papadimitriou and M. Yannakakis, Optimization, approximation, and complexity classes, Journal of Computer and System Sciences 43, pp. 425–440, 1991.
[22] I. Parberry and G. Schnitger, Parallel computation with threshold functions, Journal of Computer and System Sciences 36(3), pp. 278–302, 1988.
[23] J. Šíma, Back-propagation is not efficient, Neural Networks 9(6), pp. 1017–1023, 1996.
[24] K.-Y. Siu, V. Roychowdhury, and T. Kailath, Discrete Neural Computation: A Theoretical Foundation, Prentice Hall, Englewood Cliffs, NJ, 1994.
[25] E. D. Sontag, Feedforward nets for interpolation and classification, Journal of Computer and System Sciences 45, pp. 20–48, 1992.
[26] M. Vidyasagar, A Theory of Learning and Generalization, Springer, 1997.
[27] V. H. Vu, On the infeasibility of training with small squared errors, in Jordan, M. I., Kearns, M. J., and Solla, S. A. (eds.), Advances in Neural Information Processing Systems 10, MIT Press, pp. 371–377, 1998.
[28] B. Widrow, R. G. Winter, and R. A. Baxter, Layered neural nets for pattern recognition, IEEE Transactions on Acoustics, Speech, and Signal Processing 36, pp. 1109–1117, 1988.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
Cite this paper
DasGupta, B., Hammer, B. (2000). On Approximate Learning by Multi-layered Feedforward Circuits. In: Arimura, H., Jain, S., Sharma, A. (eds) Algorithmic Learning Theory. ALT 2000. Lecture Notes in Computer Science, vol 1968. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-40992-0_20
DOI: https://doi.org/10.1007/3-540-40992-0_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41237-3
Online ISBN: 978-3-540-40992-2