Abstract
When designing neural networks (NNs), one must consider how easily the best architecture can be determined under the selected paradigm. One common choice is the multi-layer perceptron (MLP). MLPs have been theoretically proven to be universal approximators; however, a central issue is that the appropriate MLP architecture is, in general, not known a priori and must be determined heuristically. Several such approaches have been proposed in the past, but none has been shown to be generally applicable, and many depend on complex parameter selection and fine-tuning. In this paper we present a method that determines this architecture from basic theoretical considerations: namely, the information content of the sample and the number of variables. From these we derive a closed analytic formulation. We discuss the theory behind our formula and illustrate its application by solving a set of classification and regression problems from the University of California at Irvine (UCI) database repository.
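The abstract names two ingredients of the closed analytic formulation: the information content of the sample and the number of variables. The formula itself is not reproduced on this page, so the sketch below only illustrates how the first ingredient might be estimated in practice: measuring a sample's information content by lossless compression, a standard practical proxy for Kolmogorov complexity. The use of `zlib` and the `estimate_information_content` helper are illustrative assumptions, not the authors' actual estimator.

```python
import zlib
import numpy as np

def estimate_information_content(sample: np.ndarray) -> float:
    """Estimate the information content (in bits) of a data sample by
    lossless compression -- a common proxy for Kolmogorov complexity.
    NOTE: zlib is an illustrative stand-in; the paper's own estimator
    is not reproduced here."""
    raw = sample.tobytes()
    compressed = zlib.compress(raw, level=9)
    return 8.0 * len(compressed)  # bits in the compressed representation

# Hypothetical usage on a small synthetic sample with n variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))  # 150 observations, 4 variables
bits = estimate_information_content(X)
n_vars = X.shape[1]
print(f"Estimated information content: {bits:.0f} bits over {n_vars} variables")
```

Mapping such an estimate, together with the number of variables, to a concrete architecture is precisely what the paper's closed analytic formulation provides; the full text should be consulted for that expression.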
Cite this paper
Kuri-Morales, A.F. (2014). The Best Neural Network Architecture. In: Gelbukh, A., Espinoza, F.C., Galicia-Haro, S.N. (eds) Nature-Inspired Computation and Machine Learning. MICAI 2014. Lecture Notes in Computer Science, vol 8857. Springer, Cham. https://doi.org/10.1007/978-3-319-13650-9_7