Abstract
Consider a multilayer perceptron (MLP) with d inputs, a single hidden sigmoidal layer and a linear output. By adding an additional d inputs to the network with values set to the square of the first d inputs, properties reminiscent of higher-order neural networks and radial basis function networks (RBFN) are added to the architecture with little added expense in terms of weight requirements. Of particular interest, this architecture has the ability to form localized features in a d-dimensional space with a single hidden node but can also span large volumes of the input space; thus, the architecture has the localized properties of an RBFN but does not suffer as badly from the curse of dimensionality. I refer to a network of this type as a SQuare Unit Augmented, Radially Extended, MultiLayer Perceptron (SQUARE-MLP or SMLP).
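The localization property described above can be seen directly: a single hidden unit receives both x and x², so with a negative weight on the squared inputs its pre-activation is a concave quadratic in x, i.e. a scaled −‖x − c‖², and the sigmoid response is a localized bump, exactly as in an RBF unit. The sketch below illustrates this; the function name `smlp_hidden` and the particular centre and gain are illustrative choices, not taken from the chapter.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def smlp_hidden(x, w, v, b):
    """One SMLP hidden unit: the d ordinary inputs x are augmented
    with d squared inputs x**2 (no extra hidden nodes required).

    With v < 0 the pre-activation w.x + v.x^2 + b is a concave
    quadratic, so the unit responds locally around a point, as in an
    RBF network; with v = 0 it reduces to a standard sigmoid ridge.
    """
    return sigmoid(x @ w + (x ** 2) @ v + b)

# Hypothetical example in d = 2: pick w, v, b so the unit peaks at c = (1, -1).
# Expanding -||x - c||^2 = -||x||^2 + 2 c.x - ||c||^2 gives, up to a gain g:
#   v = -g * 1,  w = g * 2c,  b = -g * ||c||^2
c = np.array([1.0, -1.0])
g = 4.0
w = g * 2 * c
v = -g * np.ones(2)
b = -g * (c @ c)

print(smlp_hidden(c, w, v, b))                      # at the centre: sigmoid(0) = 0.5
print(smlp_hidden(np.array([4.0, -1.0]), w, v, b))  # 3 units away: essentially 0
```

Note that the same unit, with `v = 0`, spans an unbounded half-space of the input, which is why the architecture keeps the MLP's ability to cover large volumes while gaining RBF-like locality.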
Previously published in: Orr, G.B. and Müller, K.-R. (Eds.): LNCS 1524, ISBN 978-3-540-65311-0 (1998).
© 2012 Springer-Verlag Berlin Heidelberg
Flake, G.W. (2012). Square Unit Augmented, Radially Extended, Multilayer Perceptrons. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science, vol. 7700. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35289-8_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35288-1
Online ISBN: 978-3-642-35289-8
eBook Packages: Computer Science, Computer Science (R0)