
Abstract

The application of artificial neural networks to complex real-world problems usually requires a modularization of the network architecture. The individual modules deal with subtasks defined by a decomposition of the problem. Up to now, this modularization has usually been done heuristically, and little is known about principled methods for adapting the network structure to the problem at hand. Incrementally constructed cascade architectures are a promising approach to growing networks according to the needs of the problem. This paper discusses the properties of the recently proposed direct cascade architecture DCA (Littmann & Ritter 1992). One important virtue of DCA is that it allows the cascading of entire subnetworks, even if these do not admit error backpropagation. Exploiting this flexibility and using local linear map (LLM) networks as cascaded elements, we show that the performance of the resulting network cascades can be greatly enhanced compared to the performance of a single network. Our results for the Mackey-Glass time series prediction task indicate that such deeply cascaded network architectures achieve good generalization even on small data sets, where shallow, broad architectures of comparable size suffer from overfitting. We conclude that the DCA approach offers a powerful and flexible alternative to existing schemes, such as the mixtures-of-experts approach, for the construction of modular systems from a wide range of subnetwork types.
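
For reference, the Mackey-Glass benchmark series (Mackey & Glass 1977) is generated by the delay-differential equation

    dx/dt = a · x(t − τ) / (1 + x(t − τ)^10) − b · x(t),

commonly integrated with a = 0.2, b = 0.1, and delay τ = 17 for this prediction task; the abstract does not state the exact settings used here.

The sketch below illustrates the cascading idea as we read it from the abstract, not the authors' DCA/LLM implementation: each stage receives the original input augmented by the previous stage's output and is trained directly against the target, so no error signal has to be propagated back through earlier stages. The RidgeStage placeholder, the function names, and the choice of five stages are illustrative assumptions; the paper uses LLM networks as the cascaded modules.

    # Minimal sketch of a direct cascade (illustrative, not the authors' code).
    # Assumed data shapes: X is (N, d) delayed samples of the series,
    # y is (N,) the value to be predicted some steps ahead.
    import numpy as np

    class RidgeStage:
        """Placeholder module: ridge regression with a bias term.
        The paper cascades local linear map (LLM) networks instead."""
        def __init__(self, reg=1e-3):
            self.reg = reg

        def fit(self, X, y):
            Xb = np.hstack([X, np.ones((len(X), 1))])          # add bias column
            A = Xb.T @ Xb + self.reg * np.eye(Xb.shape[1])
            self.w = np.linalg.solve(A, Xb.T @ y)
            return self

        def predict(self, X):
            Xb = np.hstack([X, np.ones((len(X), 1))])
            return Xb @ self.w

    def fit_direct_cascade(X, y, n_stages=5, make_stage=RidgeStage):
        """Train stages sequentially; stage k sees [X, prediction of stage k-1]
        and is fitted directly against the target y (no backprop across stages)."""
        stages, y_hat = [], None
        for _ in range(n_stages):
            X_aug = X if y_hat is None else np.hstack([X, y_hat[:, None]])
            stage = make_stage().fit(X_aug, y)
            y_hat = stage.predict(X_aug)
            stages.append(stage)
        return stages

    def predict_direct_cascade(stages, X):
        """Run the cascade forward, feeding each stage's output to the next."""
        y_hat = None
        for stage in stages:
            X_aug = X if y_hat is None else np.hstack([X, y_hat[:, None]])
            y_hat = stage.predict(X_aug)
        return y_hat

Because each stage is fitted on its own, any trainable regressor can serve as a module, which is what makes it possible to cascade subnetworks that admit no error backpropagation. Note that with purely linear placeholder stages the cascade adds no modelling power; a nonlinear module such as an LLM network is what makes the stacking useful.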

Keywords

Neural Information Processing System · Neural Computation · Normalized Root Mean Square Error · Training Epoch · Neural Module

References

  1. Baffes, P., & J. Zelle (1992). Growing layers of perceptrons: Introducing the extentron algorithm. Proceedings of the International Joint Conference on Neural Networks, volume II (pp. 392–397). Baltimore, MD.
  2. Baum, E., & D. Haussler (1989). What size net gives valid generalization? Neural Computation 1:151–160.
  3. Crowder, R. S. (1990). Predicting the Mackey-Glass time series with cascade-correlation learning. In D. S. Touretzky, J. L. Elman, T. J. Sejnowski, & G. E. Hinton (eds.), Connectionist Models: Proceedings of the 1990 Summer School (pp. 524–532). San Mateo, CA: Morgan Kaufmann.
  4. Fahlman, S. E. (1991). The recurrent cascade-correlation architecture. In R. P. Lippmann, J. E. Moody, & D. S. Touretzky (eds.), Advances in neural information processing systems 3 (pp. 190–196). San Mateo, CA: Morgan Kaufmann.
  5. Frean, M. (1990). The upstart algorithm: A method for constructing and training feedforward neural networks. Neural Computation 2:198–209.
  6. Hartman, E., & J. D. Keeler (1991). Predicting the future: Advantages of semilocal units. Neural Computation 3:566–578.
  7. Jacobs, R., & M. Jordan (1991). A competitive modular connectionist architecture. In R. P. Lippmann, J. E. Moody, & D. S. Touretzky (eds.), Advances in neural information processing systems 3 (pp. 767–773). San Mateo, CA: Morgan Kaufmann.
  8. Jacobs, R., M. Jordan, S. Nowlan, & G. Hinton (1991). Adaptive mixtures of local experts. Neural Computation 3:79–87.
  9. Lapedes, A., & R. Farber (1987). Nonlinear signal processing using neural networks: Prediction and system modeling. Technical Report LA-UR-87-2662, Los Alamos National Laboratory, Los Alamos, NM.
  10. LeCun, Y., J. S. Denker, & S. A. Solla (1990). Optimal brain damage. In D. S. Touretzky (ed.), Advances in neural information processing systems 2 (pp. 598–605). San Mateo, CA: Morgan Kaufmann.
  11. Littmann, E., & H. Ritter (1992). Cascade network architectures. Proceedings of the International Joint Conference on Neural Networks, volume II (pp. 398–404). Baltimore, MD.
  12. Littmann, E., & H. Ritter (1993). Generalization abilities of cascade network architectures. In C. L. Giles, S. J. Hanson, & J. D. Cowan (eds.), Advances in neural information processing systems 5 (pp. 188–195). San Mateo, CA: Morgan Kaufmann.
  13. Littmann, E., & H. Ritter (1994a). Analysis and applications of the direct cascade architecture. Technical Report TR 94-2, Department of Computer Science, Bielefeld University, Bielefeld, Germany.
  14. Littmann, E., & H. Ritter (1996). Learning and generalization in cascade network architectures. Neural Computation 8(7):1521–1540.
  15. Mackey, M., & L. Glass (1977). Oscillations and chaos in physiological control systems. Science 197:287–289.
  16. Meyering, A., & H. Ritter (1992). Learning 3D shape perception with local linear maps. Proceedings of the International Joint Conference on Neural Networks, volume IV (pp. 432–436). Baltimore, MD.
  17. Mézard, M., & J. P. Nadal (1989). Learning in feedforward layered networks: The tiling algorithm. Journal of Physics A 22:2191–2204.
  18. Minsky, M. L., & S. A. Papert (1969). Perceptrons. Cambridge, MA: MIT Press.
  19. Moody, J., & C. Darken (1988). Learning with localized receptive fields. Connectionist Models: Proceedings of the 1988 Summer School (pp. 133–143). San Mateo, CA: Morgan Kaufmann.
  20. Mozer, M. (1989). A focused back-propagation algorithm for temporal pattern recognition. Complex Systems 3:349–381.
  21. Nabhan, T., & A. Zomaya (1994). Toward generating neural network structures for function approximation. Neural Networks 7:89–99.
  22. Nowlan, S. J., & G. E. Hinton (1991). Evaluation of adaptive mixtures of competing experts. In R. P. Lippmann, J. E. Moody, & D. S. Touretzky (eds.), Advances in neural information processing systems 3 (pp. 774–780). San Mateo, CA: Morgan Kaufmann.
  23. Ritter, H. (1991). Learning with the self-organizing map. In T. Kohonen, K. Mäkisara, O. Simula, & J. Kangas (eds.), Artificial neural networks 1 (pp. 357–364). Amsterdam: Elsevier.
  24. Ritter, H., T. Martinetz, & K. Schulten (1992). Neural computation and self-organizing maps: An introduction (English and German editions). New York: Addison-Wesley.
  25. Rumelhart, D. E., G. E. Hinton, & R. J. Williams (1986). Learning internal representations by error propagation. In D. E. Rumelhart & J. L. McClelland (eds.), Parallel distributed processing 1. Cambridge, MA: MIT Press.
  26. Stokbro, K., D. Umberger, & J. Hertz (1990). Exploiting neurons with localized receptive fields to learn chaos. Complex Systems 4:603–622.

Copyright information

© Springer Science+Business Media Dordrecht 2000

Authors and Affiliations

  • Enno Littmann (1)
  • Helge Ritter (2)
  1. Dornier GmbH, VAFA 1, Friedrichshafen, Germany
  2. Universität Bielefeld, Germany
