Part of the book series: Studies in Cognitive Systems (COGS, volume 26)

Abstract

The application of artificial neural networks to complex real-world problems usually requires a modularization of the network architecture, with individual modules handling subtasks defined by a decomposition of the problem. To date, this modularization has usually been done heuristically; little is known about principled methods for adapting the network structure to the problem at hand. Incrementally constructed cascade architectures are a promising approach to growing networks according to the needs of the problem. This paper discusses the properties of the recently proposed direct cascade architecture DCA (Littmann & Ritter 1992). One important virtue of DCA is that it allows the cascading of entire subnetworks, even if these admit no error backpropagation. Exploiting this flexibility and using local linear map (LLM) networks as cascaded elements, we show that the performance of the resulting network cascades can be greatly enhanced compared to that of a single network. Our results on the Mackey-Glass time series prediction task indicate that such deeply cascaded network architectures achieve good generalization even on small data sets, where shallow, broad architectures of comparable size suffer from overfitting. We conclude that the DCA approach offers a powerful and flexible alternative to existing schemes, such as the mixtures-of-experts approach, for constructing modular systems from a wide range of subnetwork types.
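
The sketch below is a minimal illustration of the cascading principle described in the abstract, not the authors' implementation. Each stage is a small local linear map (LLM) subnetwork fitted in closed form by least squares, so no error signal is ever back-propagated through earlier stages, and each stage receives the raw input augmented with the preceding stage's output (one simple cascading scheme). All names and settings here (mackey_glass, embed, LLMStage, fit_cascade, prototype selection by random sampling in place of vector quantization, the cascade depth, and the embedding parameters) are illustrative assumptions rather than values from the chapter. The series itself comes from Euler integration of the Mackey-Glass delay differential equation dx/dt = a x(t - tau) / (1 + x(t - tau)^10) - b x(t) (Mackey & Glass 1977).

```python
# Minimal sketch of the direct-cascade idea on Mackey-Glass prediction.
# Assumptions, not taken from the chapter: the LLM variant below
# (nearest-prototype gating with least-squares local maps, prototypes
# drawn at random rather than by vector quantization), the cascade
# depth, and the embedding parameters are all illustrative.
import numpy as np

def mackey_glass(n, tau=17, a=0.2, b=0.1, dt=1.0, x0=1.2):
    """Euler integration of dx/dt = a*x(t-tau)/(1 + x(t-tau)**10) - b*x(t)."""
    lag = int(tau / dt)
    x = np.full(n + lag, x0)
    for t in range(lag, n + lag - 1):
        xd = x[t - lag]
        x[t + 1] = x[t] + dt * (a * xd / (1.0 + xd ** 10) - b * x[t])
    return x[lag:]

def embed(series, order=4, spacing=6, horizon=6):
    """Predict x(t + horizon) from `order` past samples spaced `spacing` apart."""
    start = (order - 1) * spacing
    stop = len(series) - horizon
    X = np.stack([series[start - k * spacing : stop - k * spacing]
                  for k in range(order)], axis=1)
    return X, series[start + horizon:]

class LLMStage:
    """One cascade stage: a local linear map (LLM) subnetwork, fitted in
    closed form, so no backpropagation through it is ever required."""
    def __init__(self, n_units=10, seed=0):
        self.n_units, self.rng = n_units, np.random.default_rng(seed)

    def fit(self, X, y):
        # Prototypes: random training points (a stand-in for the vector
        # quantization an LLM network would normally use).
        self.protos = X[self.rng.choice(len(X), self.n_units, replace=False)]
        Xa = np.hstack([X, np.ones((len(X), 1))])      # affine term
        winners = self._winners(X)
        self.maps = []
        for i in range(self.n_units):
            m = winners == i
            enough = m.sum() > Xa.shape[1]             # else fall back to a global fit
            w, *_ = np.linalg.lstsq(Xa[m] if enough else Xa,
                                    y[m] if enough else y, rcond=None)
            self.maps.append(w)
        return self

    def _winners(self, X):
        d = ((X[:, None, :] - self.protos[None]) ** 2).sum(-1)
        return d.argmin(1)                             # nearest prototype per row

    def predict(self, X):
        Xa = np.hstack([X, np.ones((len(X), 1))])
        W = np.stack(self.maps)[self._winners(X)]      # each row's local map
        return (Xa * W).sum(1)

def fit_cascade(X, y, depth=5):
    """Direct cascade: each stage is trained in isolation on the raw input
    augmented with the previous stage's output."""
    stages, Xk = [], X
    for k in range(depth):
        stage = LLMStage(seed=k).fit(Xk, y)
        stages.append(stage)
        Xk = np.hstack([X, stage.predict(Xk)[:, None]])
    return stages

def predict_cascade(stages, X):
    Xk, out = X, None
    for stage in stages:
        out = stage.predict(Xk)
        Xk = np.hstack([X, out[:, None]])
    return out

series = mackey_glass(1500)
X, y = embed(series)
Xtr, ytr, Xte, yte = X[:500], y[:500], X[500:], y[500:]
cascade = fit_cascade(Xtr, ytr)
nmse = np.mean((predict_cascade(cascade, Xte) - yte) ** 2) / yte.var()
print(f"test NMSE of the cascade: {nmse:.4f}")
```

A shallow, broad baseline for the comparison drawn in the abstract would be a single LLMStage with roughly depth x n_units units trained on the same embedding.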


References

  • Baffes, P., & J. Zelle (1992). Growing layers of perceptrons: Introducing the extentron algorithm. Proceedings of the International Joint Conference on Neural Networks, volume II (pp. 392–397). Baltimore, MD.

  • Baum, E., & D. Haussler (1989). What size net gives valid generalization? Neural Computation 1:151–160.

  • Crowder, R. S. (1990). Predicting the Mackey-Glass time series with cascade-correlation learning. In D. S. Touretzky, J. L. Elman, T. J. Sejnowski, & G. E. Hinton (eds.), Connectionist Models: Proceedings of the 1990 Summer School (pp. 524–532). San Mateo, CA: Morgan Kaufmann.

  • Fahlman, S. E. (1991). The recurrent cascade-correlation architecture. In R. P. Lippmann, J. E. Moody, & D. S. Touretzky (eds.), Advances in neural information processing systems 3 (pp. 190–196). San Mateo, CA: Morgan Kaufmann.

  • Frean, M. (1990). The upstart algorithm: A method for constructing and training feedforward neural networks. Neural Computation 2:198–209.

  • Hartmann, E., & J. D. Keeler (1991). Predicting the future: Advantages of semilocal units. Neural Computation 3:566–578.

  • Jacobs, R., & M. Jordan (1991). A competitive modular connectionist architecture. In R. P. Lippmann, J. E. Moody, & D. S. Touretzky (eds.), Advances in neural information processing systems 3 (pp. 767–773). San Mateo, CA: Morgan Kaufmann.

  • Jacobs, R., M. Jordan, S. Nowlan, & G. Hinton (1991). Adaptive mixtures of local experts. Neural Computation 3:79–87.

  • Lapedes, A., & R. Farber (1987). Nonlinear signal processing using neural networks: Prediction and system modeling. Technical Report LA-UR-87-2662, Los Alamos National Laboratory, Los Alamos, NM.

  • LeCun, Y., J. D. Denker, & S. A. Solla (1990). Optimal brain damage. In D. S. Touretzky (ed.), Advances in neural information processing systems 2 (pp. 598–605). San Mateo, CA: Morgan Kaufmann.

  • Littmann, E., & H. Ritter (1992). Cascade network architectures. Proceedings of the International Joint Conference on Neural Networks, volume II (pp. 398–404). Baltimore, MD.

  • Littmann, E., & H. Ritter (1993). Generalization abilities of cascade network architectures. In C. L. Giles, S. J. Hanson, & J. D. Cowan (eds.), Advances in neural information processing systems 5 (pp. 188–195). San Mateo, CA: Morgan Kaufmann.

  • Littmann, E., & H. Ritter (1994a). Analysis and applications of the direct cascade architecture. Technical Report TR 94-2, Department of Computer Science, Bielefeld University, Bielefeld, FR Germany.

  • Littmann, E., & H. Ritter (1996). Learning and generalization in cascade network architectures. Neural Computation 8(7):1521–1540.

  • Mackey, M., & L. Glass (1977). Oscillations and chaos in physiological control systems. Science 197:287–289.

  • Meyering, A., & H. Ritter (1992). Learning 3D shape perception with local linear maps. Proceedings of the International Joint Conference on Neural Networks, volume IV (pp. 432–436). Baltimore, MD.

  • Mézard, M., & J. P. Nadal (1989). Learning in feedforward layered networks: The tiling algorithm. Journal of Physics A 22:2191–2204.

  • Minsky, M. L., & S. A. Papert (1969). Perceptrons. Cambridge, MA: MIT Press.

  • Moody, J., & C. Darken (1988). Learning with localized receptive fields. Connectionist Models: Proceedings of the 1988 Summer School (pp. 133–143). San Mateo, CA: Morgan Kaufmann.

  • Mozer, M. (1989). A focused back-propagation algorithm for temporal pattern recognition. Complex Systems 3:349–381.

  • Nabhan, T., & A. Zomaya (1994). Toward generating neural network structures for function approximation. Neural Networks 7:89–99.

  • Nowlan, S. J., & G. E. Hinton (1991). Evaluation of adaptive mixtures of competing experts. In R. P. Lippmann, J. E. Moody, & D. S. Touretzky (eds.), Advances in neural information processing systems 3 (pp. 774–780). San Mateo, CA: Morgan Kaufmann.

  • Ritter, H. (1991). Learning with the self-organizing map. In T. Kohonen, K. Mäkisara, O. Simula, & J. Kangas (eds.), Artificial neural networks 1 (pp. 357–364). Amsterdam: Elsevier.

  • Ritter, H., T. Martinetz, & K. Schulten (1992). Neural computation and self-organizing maps: An introduction (English and German). New York: Addison-Wesley.

  • Rumelhart, D. E., G. E. Hinton, & R. J. Williams (1986). Learning internal representations by error propagation. In D. E. Rumelhart & J. L. McClelland (eds.), Parallel distributed processing 1. Cambridge, MA: MIT Press.

  • Stokbro, K., D. Umberger, & J. Hertz (1990). Exploiting neurons with localized receptive fields to learn chaos. Complex Systems 4:603–622.


Copyright information

© 2000 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Littmann, E., Ritter, H. (2000). Modularization by Cascading Neural Networks. In: Cruse, H., Dean, J., Ritter, H. (eds) Prerational Intelligence: Adaptive Behavior and Intelligent Systems Without Symbols and Logic, Volume 1, Volume 2 Prerational Intelligence: Interdisciplinary Perspectives on the Behavior of Natural and Artificial Systems, Volume 3. Studies in Cognitive Systems, vol 26. Springer, Dordrecht. https://doi.org/10.1007/978-94-010-0870-9_39

  • DOI: https://doi.org/10.1007/978-94-010-0870-9_39

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-010-3792-1

  • Online ISBN: 978-94-010-0870-9
