Recurrent neural network architectures: An overview

  • Ah Chung Tsoi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1387)


In this paper, we first consider a number of popular recurrent neural network architectures. Two subclasses of general recurrent neural network architectures are then introduced, and it is shown that each of the popular architectures can be grouped under one of these two subclasses. It is further argued that the two subclasses are distinct, in that a network of one form cannot be transformed into the other. Finally, two recently introduced recurrent neural network architectures designed for special purposes are considered: one for overcoming long-term temporal dependencies, and one for the classification of data structures.
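To make the notion of a recurrent architecture concrete, the following is a minimal sketch of one forward step of an Elman-style network [8], in which the hidden state feeds back into itself (global recurrence). All dimensions, weight names, and initializations here are illustrative choices, not taken from the paper.

```python
import numpy as np

# Illustrative sizes and randomly initialized weights (not from the paper).
rng = np.random.default_rng(0)
n_in, n_hid, n_out = 3, 5, 2

W_in  = rng.normal(scale=0.1, size=(n_hid, n_in))   # input  -> hidden
W_rec = rng.normal(scale=0.1, size=(n_hid, n_hid))  # hidden -> hidden (feedback loop)
W_out = rng.normal(scale=0.1, size=(n_out, n_hid))  # hidden -> output

def step(x_t, h_prev):
    """One time step: the previous hidden state re-enters the hidden layer."""
    h_t = np.tanh(W_in @ x_t + W_rec @ h_prev)
    y_t = W_out @ h_t
    return h_t, y_t

# Run the network over a short random input sequence.
h = np.zeros(n_hid)
for x in rng.normal(size=(4, n_in)):
    h, y = step(x, h)
```

The `W_rec @ h_prev` term is what distinguishes this from a feedforward network: the hidden state carries information across time steps.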

Once the architectural aspects of this class of networks are settled, one can consider how to train them. Training is treated in a companion paper [31].


Keywords: Finite Impulse Response; Recurrent Neural Network; Infinite Impulse Response; Finite Impulse Response Filter; Hidden Layer Neuron
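The FIR/IIR distinction in the keywords (cf. Back and Tsoi [2]) can be illustrated with two scalar synapse filters: an FIR synapse remembers only a finite window of past inputs, while an IIR synapse feeds its own output back and so has unbounded memory. The function names and coefficients below are arbitrary illustrations, not the paper's notation.

```python
import numpy as np

def fir_synapse(x, b):
    """Finite impulse response: output depends on a finite window of past inputs."""
    y = np.zeros_like(x, dtype=float)
    for n in range(len(x)):
        for k, bk in enumerate(b):
            if n - k >= 0:
                y[n] += bk * x[n - k]
    return y

def iir_synapse(x, b0, a1):
    """Infinite impulse response: the output feeds back, so memory never truncates."""
    y = np.zeros_like(x, dtype=float)
    for n in range(len(x)):
        y[n] = b0 * x[n] + (a1 * y[n - 1] if n > 0 else 0.0)
    return y

x = np.array([1.0, 0.0, 0.0, 0.0, 0.0])        # unit impulse input
fir_out = fir_synapse(x, b=[0.5, 0.3, 0.2])    # response dies out after 3 taps
iir_out = iir_synapse(x, b0=1.0, a1=0.5)       # response decays geometrically
```

Applied to a unit impulse, the FIR response is exactly zero after its last tap, while the IIR response decays but never vanishes, which is why IIR-based (recurrent) networks can in principle capture longer-range dependencies.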




References

  1. Albertini, F., Sontag, E. "For neural networks, function determines form". Neural Networks, Vol. 6, pp. 975–990, 1993.
  2. Back, A.D., Tsoi, A.C. "FIR and IIR synapses, a new neural network architecture for time series modelling". Neural Computation, Vol. 3, No. 3, pp. 375–385, 1991.
  3. Baldi, P., Chauvin, Y. "Hybrid modelling, HMM/NN architectures, and protein modelling". Neural Computation, Vol. 8, No. 7, pp. 1541–1565, 1996.
  4. Bengio, Y., Simard, P., Frasconi, P. "Learning long-term dependencies with gradient descent is difficult". IEEE Trans. Neural Networks, Vol. 5, pp. 157–166, 1994.
  5. Box, G.E.P., Jenkins, G. Time Series Analysis. Holden Day, 1967.
  6. Calder, B., Grunwald, D., Jones, M., Lindsay, D., Martin, J., Mozer, M., Zorn, B. "Evidence-based static branch prediction using machine learning". ACM Transactions on Programming Languages and Systems, Vol. 19, pp. 188–222, 1997.
  7. Chen, S., Billings, S., Grant, P. "Nonlinear system identification using neural networks". International Journal of Control, Vol. 51, No. 6, pp. 1191–1214, 1990.
  8. Elman, J. "Finding structure in time". Cognitive Science, Vol. 14, pp. 179–211, 1990.
  9. Frasconi, P., Gori, M., Soda, G. "Local feedback multilayered networks". Neural Computation, Vol. 4, pp. 120–130, 1992.
  10. Haykin, S. Neural Networks: A Comprehensive Foundation. Macmillan College Publishing Co., 1994.
  11. Hornik, K. "Approximation capabilities of multilayer feedforward networks". Neural Networks, Vol. 4, pp. 251–257, 1991.
  12. Hochreiter, S., Schmidhuber, J. "Long short-term memory". Neural Computation, Vol. 9, pp. 1735–1780, 1997.
  13. Jordan, M. "Supervised learning and systems with excess degrees of freedom". COINS Technical Report 88-27, May 1988.
  14. Kailath, T. Linear Systems. Prentice Hall, Englewood Cliffs, N.J., 1980.
  15. Lawrence, S., Giles, L., Back, A., Tsoi, A.C. "The gamma MLP — multiple temporal resolutions, the curse of dimensionality, and gradient descent learning". Neural Computation, to appear.
  16. Lapedes, A., Farber, R. "Nonlinear signal processing using neural networks: prediction and system modelling". Los Alamos National Laboratory, Los Alamos, LA-UR-262, 1987.
  17. Lin, T., Horne, B.G., Giles, L. "How embedding memory in recurrent neural network architecture helps learning long term temporal dependencies". Technical Report UMIACS-TR-96-76 and CS-TR-3706, Institute for Advanced Computer Studies, University of Maryland, College Park, Maryland, 1996.
  18. Marple, S.L. Digital Spectral Analysis and Applications. Prentice Hall, Englewood Cliffs, N.J., 1987.
  19. Narendra, K.S., Parthasarathy, K. "Identification and control of dynamical systems using neural networks". IEEE Trans. Neural Networks, Vol. 1, pp. 4–27, 1990.
  20. Nerrand, O., Roussel-Ragot, P., Personnaz, L., Dreyfus, G., Marcos, S. "Neural networks and nonlinear adaptive filtering: unifying concepts and new algorithms". Neural Computation, Vol. 5, pp. 165–197, 1993.
  21. Pineda, F. "Dynamics and architecture for neural computation in recurrent neural networks". Journal of Complexity, Vol. 4, pp. 216–245, 1988.
  22. Principe, J., de Vries, B., Oliveira, P. "The gamma filter — a new class of adaptive IIR filters with restricted feedback". IEEE Trans. Signal Processing, Vol. 41, pp. 649–656, 1993.
  23. Robinson, A.J. Dynamic Error Propagation Networks. PhD thesis, University of Cambridge, Cambridge, U.K., 1989.
  24. Scarselli, F., Tsoi, A.C. "Universal approximation using feedforward neural networks: a survey of some existing methods, and some results". Neural Networks, to appear.
  25. Siegelmann, H., Horne, B., Giles, L. "Computational capabilities of recurrent NARX neural networks". IEEE Trans. Systems, Man and Cybernetics, Part B, Vol. 27, pp. 208–218, 1997.
  26. Sontag, E. "Neural networks for control". In Essays on Control: Perspectives in the Theory and its Applications, H.L. Trentelman, J.C. Willems, Eds. Birkhäuser, Boston, pp. 339–380, 1993.
  27. Sperduti, A. "Labelling RAAM". Connection Science, Vol. 6, No. 4, pp. 429–459, 1994.
  28. Sperduti, A., Starita, A. "Supervised neural networks for the classification of structures". IEEE Trans. Neural Networks, Vol. 8, pp. 714–735, 1997.
  29. Tsoi, A.C., Back, A.D. "Locally recurrent globally feedforward networks: a critical review of architectures". IEEE Trans. Neural Networks, Vol. 5, No. 2, pp. 229–239, 1994.
  30. Tsoi, A.C. "Application of neural network methodology to the modelling of the yield strength in a steel rolling plate mill". In Advances in Neural Information Processing Systems, Vol. 4, Moody, J., Hanson, S., Lippmann, R., Eds. Morgan Kaufmann Publishers, 1992.
  31. Tsoi, A.C. "Gradient based learning methods". This volume.
  32. Tsoi, A.C., Back, A.D. "Discrete time recurrent neural network architectures: a unifying review". Neurocomputing, Vol. 15, pp. 183–224, 1997.
  33. Waibel, A., Hanazawa, T., Hinton, G., Shikano, K., Lang, K. "Phoneme recognition using time delay neural networks". IEEE Trans. Acoustics, Speech and Signal Processing, Vol. 37, No. 3, pp. 328–339, 1989.
  34. Wan, E. "Temporal backpropagation for FIR neural networks". Proc. Int. Joint Conf. Neural Networks, San Diego, June 1990, pp. 575–580.
  35. Williams, R., Zipser, D. "A learning algorithm for continually running fully recurrent neural networks". Neural Computation, Vol. 1, pp. 270–280, 1989.
  36. Zomaya, A., Mills, P.M., Tade, M.O. Neuro-Adaptive Process Control: A Practical Approach. Wiley, 1996.

Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Ah Chung Tsoi
  1. Faculty of Informatics, University of Wollongong, Wollongong, Australia
