Recurrent Neural Networks

  • Ke-Lin Du
  • M. N. S. Swamy
Chapter

Abstract

Recurrent networks are neural networks with feedback (backward) connections. They are dynamical systems whose internal state evolves over time, and this temporal state representation makes them well suited to temporal processing models and applications. This chapter deals with recurrent network architectures and their learning algorithms.
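As a minimal illustration (not taken from the chapter), the following Python sketch implements the state transition h_t = tanh(W_h h_{t-1} + W_x x_t + b) that defines a simple Elman-style recurrent network; all names, dimensions, and initializations here are illustrative assumptions.

# A minimal sketch of the state update that makes a simple recurrent
# network a discrete-time dynamical system: the hidden state h_t
# summarizes the input history x_1, ..., x_t through the feedback
# (recurrent) weights. All names and dimensions are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 5

W_x = rng.normal(scale=0.5, size=(n_hidden, n_in))      # input-to-hidden weights
W_h = rng.normal(scale=0.5, size=(n_hidden, n_hidden))  # recurrent (feedback) weights
b = np.zeros(n_hidden)

def step(h_prev, x):
    """One state transition: h_t = tanh(W_h h_{t-1} + W_x x_t + b)."""
    return np.tanh(W_h @ h_prev + W_x @ x + b)

# Run the dynamical system over a short random input sequence.
h = np.zeros(n_hidden)
for t, x in enumerate(rng.normal(size=(4, n_in))):
    h = step(h, x)
    print(f"t={t}: h={np.round(h, 3)}")

Because h_t depends on h_{t-1}, the same weights are reused at every time step; training such a network therefore requires unfolding this recursion through time, which is the subject of the learning algorithms discussed in the chapter.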

Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, Concordia University, Montreal, Canada
  2. Xonlink Inc., Hangzhou, China