Neural Networks with Online Sequential Learning Ability for a Reinforcement Learning Algorithm

  • Hitesh Shah
  • Madan Gopal
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 27)


Reinforcement learning (RL) algorithms that employ neural networks as function approximators have proven to be powerful tools for solving optimal control problems. However, neural network function approximators suffer from a number of problems: learning becomes difficult when the training data are given sequentially, structural parameters are difficult to determine, and training usually results in local minima or overfitting. In this paper, a novel online sequential learning evolving neural network model design for RL is proposed. We explore the use of the minimal resource allocation neural network (mRAN) and develop an mRAN function approximation approach to RL systems. The potential of this approach is demonstrated through a case study. The mean square error accuracy, computational cost, and robustness properties of this scheme are compared with those of static structure neural networks.


Reinforcement learning, online sequential learning, resource allocation neural network





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Electronics and Communication Engineering, G H Patel College of Engineering and Technology, Gujarat, India
  2. School of Engineering, Shiv Nadar University, Greater Noida, India
