Dynamic Programming with NAR Model versus Q-learning — Case Study

  • Jarosław Chrobak
  • Andrzej Pacut
Conference paper
Part of the Advances in Soft Computing book series (AINSC, volume 19)


Two approaches to control policy synthesis for unknown systems are investigated. An indirect approach is based on adaptive identification of a neural network model in the NAR form (nonlinear autoregression model), followed by application of dynamic programming to this model. A direct approach consists of Q-learning with the use of a lookup table. Both methods were applied to a stock portfolio optimization problem and tested on Warsaw Stock Exchange data.
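The direct approach mentioned above, tabular Q-learning, can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the environment interface `env_step`, the state/action spaces, and all parameter values are assumptions chosen for clarity.

```python
import random
from collections import defaultdict

def q_learning(env_step, states, actions, episodes=100,
               alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Q-learning sketch (Watkins & Dayan, 1992).

    env_step(s, a) -> (next_state, reward, done) is a hypothetical
    environment interface; all names here are illustrative.
    """
    Q = defaultdict(float)  # lookup table: (state, action) -> value
    for _ in range(episodes):
        s = random.choice(states)  # assumed random restart
        done = False
        while not done:
            # epsilon-greedy action selection
            if random.random() < eps:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda act: Q[(s, act)])
            s2, r, done = env_step(s, a)
            # bootstrap from the best next action; zero at terminal states
            best_next = 0.0 if done else max(Q[(s2, a2)] for a2 in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q
```

In the paper's setting the state would be a discretized description of the portfolio and market, and the reward a measure of investment return; the sketch above only shows the update rule and lookup-table mechanics.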


Keywords: Lookup Table, Investment Policy, Warsaw Stock Exchange, Optimal Control Synthesis, Successive Approximation





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Jarosław Chrobak, Warsaw University of Technology, Warsaw, Poland
  • Andrzej Pacut, Warsaw University of Technology, Warsaw, Poland
