The Effect of Recurrent Networks on Policy Improvement in Polling Systems

  • H. Sato
  • Y. Matsumoto
  • N. Okino
Conference paper


This paper considers polling policies represented by recurrent neural networks and investigates the effect of the feedback weights on optimality. The polling system consists of a single server and multiple stations; whenever the server finishes serving a station, it determines the next station to visit from the network's output for the current system state. Using simulated annealing, we improve the polling policy so that the mean delay of customers is minimized in the steady state. The benefit of recurrent networks is that they can represent a broader class of policies than feedforward networks. Numerical results show that recurrent networks substantially reduce the (sub-)optimal mean delay compared with feedforward networks.
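The optimization loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the network dimensions, the tanh activation, the slotted-time simulator, and the annealing schedule (`steps`, `t0`, `cool`) are all hypothetical choices made for the sketch.

```python
import math
import random

class RecurrentPolicy:
    """Recurrent polling policy: scores each station from the current queue
    lengths plus a hidden state fed back across decisions. Dimensions and
    the tanh activation are illustrative assumptions, not the paper's."""

    def __init__(self, n_stations, n_hidden, rng):
        self.n, self.h = n_stations, n_hidden
        g = lambda: rng.gauss(0.0, 0.5)
        self.W_in = [[g() for _ in range(n_stations)] for _ in range(n_hidden)]
        self.W_rec = [[g() for _ in range(n_hidden)] for _ in range(n_hidden)]
        self.W_out = [[g() for _ in range(n_hidden)] for _ in range(n_stations)]
        self.hidden = [0.0] * n_hidden

    def reset(self):
        self.hidden = [0.0] * self.h

    def next_station(self, queue_lengths):
        # The new hidden state mixes the observed queues with the fed-back
        # state; setting W_rec to zero recovers a plain feedforward policy.
        self.hidden = [
            math.tanh(sum(w * q for w, q in zip(wi, queue_lengths)) +
                      sum(w * s for w, s in zip(wr, self.hidden)))
            for wi, wr in zip(self.W_in, self.W_rec)
        ]
        scores = [sum(w * s for w, s in zip(row, self.hidden)) for row in self.W_out]
        return max(range(self.n), key=lambda i: scores[i])


def mean_queue_length(policy, arrival_probs, horizon, seed=0):
    """Crude slotted-time simulator: Bernoulli arrivals, one service per visit.
    The time-averaged total queue length stands in for mean delay (Little's
    law); a fixed seed gives common random numbers across candidate policies."""
    rng = random.Random(seed)
    q = [0] * len(arrival_probs)
    policy.reset()
    total = 0
    for _ in range(horizon):
        for i, p in enumerate(arrival_probs):
            if rng.random() < p:
                q[i] += 1
        s = policy.next_station(q)
        if q[s] > 0:
            q[s] -= 1
        total += sum(q)
    return total / horizon


def anneal(policy, arrival_probs, steps=120, horizon=500, t0=1.0, cool=0.97, seed=1):
    """Simulated annealing over all weights, the recurrent ones included."""
    rng = random.Random(seed)
    cur = best = mean_queue_length(policy, arrival_probs, horizon)
    temp = t0
    for _ in range(steps):
        mat = rng.choice([policy.W_in, policy.W_rec, policy.W_out])
        i, j = rng.randrange(len(mat)), rng.randrange(len(mat[0]))
        old = mat[i][j]
        mat[i][j] += rng.gauss(0.0, 0.3)   # propose a local move
        cand = mean_queue_length(policy, arrival_probs, horizon)
        if cand < cur or rng.random() < math.exp((cur - cand) / max(temp, 1e-9)):
            cur = cand                      # accept (Metropolis rule)
            best = min(best, cand)
        else:
            mat[i][j] = old                 # reject: restore the weight
        temp *= cool
    return best
```

Evaluating every candidate with the same simulation seed (common random numbers) keeps the comparison between weight vectors from being swamped by simulation noise; the paper's actual estimation procedure may differ.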


Keywords: Hidden Layer · Queue Length · Polling System · Feedforward Network · Recurrent Network




Copyright information

© Springer-Verlag Wien 1998

Authors and Affiliations

  • H. Sato (1)
  • Y. Matsumoto (2)
  • N. Okino (1)
  1. Division of Applied Systems Science, Faculty of Engineering, Kyoto University, Japan
  2. I.T.S., Inc., Japan
