Bayesian Method-Based Learning Automata for Two-Player Stochastic Games with Incomplete Information

  • Hua Ding
  • Chong Di
  • Li Shenghong
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 517)


In artificial intelligence, a learning automaton (LA) is a self-adaptive decision-maker that plays an important role in reinforcement learning (RL). Games of learning automata are stochastic games with incomplete information and are widely used. Traditional learning automata schemes used in games are parameter-based: they rely on a tunable parameter (the step size) that must be adjusted for each environment. In this paper, we propose Bayesian method-based parameter-free learning automata (BPFLA) for two-player stochastic games with incomplete information. The parameter-free property means that a single set of parameters in the scheme applies universally to all game configurations. Simulation results further demonstrate that BPFLA converges much faster than traditional schemes used in games of learning automata, with equal or higher accuracy.
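To make the contrast in the abstract concrete, the following Python sketch sets a classical step-size update next to a Bayesian alternative. It is an illustration under stated assumptions and not the BPFLA scheme itself, which the abstract does not specify. The names `lri_update`, `BetaEstimate`, and `play` are hypothetical, the linear reward-inaction rule stands in for "traditional schemes", and the uniform Beta prior with posterior sampling is a generic Bayesian choice made here for illustration.

```python
import random


def lri_update(probs, action, rewarded, step):
    """Classical linear reward-inaction update; `step` is the tunable step size."""
    if not rewarded:
        return list(probs)  # inaction: a penalty leaves the probabilities unchanged
    return [p + step * (1.0 - p) if i == action else p * (1.0 - step)
            for i, p in enumerate(probs)]


class BetaEstimate:
    """Beta(a, b) posterior over one action's unknown reward probability."""

    def __init__(self):
        self.a, self.b = 1.0, 1.0  # uniform prior (an assumption of this sketch)

    def update(self, rewarded):
        if rewarded:
            self.a += 1.0
        else:
            self.b += 1.0

    def pulls(self):
        return int(self.a + self.b - 2.0)


def play(reward_probs, rounds, seed=0):
    """Two players, two actions each; reward_probs[i][j] = (p1, p2).

    Each player observes only its own action and its own binary reward,
    which models the incomplete-information setting.
    """
    rng = random.Random(seed)
    players = [[BetaEstimate(), BetaEstimate()] for _ in range(2)]
    for _ in range(rounds):
        choice = []
        for est in players:
            # Sample one value per action from its posterior, pick the largest.
            draws = [rng.betavariate(e.a, e.b) for e in est]
            choice.append(draws.index(max(draws)))
        probs = reward_probs[choice[0]][choice[1]]
        for k in range(2):
            players[k][choice[k]].update(rng.random() < probs[k])
    return players


# Hypothetical game in which action 0 is better for both players.
game = [[(0.9, 0.9), (0.9, 0.1)],
        [(0.1, 0.9), (0.1, 0.1)]]
players = play(game, rounds=2000, seed=1)
for k, est in enumerate(players):
    print(k, [e.pulls() for e in est])
```

In this sketch the step-size rule needs `step` chosen per environment, whereas the posterior-based players carry no such setting beyond the fixed prior.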


Keywords: Games of learning automata; Learning automata; Reinforcement learning; Bayesian inference



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. School of Biomedical Engineering, Dalian University of Technology, Dalian, China
  2. School of Cyber Security, Shanghai Jiao Tong University, Shanghai, China
