On Using the Theory of Regular Functions to Prove the ε-Optimality of the Continuous Pursuit Learning Automaton

Zhang, Xuan; Granmo, Ole-Christoffer; Oommen, B. John; Jiao, Lei

doi:10.1007/978-3-642-38577-3_27

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7906)

Included in the following conference series:

International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems

Abstract

There are various families of Learning Automata (LA), such as Fixed Structure, Variable Structure, and Discretized LA. Informally, if the environment is stationary, the ε-optimality of an LA is defined as its ability to converge to the optimal action with an arbitrarily large probability, provided that the learning parameter is sufficiently small or sufficiently large, as the scheme requires. Of these LA families, Estimator Algorithms (EAs) are certainly the fastest, and within this family, the Pursuit algorithms are considered to be the pioneering schemes. The existing proofs of the ε-optimality of all the reported EAs follow the same fundamental principles. Recently, it has been reported that these proofs share a common flaw. In other words, this flawed reasoning has been relied upon for almost three decades. The flaw lies in the condition that apparently supports the so-called “monotonicity” property of the probability of selecting the optimal action, as explained in the paper. In this paper, we provide a new method to prove the ε-optimality of the Continuous Pursuit Algorithm (CPA), which was the pioneering EA. The new proof follows the same outline as the previous proofs, but instead of examining the monotonicity property of the action probabilities, it examines their submartingale property and then, unlike the traditional approach, invokes the theory of Regular functions to prove the ε-optimality. We believe that the proof is both unique and pioneering, and that it can form the basis for formally demonstrating the ε-optimality of other EAs.
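The abstract does not spell out the CPA update itself. As a rough illustration only, the sketch below simulates one commonly described form of a continuous pursuit scheme: the automaton keeps a running reward estimate for each action and, at every step, moves its action probability vector a fraction λ towards the unit vector of the action with the highest current estimate. The function name `run_cpa`, the parameters `lam`, `steps` and `init_pulls`, and the initial sampling phase are assumptions made for this example and are not taken from the paper.

```python
import random


def run_cpa(reward_probs, lam=0.01, steps=5000, init_pulls=10, seed=0):
    """Simulate one run of a continuous pursuit automaton (illustrative sketch).

    reward_probs: reward probability of each action in a stationary environment.
    lam: learning parameter; smaller values mean slower but more reliable learning.
    Returns the final action probability vector.
    """
    rng = random.Random(seed)
    r = len(reward_probs)
    p = [1.0 / r] * r          # action probability vector, initially uniform
    rewards = [0] * r          # number of rewards received per action
    pulls = [0] * r            # number of times each action was chosen

    # Assumed initialisation: sample every action a few times to seed the estimates.
    for i in range(r):
        for _ in range(init_pulls):
            rewards[i] += rng.random() < reward_probs[i]
            pulls[i] += 1

    for _ in range(steps):
        # Choose an action according to the current probability vector.
        a = rng.choices(range(r), weights=p)[0]
        rewards[a] += rng.random() < reward_probs[a]
        pulls[a] += 1

        # Pursue the action whose reward estimate is currently the largest.
        estimates = [rewards[i] / pulls[i] for i in range(r)]
        best = max(range(r), key=estimates.__getitem__)
        p = [(1.0 - lam) * pi for pi in p]
        p[best] += lam

    return p


if __name__ == "__main__":
    final_p = run_cpa([0.9, 0.1])
    print([round(x, 4) for x in final_p])
```

In this sketch the probability of the pursued action never decreases within a step, but the pursued action can change whenever the estimates reorder. That is consistent with the abstract's point that the argument should rest on a submartingale property of the optimal action's probability rather than on strict monotonicity.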

Author information

Authors and Affiliations

Dept. of ICT, University of Agder, Grimstad, Norway
Xuan Zhang, Ole-Christoffer Granmo, B. John Oommen & Lei Jiao
School of Computer Science, Carleton University, Ottawa, Canada
B. John Oommen

Editor information

Editors and Affiliations

Department of Computer Science, Texas State University, 78666, San Marcos, TX, USA
Moonis Ali
Agent Systems Research Group, Department of Computer Science, Faculty of Sciences, VU University Amsterdam, De Boelelaan 1081, 1081 HV, Amsterdam, The Netherlands
Tibor Bosse
Interactive Intelligence Group, Department of Intelligent Systems, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, The Netherlands
Koen V. Hindriks & Catholijn M. Jonker
Computational Intelligence Group, Department of Computer Science, Faculty of Sciences, VU University Amsterdam, De Boelelaan 1081, 1081 HV, Amsterdam, The Netherlands
Mark Hoogendoorn
Agent Systems Research Group, Department of Computer Science, Faculty of Sciences, VU University Amsterdam, De Boelelaan 1081, 1081 HV, Amsterdam, The Netherlands
Jan Treur

About this paper

Cite this paper

Zhang, X., Granmo, O.C., Oommen, B.J., Jiao, L. (2013). On Using the Theory of Regular Functions to Prove the ε-Optimality of the Continuous Pursuit Learning Automaton. In: Ali, M., Bosse, T., Hindriks, K.V., Hoogendoorn, M., Jonker, C.M., Treur, J. (eds) Recent Trends in Applied Artificial Intelligence. IEA/AIE 2013. Lecture Notes in Computer Science, vol 7906. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38577-3_27

DOI: https://doi.org/10.1007/978-3-642-38577-3_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38576-6
Online ISBN: 978-3-642-38577-3
eBook Packages: Computer Science, Computer Science (R0)