Two-time scale learning automata: an efficient decision making mechanism for stochastic nonlinear resource allocation
The Stochastic Non-linear Fractional Equality Knapsack (NFEK) problem is a significant resource allocation problem with a wide range of applications, such as web polling under polling constraints and constrained estimation. The NFEK problem is usually solved by trial and error, based on noisy feedback from the environment. Existing solutions to NFEK are based on the traditional family of Reward-Inaction Learning Automata (LA) schemes, in which the action probabilities are updated using only the last feedback. Such an update form seems counterproductive for two reasons: 1) it uses only the last feedback and disregards the rest of the feedback history, and 2) it performs no update whenever the last feedback is not a reward. In this paper, we instead propose a learning solution that exploits the whole history of feedback using the theory of two-time-scale separation. Through comprehensive experiments, we show that the proposed solution is not only superior to the state of the art in terms of peak performance but also robust to the choice of tuning parameters.
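To make the two limitations concrete, the following is a minimal sketch of the classical linear reward-inaction (L_RI) update for a finite action set. It is the textbook form of the scheme, not the NFEK-specific algorithms discussed in the paper; the function names `lri_update` and `run`, the learning rate `rate`, and the Bernoulli reward model are assumptions made for illustration only.

```python
import random


def lri_update(probs, chosen, rewarded, rate=0.05):
    """One classical L_RI step (illustrative sketch, hypothetical names)."""
    if not rewarded:
        # "Inaction": a penalty leaves the probabilities untouched,
        # so this feedback is discarded.
        return list(probs)
    # Reward: move mass toward the chosen action, shrink all others.
    # Only the current feedback is used; no history is kept.
    return [
        p + rate * (1.0 - p) if i == chosen else p * (1.0 - rate)
        for i, p in enumerate(probs)
    ]


def run(reward_probs, steps=5000, rate=0.05, seed=0):
    """Simulate L_RI against assumed Bernoulli reward probabilities."""
    rng = random.Random(seed)
    probs = [1.0 / len(reward_probs)] * len(reward_probs)
    for _ in range(steps):
        chosen = rng.choices(range(len(probs)), weights=probs)[0]
        rewarded = rng.random() < reward_probs[chosen]
        probs = lri_update(probs, chosen, rewarded, rate)
    return probs
```

Calling `lri_update([0.5, 0.5], 0, False)` returns the vector unchanged, which is the second limitation noted above: penalties carry information that the scheme never uses.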
Keywords: Decision making under uncertainty; Continuous learning automata; Two-time scale; Stochastic non-linear fractional equality knapsack; Resource allocation
A very preliminary conference version of this work appeared in IEA/AIE 2017, the 30th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, held in Paris, June 2017. Prof. Tore Jonassen passed away on February 4, 2018, and the authors dedicate this manuscript to his memory.