Abstract
We study the problem of allocating multiple users to a set of wireless channels in a decentralized manner when the channel qualities are time-varying and unknown to the users, and when access to the same channel by multiple users reduces its quality due to interference. In such a setting the users need to learn not only the inherent channel quality but also, at the same time, the best allocation of users to channels so as to maximize the social welfare. Assuming that the users adopt a certain online learning algorithm, we investigate under what conditions the socially optimal allocation is achievable. In particular, we examine the effect of the different levels of knowledge the users may have and of the amount of communication and cooperation among them. The general conclusion is that as cooperation among users decreases and uncertainty about channel payoffs increases, it becomes harder to achieve the socially optimal allocation.
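To make the notion of a socially optimal allocation concrete, the following is a minimal sketch, not the chapter's algorithm. It assumes the mean channel qualities are already known (in the chapter they must be learned online) and that a user's payoff is the channel's mean quality scaled by a hypothetical interference function of the number of users sharing it. The names `social_welfare`, `socially_optimal`, `equal_share`, and the numeric qualities are illustrative assumptions.

```python
from itertools import product


def social_welfare(allocation, mean_quality, interference):
    """Sum of user payoffs for one allocation.

    allocation[i] is the channel index chosen by user i. Each user's payoff
    is assumed to be mean_quality[channel] * interference(n), where n is the
    number of users on that channel (an illustrative model).
    """
    counts = {}
    for ch in allocation:
        counts[ch] = counts.get(ch, 0) + 1
    return sum(mean_quality[ch] * interference(counts[ch]) for ch in allocation)


def socially_optimal(num_users, mean_quality, interference):
    """Brute-force search over all allocations (only viable for tiny instances)."""
    channels = range(len(mean_quality))
    return max(
        product(channels, repeat=num_users),
        key=lambda a: social_welfare(a, mean_quality, interference),
    )


def equal_share(n):
    """Hypothetical interference model: n users split the channel equally."""
    return 1.0 / n


mean_quality = [0.9, 0.5, 0.4]  # hypothetical mean channel qualities
best = socially_optimal(2, mean_quality, equal_share)
```

Under these assumptions, placing the two users on the two best channels yields more welfare than having both share the single best channel. A decentralized learning scheme would have to reach such an allocation without knowing `mean_quality` in advance.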
Copyright information
© 2012 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Tekin, C., Liu, M. (2012). Performance and Convergence of Multi-user Online Learning. In: Jain, R., Kannan, R. (eds) Game Theory for Networks. GameNets 2011. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 75. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30373-9_23
DOI: https://doi.org/10.1007/978-3-642-30373-9_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30372-2
Online ISBN: 978-3-642-30373-9
eBook Packages: Computer Science (R0)