Reinforcement Learning-Based Spectrum Management for Cognitive Radio Networks: A Literature Review and Case Study

  • Living reference work entry
Handbook of Cognitive Radio

Abstract

In cognitive radio (CR) networks, the cognition cycle, i.e., the ability of wireless transceivers to learn the optimal configuration meeting environmental and application requirements, is considered as important as the hardware components that enable dynamic spectrum access (DSA) capabilities. To this end, several machine learning (ML) techniques have been applied to CR spectrum and network management issues, including spectrum sensing, spectrum selection, and routing. In this paper, we focus on reinforcement learning (RL), an online ML paradigm where an agent discovers the optimal sequence of actions required to perform a task via trial-and-error interactions with the environment. Our study provides both a survey and a proof of concept of RL applications in CR networking. As a survey, we discuss the pros and cons of the RL framework compared to other ML techniques, and we provide an exhaustive review of the RL-CR literature from a twofold perspective, i.e., an application-driven taxonomy and a learning methodology-driven taxonomy. As a proof of concept, we investigate the application of RL techniques to joint spectrum sensing and decision problems, by comparing different algorithms and learning strategies and by further analyzing the impact of information sharing techniques in purely cooperative or mixed cooperative/competitive tasks.

Notes

  1. The classification rule is based on the occurrence of specific keywords in the paper title.

  2. In multi-armed bandit (MAB) theory [55], the regret is defined as the expected difference between the reward sum associated with an optimal strategy and the sum of the collected rewards of the actual strategy.
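As a rough illustration of the regret in note 2, the Python sketch below computes it for a toy channel-selection bandit. Everything in it is an assumption made for illustration and is not taken from the entry: the Bernoulli channel means, the epsilon-greedy learner, and the function names `pseudo_regret` and `epsilon_greedy_pulls` are all hypothetical. It also uses the common pseudo-regret simplification, in which the optimal strategy always plays the arm with the highest mean reward and rewards are replaced by their expectations.

```python
import random


def pseudo_regret(means, pulls):
    """Sum over time of (best arm mean - mean of the arm actually pulled)."""
    best = max(means)
    return sum(best - means[arm] for arm in pulls)


def epsilon_greedy_pulls(means, horizon, epsilon=0.1, seed=0):
    """Simulate a hypothetical epsilon-greedy learner on Bernoulli arms.

    Returns the list of arm indices chosen at each step.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    pulls = []
    for _ in range(horizon):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore a random arm
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        pulls.append(arm)
    return pulls


if __name__ == "__main__":
    # Assumed probabilities that each of three channels is found idle.
    channel_means = [0.2, 0.5, 0.8]
    pulls = epsilon_greedy_pulls(channel_means, horizon=1000)
    print("pseudo-regret after 1000 steps:", pseudo_regret(channel_means, pulls))
```

A strategy that always selects the best channel has zero regret under this definition, so lower values indicate a learner that wastes fewer steps on suboptimal channels.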

References

  1. Akyildiz IF, Lee WY, Vuran MC, Mohanty S (2006) NeXt generation/dynamic spectrum access/cognitive radio wireless networks: a survey. Comput Netw J 50(1):2127–2159

  2. Mitola J (2000) Cognitive radio: an integrated agent architecture for software defined radio. PhD Dissertation, KTH Stockholm

  3. Yucek T, Arslan H (2009) A survey of spectrum sensing algorithms for cognitive radio applications. IEEE Commun Surv Tutor 11(1):116–130

  4. Lee WY, Akyildiz I (2008) Optimal spectrum sensing framework for cognitive radio networks. IEEE Trans Wirel Commun 7(10):3845–3857

  5. Sherman M, Mody AN, Martinez R, Rodriguez C, Reddy R (2008) IEEE standards supporting cognitive radio and networks, dynamic spectrum access, and coexistence. IEEE Commun Mag 46(7):72–79

  6. Flores AB, Guerra RE, Knightly EW (2013) IEEE 802.11af: a standard for TV white space spectrum sharing. IEEE Commun Mag 51(10):92–100

  7. Clancy C, Hecker J, Stuntbeck E, O'Shea T (2007) Applications of machine learning to cognitive radio networks. IEEE Wirel Commun 14(4):47–52

  8. Mitchell T (1997) Machine learning. McGraw Hill, New York

  9. Gavrilovska L, Atanasovski V, Macaluso I, DaSilva L (2013) Learning and reasoning in cognitive radio networks. IEEE Commun Surv Tutor 15(4):1761–1777

  10. Bkassiny M, Li Y, Jayaweera SK (2013) A survey on machine-learning techniques in cognitive radios. IEEE Commun Surv Tutor 15(3):1136–1159

  11. Wang W, Kwasinski A, Niyato D, Han Z (2016) A survey on applications of model-free strategy learning in cognitive wireless networks. IEEE Commun Surv Tutor 18(3):1717–1757

  12. Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. MIT Press, Cambridge

  13. Kaelbling LP, Littman ML, Moore AW (1996) Reinforcement learning: a survey. J Artif Intell Res 4(1):237–285

  14. Busoniu L, Babuska R, De Schutter B (2008) A comprehensive survey of multiagent reinforcement learning. IEEE Trans Syst Man Cybern 38(2):156–171

  15. Busoniu L, Babuska R, De Schutter B (2006) Multi-agent reinforcement learning: a survey. In: Proceedings of IEEE ICARCV, Singapore

  16. Watkins CJ, Dayan P (1992) Technical note: Q-learning. Mach Learn 8(1):279–292

  17. Rummery GA, Niranjan M (1994) Online Q-learning using connectionist systems. Technical Report

  18. Di Felice M, Chowdhury KR, Wu C, Bononi L, Meleis W (2010) Learning-based spectrum selection in cognitive radio ad hoc networks. In: Proceedings of IEEE/IFIP WWIC, Lulea

  19. Yau KLA, Komisarczuk P, Teal PD (2012) Reinforcement learning for context awareness and intelligence in wireless networks: review, new features and open issues. J Netw Comput Appl 35(1):235–267

  20. Yau KLA, Komisarczuk P, Teal PD (2010) Applications of reinforcement learning to cognitive radio networks. In: Proceedings of IEEE ICC, Cape Town

  21. Raza Syed A, Alvin Yau KL, Qadir J, Mohamad H, Ramli N, Loong Keoh S (2016) Route selection for multi-hop cognitive radio networks using reinforcement learning: an experimental study. IEEE Access 4(1):6304–6324

  22. Vucevic N, Akyildiz IF, Romero JP (2010) Cooperation reliability based on reinforcement learning for cognitive radio networks. In: Proceedings of IEEE SDR, Boston

  23. Jiang T, Grace D, Mitchell PD (2011) Efficient exploration in reinforcement learning-based cognitive radio spectrum sharing. IET Commun 5(10):1309–1317

  24. Ozekin E, Demirci FC, Alagoz F (2013) Self-evaluating reinforcement learning based spectrum management for cognitive ad hoc networks. In: Proceedings of IEEE ICOIN, Bangkok

  25. Macaluso I, DaSilva L, Doyle L (2012) Learning Nash equilibria in distributed channel selection for frequency-agile radios. In: Proceedings of IEEE ECAI, Montpellier

  26. Lall S, Sadhu AK, Konar A, Mallik KK, Ghosh S (2016) Multi-agent reinforcement learning for stochastic power management in cognitive radio network. In: Proceedings of IEEE Microcom, Durgapur

  27. Kapetanakis S, Kudenko D (2002) Reinforcement learning of coordination to cooperative multi-agent systems. In: Proceedings of AAAI, Menlo Park

  28. Wahab B, Yang Y, Fan Z, Sooriyabandara M (2009) Reinforcement learning based spectrum-aware routing in multi-hop cognitive radio networks. In: Proceedings of IEEE CROWNCOM, Hannover

  29. Chowdhury K, Wu C, Di Felice M, Meleis W (2010) Spectrum management of cognitive radio using multi-agent reinforcement learning. In: Proceedings of IEEE AAMAS, Toronto

  30. Faganello LR, Kunst R, Both CB (2013) Improving reinforcement learning algorithms for dynamic spectrum allocation in cognitive sensor networks. In: Proceedings of IEEE WCNC, Shanghai

  31. Wu Y, Hu F, Kumar S, Zhu Y, Talari A, Rahnavard N, Matyjas JD (2014) A learning-based QoE-driven spectrum handoff scheme for multimedia transmissions over cognitive radio networks. IEEE J Sel Areas Commun 32(11):2134–2148

  32. Chen X, Zhao Z, Zhang H (2013) Stochastic power adaptation with multiagent reinforcement learning for cognitive wireless mesh networks. IEEE Trans Mob Comput 12(11):2155–2166

  33. Zhou P, Chang Y, Copeland JA (2010) Learning through reinforcement for repeated power control game in cognitive radio networks. In: Proceedings of IEEE Globecom, Miami

  34. Di Felice M, Chowdhury K, Kim W, Kassler A, Bononi L (2011) End-to-end protocols for cognitive radio ad hoc networks: an evaluation study. Perform Eval (Elsevier) 68(9):859–875

  35. Reddy YB (2008) Detecting primary signals for efficient utilization of spectrum using Q-learning. In: Proceedings of IEEE ITNG, Las Vegas

  36. Berthold U, Fu F, Van Der Schaar M, Jondral FK (2008) Detection of spectral resources in cognitive radios using reinforcement learning. In: Proceedings of IEEE DySPAN, pp 1–5

  37. Di Felice M, Chowdhury KR, Kassler A, Bononi L (2011) Adaptive sensing scheduling and spectrum selection in cognitive wireless mesh networks. In: Proceedings of IEEE Flex-BWAN, Maui

  38. Arunthavanathan S, Kandeepan S, Evans RJ (2013) Reinforcement learning based secondary user transmissions in cognitive radio networks. In: Proceedings of IEEE Globecom, Atlanta

  39. Mendes AC, Augusto CHP, da Silva MWR, Guedes RM, de Rezende JF (2011) Channel sensing order for cognitive radio networks using reinforcement learning. In: Proceedings of IEEE LCN, Bonn

  40. Lo BF, Akyildiz IF (2010) Reinforcement learning-based cooperative sensing in cognitive radio ad hoc networks. In: Proceedings of IEEE PIMRC, Istanbul

  41. Lunden J, Kulkarni SR, Koivunen V, Poor HV (2011) Exploiting spatial diversity in multiagent reinforcement learning based spectrum sensing. In: Proceedings of IEEE CAMSAP, San Juan

  42. Lunden J, Kulkarni SR, Koivunen V, Poor HV (2013) Multiagent reinforcement learning based spectrum sensing policies for cognitive radio networks. IEEE J Sel Top Signal Process 7(5):858–868

  43. Jao Y, Feng Z (2010) Centralized channel and power allocation for cognitive radio network: a Q-learning solution. In: Proceedings of IEEE FNMS, Florence

  44. Galindo-Serrano A, Giupponi L, Blasco P, Dohler M (2010) Learning from experts in cognitive radio networks: the docitive paradigm. In: Proceedings of IEEE CROWNCOM, Cannes

  45. Galindo-Serrano A, Giupponi L (2010) Distributed Q-learning for aggregated interference control in cognitive radio networks. IEEE Trans Veh Tech 59(4):1823–1834

  46. Chowdhury KR, Di Felice M, Doost-Mohammady R, Meleis W, Bononi L (2011) Cooperation and communication in cognitive radio networks based on TV spectrum experiments. In: Proceedings of IEEE WoWMoM, Lucca

  47. Emre M, Gur G, Bayhan S, Alagoz F (2015) CooperativeQ: energy-efficient channel access based on cooperative reinforcement learning. In: Proceedings of IEEE ICCW, London

  48. Saad H, Mohamed A, ElBatt T (2012) Distributed cooperative Q-learning for power allocation in cognitive femtocell networks. In: Proceedings of IEEE VTC-Fall, Quebec City

  49. Venkatraman P, Hamdaoui B, Guizani M (2010) Opportunistic bandwidth sharing through reinforcement learning. IEEE Trans Veh Tech 59(6):3148–3153

  50. Bernardo F, Agusti R, Perez-Romero J, Sallent O (2010) Distributed spectrum management based on reinforcement learning. In: Proceedings of IEEE CROWNCOM, Hannover

  51. Yau KLA, Komisarczuk P, Teal PD (2010) Context-awareness and intelligence in distributed cognitive radio networks: a reinforcement learning approach. In: Proceedings of IEEE AusCTW, Canberra

  52. Yau KLA, Komisarczuk P, Teal PD (2010) Enhancing network performance in distributed cognitive radio networks using single-agent and multi-agent reinforcement learning. In: Proceedings of IEEE LCN, Denver

  53. Yau KLA, Komisarczuk P, Teal PD (2010) Achieving context awareness and intelligence in distributed cognitive radio networks: a payoff propagation approach. In: Proceedings of IEEE WAINA, Singapore

  54. Kakalou I, Papadimitriou GI, Nicopolitidis P, Sarigiannidis PG, Obaidat MS (2015) A reinforcement learning-based cognitive MAC protocol. In: Proceedings of IEEE ICC, London

  55. Agrawal R (1995) Sample mean based index policies with O(log n) regret for the multi-armed bandit problem. Adv Appl Prob 27(1):1054–1078

  56. Robert C, Moy C, Wang CX (2014) Reinforcement learning approaches and evaluation criteria for opportunistic spectrum access. In: Proceedings of IEEE ICC, Sydney

  57. Jouini W, Di Felice M, Bononi L, Moy C (2012) Coordination and collaboration in secondary networks: a multi-armed bandit based framework. In: Technical Report. Available at: https://arxiv.org/abs/1204.3005

  58. Li H (2010) Multi-agent Q-learning for competitive spectrum access in cognitive radio systems. In: Proceedings of IEEE SDR, Boston

  59. Alsarhan A, Agarwal A (2010) Resource adaptations for revenue optimization in cognitive mesh network using reinforcement learning. In: Proceedings of IEEE GLOBECOM, Miami

  60. Teng Y, Zhang Y, Niu F, Dai C, Song M (2010) Reinforcement learning based auction algorithm for dynamic spectrum access in cognitive radio networks. In: Proceedings of IEEE VTC Fall, Ottawa

  61. Cesana M, Cuomo F, Ekici E (2011) Routing in cognitive radio networks: challenges and solutions. Ad Hoc Netw (Elsevier) 9(3):228–248

  62. Chowdhury KR, Di Felice M (2009) SEARCH: a routing protocol for mobile cognitive radio ad-hoc networks. Comput Commun (Elsevier) 32(18):1983–1997

  63. Littman M, Boyan J (1994) Packet routing in dynamically changing networks: a reinforcement learning approach. Adv Neural Inform Process Syst 7(1):671–678

  64. Chetret D, Tham C, Wong L (2004) Reinforcement learning and CMAC-based adaptive routing for MANETs. In: Proceedings of IEEE ICON, Singapore

  65. Al-Rawi AHA, Alvin Yau KL, Mohamad H, Ramli N, Hashim W (2014) A reinforcement learning-based routing scheme for cognitive radio ad hoc networks. In: Proceedings of IEEE WMNC, Vilamoura

  66. Zheng K, Li H, Qiu RC, Gong S (2012) Multi-objective reinforcement learning based routing in cognitive radio networks: walking in a random maze. In: Proceedings of IEEE ICNC, Maui

  67. Safdar T, Hasbullah HB, Rehan M (2015) Effect of reinforcement learning on routing of cognitive radio ad hoc networks. In: Proceedings of IEEE ISMSC, Ipoh

  68. Pourpeighhambar B, Dehghan M, Sabaei M (2017) Non-cooperative reinforcement learning based routing in cognitive radio networks. Comput Commun (Elsevier) 106(1):11–23

  69. Dowling J, Curran E, Cunningham R, Cahill V (2005) Using feedback in collaborative reinforcement learning to adaptively optimize MANET routing. IEEE Trans Syst Man Cybern 35(3):360–372

  70. Macaluso I, Finn D, Ozgul B, DaSilva LA (2013) Complexity of spectrum activity and benefits of reinforcement learning for dynamic channel selection. IEEE J Sel Areas Commun 31(11):2237–2246

  71. Ren Y, Dmochowski P, Komisarczuk P (2010) Analysis and implementation of reinforcement learning on a GNU radio cognitive radio platform. In: Proceedings of IEEE CROWNCOM, Cannes

  72. Moy C, Nafkha A, Naoues M (2015) Reinforcement learning demonstrator for opportunistic spectrum access on real radio signals. In: Proceedings of IEEE DySPAN, Stockholm

  73. Dayan P, Niv Y (2008) Reinforcement learning: the good, the bad and the ugly. Curr Opin Neurobiol 18(1):1–12

  74. Naparstek O, Cohen K (2017) Deep multi-user reinforcement learning for distributed dynamic spectrum access. In: CoRR abs/1704.02613

  75. Ferreira VP, Paffenroth R, Wyglinski RMA, Hackett MT, Bilen GS, Reinhart CR, Mortensen JD (2017) Multi-objective reinforcement learning-based deep neural networks for cognitive space communications. In: Proceedings of IEEE CCAA, Cleveland

Author information

Correspondence to Marco Di Felice.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this entry

Cite this entry

Di Felice, M., Bedogni, L., Bononi, L. (2018). Reinforcement Learning-Based Spectrum Management for Cognitive Radio Networks: A Literature Review and Case Study. In: Zhang, W. (eds) Handbook of Cognitive Radio. Springer, Singapore. https://doi.org/10.1007/978-981-10-1389-8_58-1

  • DOI: https://doi.org/10.1007/978-981-10-1389-8_58-1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-1389-8

  • Online ISBN: 978-981-10-1389-8

  • eBook Packages: Springer Reference Engineering, Reference Module Computer Science and Engineering
