\(\mathsf {QFlip}\): An Adaptive Reinforcement Learning Strategy for the \(\mathsf {FlipIt}\) Security Game

  • Conference paper
Decision and Game Theory for Security (GameSec 2019)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 11836)

Abstract

A rise in Advanced Persistent Threats (APTs) has introduced a need for robustness against long-running, stealthy attacks which circumvent existing cryptographic security guarantees. \(\mathsf {FlipIt}\) is a security game that models attacker-defender interactions in advanced scenarios such as APTs. Previous work extensively analyzed non-adaptive strategies in \(\mathsf {FlipIt}\), but adaptive strategies arise naturally in practical interactions as players receive feedback during the game. We model the \(\mathsf {FlipIt}\) game as a Markov Decision Process and introduce \(\mathsf {QFlip}\), an adaptive strategy for \(\mathsf {FlipIt}\) based on temporal difference reinforcement learning. We prove theoretical results on the convergence of our new strategy against an opponent playing with a Periodic strategy. We confirm our analysis experimentally through extensive evaluation of \(\mathsf {QFlip}\) against specific opponents. \(\mathsf {QFlip}\) converges to the optimal adaptive strategy for Periodic and Exponential opponents using associated state spaces. Finally, we introduce a generalized \(\mathsf {QFlip}\) strategy with composite state space that outperforms a Greedy strategy for several distributions including Periodic and Uniform, without prior knowledge of the opponent’s strategy. We also release an OpenAI Gym environment for \(\mathsf {FlipIt}\) to facilitate future research.
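
To make the idea of a temporal difference learner in a \(\mathsf {FlipIt}\)-like Markov Decision Process concrete, the following is a minimal illustrative sketch, not the \(\mathsf {QFlip}\) algorithm from the paper. It uses tabular Q-learning (one temporal difference method) in a simplified discrete-time game against a Periodic opponent. Everything here is an assumption made for illustration: the function names, the state `(tau, owns)`, the reward of one unit per tick of control minus a flip cost, the hyperparameters, and in particular the simplification that the learner observes both the ticks since the opponent's last flip and the current owner at every tick, which the stealthy \(\mathsf {FlipIt}\) game does not grant. It also does not use the released Gym environment.

```python
import random


def train_q_learning(period=10, cost=4.0, steps=100_000,
                     alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning in a toy, fully observed FlipIt-like game.

    Assumptions (hypothetical, for illustration only):
      * the opponent flips at the start of every `period`-th tick;
      * the state is (tau, owns): ticks since the opponent's last flip
        and whether the learner currently controls the resource;
      * actions are 0 = wait and 1 = flip;
      * reward per tick is 1 if the learner controls the resource,
        minus `cost` if it flipped on that tick.
    """
    rng = random.Random(seed)
    q = {(tau, owns): [0.0, 0.0]
         for tau in range(period) for owns in (False, True)}
    tau, owns = 0, False  # the opponent has just flipped
    for _ in range(steps):
        state = (tau, owns)
        # Epsilon-greedy action selection; ties favour waiting.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = 0 if q[state][0] >= q[state][1] else 1
        if action == 1:
            owns = True
        reward = (1.0 if owns else 0.0) - cost * action
        # Advance time; the Periodic opponent retakes control every period.
        tau = (tau + 1) % period
        if tau == 0:
            owns = False
        next_state = (tau, owns)
        # One-step temporal difference (Q-learning) update.
        td_target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (td_target - q[state][action])
    return q


def greedy_policy(q):
    """Map each state to its greedy action under the learned Q-table."""
    return {s: (0 if v[0] >= v[1] else 1) for s, v in q.items()}


if __name__ == "__main__":
    policy = greedy_policy(train_q_learning())
    print("flip right after the opponent moves:", policy[(0, False)] == 1)
    print("wait while already in control:", policy[(5, True)] == 0)
```

Under these assumptions the environment is deterministic, so the learner should settle on flipping immediately after each opponent move and waiting otherwise. In this toy setting that is the intuitive best response to a Periodic opponent whenever the flip cost is smaller than the period.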


Acknowledgements

We would like to thank Ronald Rivest, Marten van Dijk, Ari Juels, and Sang Chin for discussions about reinforcement learning in \(\mathsf {FlipIt}\). We thank Matthew Jagielski, Tina Eliassi-Rad, and Lucianna Kiffer for discussing the theoretical analysis. This project was funded by NSF under grant CNS-1717634. This research was also sponsored by the U.S. Army Combat Capabilities Development Command Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-13-2-0045 (ARL Cyber Security CRA). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Combat Capabilities Development Command Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

Author information

Corresponding author

Correspondence to Lisa Oakley.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Oakley, L., Oprea, A. (2019). \(\mathsf {QFlip}\): An Adaptive Reinforcement Learning Strategy for the \(\mathsf {FlipIt}\) Security Game. In: Alpcan, T., Vorobeychik, Y., Baras, J., Dán, G. (eds) Decision and Game Theory for Security. GameSec 2019. Lecture Notes in Computer Science, vol 11836. Springer, Cham. https://doi.org/10.1007/978-3-030-32430-8_22

  • DOI: https://doi.org/10.1007/978-3-030-32430-8_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32429-2

  • Online ISBN: 978-3-030-32430-8

  • eBook Packages: Computer Science (R0)
