Stochastic Dynamic Information Flow Tracking Game with Reinforcement Learning

  • Dinuka Sahabandu
  • Shana Moothedath
  • Joey Allen
  • Linda Bushnell
  • Wenke Lee
  • Radha Poovendran
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11836)


Advanced Persistent Threats (APTs) are stealthy, sophisticated, and long-term attacks that impose significant economic costs and violate the security of sensitive information. Data and control flow commands arising from APTs introduce new information flows into the targeted computer system. Dynamic Information Flow Tracking (DIFT) is a promising detection mechanism against APTs that taints suspicious input sources in the system and authenticates the tainted flows at certain processes according to a well-defined security policy. Deploying DIFT against APTs in large-scale cyber systems is limited by the heavy resource and performance overhead it imposes on the system. The objective of this paper is to model resource-efficient DIFT that successfully detects APTs. We develop a game-theoretic framework and provide an analytical model of DIFT that enables the study of the trade-off between resource efficiency and the quality of detection in DIFT. Our proposed infinite-horizon, nonzero-sum, stochastic game captures the performance parameters of DIFT, such as false alarms and false negatives, and considers an attacker model in which the APT can relaunch the attack if a previous attempt fails and thereby continuously threaten the system. We assume that some of the performance parameters of DIFT are unknown. We propose a model-free reinforcement learning algorithm that converges to a Nash equilibrium of the discounted stochastic game between APT and DIFT. We execute and evaluate the proposed algorithm on a real-world nation-state attack dataset.
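To make the trade-off between inspection cost and detection quality concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a hypothetical two-state game in which the APT always advances toward a target and relaunches from the entry point after either a detection or a breach, and it evaluates the discounted payoff of DIFT for a fixed stationary inspection policy. The function name `evaluate` and every constant (discount factor, false-negative rate, costs, rewards) are illustrative assumptions rather than values from the paper.

```python
# Toy two-state DIFT-vs-APT game. Every number here is an illustrative
# assumption, not a parameter taken from the paper.
GAMMA = 0.9           # discount factor
FALSE_NEG = 0.2       # probability that an inspection misses the APT
TRAP_COST = 1.0       # resource cost DIFT pays per inspection
DETECT_REWARD = 10.0  # DIFT payoff for catching the APT
BREACH_LOSS = 20.0    # DIFT loss when the APT reaches the target


def evaluate(trap_probs, sweeps=2000):
    """Discounted DIFT value per state under a fixed stationary policy.

    trap_probs[s] is the probability that DIFT inspects the flow at state s.
    State 0 is the entry point; the last state is the final process before
    the target. The APT always advances, and it relaunches from state 0
    after a detection or a breach.
    """
    n = len(trap_probs)
    values = [0.0] * n
    for _ in range(sweeps):  # iterative policy evaluation
        new_values = []
        for s, p in enumerate(trap_probs):
            p_detect = p * (1.0 - FALSE_NEG)
            # Detected: DIFT is rewarded and the APT restarts at state 0.
            caught = DETECT_REWARD + GAMMA * values[0]
            # Missed: the APT advances, or breaches from the last state.
            if s == n - 1:
                missed = -BREACH_LOSS + GAMMA * values[0]
            else:
                missed = GAMMA * values[s + 1]
            new_values.append(
                -TRAP_COST * p + p_detect * caught + (1.0 - p_detect) * missed
            )
        values = new_values
    return values


never_trap = evaluate([0.0, 0.0])
always_trap = evaluate([1.0, 1.0])
```

Comparing `never_trap[0]` with `always_trap[0]` shows how inspection cost and false negatives enter the defender's discounted payoff. A full treatment would also model the attacker's payoff and strategy and would search for an equilibrium rather than evaluate one fixed policy.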


Security of computer systems · Advanced persistent threats · Dynamic Information Flow Tracking · Stochastic games · Reinforcement learning



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Dinuka Sahabandu (1)
  • Shana Moothedath (1)
  • Joey Allen (2)
  • Linda Bushnell (1)
  • Wenke Lee (2)
  • Radha Poovendran (1)

  1. Department of Electrical and Computer Engineering, University of Washington, Seattle, USA
  2. College of Computing, Georgia Institute of Technology, Atlanta, USA
