Abstract
Organizations that collect and use large volumes of personal information often rely on security audits to protect data subjects from inappropriate uses of this information by authorized insiders. In the face of unknown employee incentives, a reasonable audit strategy for the organization is one that minimizes its regret. While regret minimization has been extensively studied in repeated games, the standard notion of regret for repeated games cannot capture the complexity of the interaction between the organization (defender) and an adversary, which arises from the dependence of rewards and actions on history. To account for this generality, we introduce a richer class of games called bounded-memory games, which can provide a more accurate model of the audit process. We introduce the notion of k-adaptive regret, which compares the reward obtained by playing actions prescribed by the algorithm against a hypothetical k-adaptive adversary with the reward obtained by the best expert in hindsight against the same adversary. Roughly, a hypothetical k-adaptive adversary adapts her strategy to the defender's actions exactly as the real adversary would within each window of k rounds. A k-adaptive adversary is a natural model for temporary adversaries (e.g., company employees) who stay for a certain number of audit cycles and are then replaced by a different person. Our definition is parameterized by a set of experts, which can include both fixed and adaptive defender strategies. We investigate the inherent complexity of adaptive regret minimization in bounded-memory games of perfect and imperfect information, and design algorithms for it. We prove a hardness result showing that, with imperfect information, any k-adaptive regret minimizing algorithm (with fixed strategies as experts) must be inefficient unless NP = RP, even when playing against an oblivious adversary.
In contrast, for bounded-memory games of perfect and imperfect information, we present approximate 0-adaptive regret minimization algorithms that run in time \(n^{O\left(1\right)}\) against an oblivious adversary.
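The paper's algorithms are more elaborate, but the 0-adaptive case against an oblivious adversary coincides with classical external regret, where the defender's play is compared to the best fixed expert in hindsight. The sketch below illustrates that baseline with a standard multiplicative-weights (Hedge) learner; the function name, reward encoding, and parameter values are illustrative assumptions, not taken from the paper:

```python
def multiplicative_weights(rewards, eta=0.1):
    """Run the Hedge algorithm over a T x n reward matrix (entries in [0, 1]).

    Each row is one round of rewards for the n experts. Returns the pair
    (expected algorithm reward, external regret), where external regret is
    the best single expert's total reward minus the algorithm's.
    """
    n = len(rewards[0])
    weights = [1.0] * n
    algo_reward = 0.0
    for round_rewards in rewards:
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected reward of sampling an expert from the current distribution.
        algo_reward += sum(p * r for p, r in zip(probs, round_rewards))
        # Multiplicative update: boost experts that did well this round.
        for i, r in enumerate(round_rewards):
            weights[i] *= (1.0 + eta) ** r
    best_expert = max(sum(col) for col in zip(*rewards))
    return algo_reward, best_expert - algo_reward
```

Hedge guarantees external regret of roughly \(\eta T + \ln(n)/\eta\) for rewards in [0, 1], which is sublinear in T for a suitable choice of \(\eta\); the k-adaptive notion in the paper strengthens this by letting the hypothetical adversary react to the defender's moves within each k-round window.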
This work was partially supported by the U.S. Army Research Office contract "Perpetually Available and Secure Information Systems" (DAAD19-02-1-0389) to Carnegie Mellon CyLab, the NSF Science and Technology Center TRUST, the NSF CyberTrust grant "Privacy, Compliance and Information Risk in Complex Organizational Processes," the AFOSR MURI "Collaborative Policies and Assured Information Sharing," and HHS Grant no. HHS 90TR0003/01. Jeremiah Blocki was also partially supported by an NSF Graduate Fellowship. Arunesh Sinha was also partially supported by the CMU CIT Bertucci Fellowship. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of any sponsoring institution, the U.S. government or any other entity.
© 2013 Springer International Publishing Switzerland
Blocki, J., Christin, N., Datta, A., Sinha, A. (2013). Adaptive Regret Minimization in Bounded-Memory Games. In: Das, S.K., Nita-Rotaru, C., Kantarcioglu, M. (eds) Decision and Game Theory for Security. GameSec 2013. Lecture Notes in Computer Science, vol 8252. Springer, Cham. https://doi.org/10.1007/978-3-319-02786-9_5
Print ISBN: 978-3-319-02785-2
Online ISBN: 978-3-319-02786-9