Coordinating Randomized Policies for Increasing Security in Multiagent Systems

  • Praveen Paruchuri
  • Milind Tambe
  • Fernando Ordóñez
  • Sarit Kraus
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4324)


Despite significant recent advances in decision-theoretic frameworks for reasoning about multiagent teams, little attention has been paid to applying such frameworks in adversarial domains, where the agent team may face security threats from other agents. This paper focuses on domains where such threats come from unseen adversaries whose actions or payoffs are unknown. In such domains, action randomization is recognized as a key technique for degrading an adversary's ability to predict and exploit an agent's or agent team's actions. Unfortunately, randomization poses two key challenges. First, it can reduce the expected reward (quality) of the agent team's plans, so we must provide guarantees on such rewards. Second, it causes miscoordination within teams. While communication within an agent team can help alleviate miscoordination, in many real domains communication is unavailable or only scarcely available. To address these challenges, this paper provides the following contributions. First, we recall the Multiagent Constrained MDP (MCMDP) framework, which enables policy generation for a team of agents in which each agent may have limited or no (communication) resources. Second, since randomized policies generated directly for MCMDPs lead to miscoordination, we introduce a transformation algorithm that converts the MCMDP into a transformed MCMDP incorporating explicit communication and no-communication actions. Third, we show that incorporating randomization yields a non-linear program, and that the unavailability or limited availability of communication adds non-convex constraints to that program. Finally, we experimentally illustrate the benefits of our work.
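The intuition behind action randomization can be illustrated with a small sketch (not taken from the paper): a randomized policy assigns probabilities to actions in each state, and its Shannon entropy quantifies how unpredictable it is to an observing adversary. The function names and the patrol-route example below are illustrative assumptions, not the paper's formulation.

```python
import math
import random

def policy_entropy(policy):
    """Shannon entropy (in bits) of a randomized policy for one state.

    policy: dict mapping actions to probabilities summing to 1.
    Higher entropy means the action choice is harder to predict."""
    return -sum(p * math.log2(p) for p in policy.values() if p > 0)

def sample_action(policy, rng=random):
    """Sample an action according to the policy's probabilities."""
    r = rng.random()
    acc = 0.0
    for action, p in policy.items():
        acc += p
        if r < acc:
            return action
    return action  # fallback for floating-point round-off

# A deterministic policy is fully predictable (entropy 0), while a
# uniform policy over two patrol routes attains the maximum of 1 bit.
deterministic = {"route_A": 1.0, "route_B": 0.0}
uniform = {"route_A": 0.5, "route_B": 0.5}
```

Maximizing such an entropy measure subject to a bound on expected reward is what turns the underlying linear program for MDP policies into a non-linear one.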


Keywords: Multiagent Systems · Decision Theory · Security · Randomized Policies





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Praveen Paruchuri (1)
  • Milind Tambe (1)
  • Fernando Ordóñez (1)
  • Sarit Kraus (2)
  1. University of Southern California, Los Angeles
  2. Bar-Ilan University, Ramat-Gan, Israel
