Abstract
Chapter 5 introduced the algorithms used in this book for solving ad hoc teamwork problems. Before moving on to the empirical analysis of these algorithms in Chap. 7, it is useful to first investigate the theoretical attributes of PLASTIC. Our analysis focuses on whether the multi-armed bandit domain described in Sect. 3.2.1 is tractable for PLASTIC–Model. We chose to analyze the bandit domain because of its simplicity, which lends itself to a more complete theoretical analysis. In addition, the bandit domain is interesting due to its use of communication, an important aspect of ad hoc teamwork that is not explored in the other domains. Note that we do not investigate the model learning aspect of PLASTIC–Model. Instead, we analyze whether PLASTIC–Model can select from a set of known models (from \(\text {HandCodedKnowledge}\)) and plan its response to these models in polynomial time.
This chapter contains material from [1]. All work presented in this chapter is joint work with Noa Agmon, Noam Hazon, Sarit Kraus, and my advisor, Peter Stone.
References
Barrett, Samuel, Noa Agmon, Noam Hazon, Sarit Kraus, and Peter Stone. 2014. Communicating with unknown teammates. In Proceedings of the twenty-first european conference on artificial intelligence, Aug 2014.
Sutton, Richard S., and Andrew G. Barto. 1998. Reinforcement learning: An introduction. Cambridge: MIT Press.
Mayo-Wilson, Conor, Kevin Zollman, and David Danks. 2012. Wisdom of crowds versus groupthink: learning in groups and in isolation. International Journal of Game Theory 42: 695–723.
Barrett, Samuel, Peter Stone, Sarit Kraus, and Avi Rosenfeld. 2013. Teamwork with limited knowledge of teammates. In Proceedings of the twenty-seventh conference on artificial intelligence (AAAI), July 2013.
Rosenfeld, Avi, Inon Zuckerman, Amos Azaria, and Sarit Kraus. 2012. Combining psychological models with machine learning to better predict people’s decisions. Synthese 189: 81–93.
Auer, Peter, Nicolò Cesa-Bianchi, and Paul Fischer. 2002. Finite-time analysis of the multiarmed bandit problem. Machine Learning (MLJ) 47: 235–256.
Hsu, David, Wee Sun Lee, and Nan Rong. 2007. What makes some POMDP problems easy to approximate? In Advances in Neural Information Processing Systems 20 (NIPS).
Kurniawati, Hanna, David Hsu, and Wee Sun Lee. 2008. SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces. In Proceedings of robotics: Science and systems.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Barrett, S. (2015). Theoretical Analysis of PLASTIC. In: Making Friends on the Fly: Advances in Ad Hoc Teamwork. Studies in Computational Intelligence, vol 603. Springer, Cham. https://doi.org/10.1007/978-3-319-18069-4_6
DOI: https://doi.org/10.1007/978-3-319-18069-4_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18068-7
Online ISBN: 978-3-319-18069-4
eBook Packages: Engineering, Engineering (R0)