Abstract
Agents or agent teams deployed to assist humans often face the challenge of monitoring the state of key processes in their environment (including the state of their human users) and making periodic decisions based on that monitoring. POMDPs appear well suited to these challenges, given the uncertain environment and the cost of actions, but optimal policy generation for POMDPs is computationally expensive. This paper introduces two key implementation techniques (one exact and one approximate) that speed up POMDP policy generation by exploiting the notion of progress, or dynamics, in personal assistant domains and the density of policy vectors. Policy computation is restricted to the belief space polytope that remains reachable given the progress structure of a domain. The first technique applies Lagrangian methods to compute a bounded belief space support in polynomial time, and the second approximates policy vectors in the bounded belief polytope. We illustrate these techniques by enhancing two of the fastest existing algorithms for exact POMDP policy generation. The resulting order-of-magnitude speedups demonstrate the utility of our implementation techniques in facilitating the deployment of POMDPs within agents assisting human users.
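The idea of a bounded, reachable belief region can be illustrated with a minimal Python sketch. It is not the paper's Lagrangian method: it swaps in plain vertex enumeration, which works for a single step because the updated belief in a state is a linear-fractional function of the prior belief and therefore attains its maximum at a vertex of the belief simplex. The function names, the three-state "progress" model, and all probabilities below are hypothetical and chosen only for illustration.

```python
from typing import List

Matrix = List[List[float]]


def belief_update(b: List[float], T_a: Matrix, O_ao: List[float]) -> List[float]:
    """Standard Bayes belief update for a fixed action a and observation o.

    T_a[s][s2] = P(s2 | s, a); O_ao[s2] = P(o | s2, a).
    """
    n = len(b)
    unnorm = [
        O_ao[s2] * sum(T_a[s][s2] * b[s] for s in range(n)) for s2 in range(n)
    ]
    z = sum(unnorm)
    if z == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / z for u in unnorm]


def max_next_belief(T_a: Matrix, O_ao: List[float]) -> List[float]:
    """Upper bound on b'(s2) after (a, o), over every prior belief.

    Assumption: one-step bound by vertex enumeration, not the paper's
    Lagrangian computation. Each prior vertex s gives the ratio
    O(s2) T(s, s2) / sum_x O(x) T(s, x); the bound is the largest ratio.
    """
    n = len(T_a)
    bounds = [0.0] * n
    for s in range(n):
        d = sum(O_ao[x] * T_a[s][x] for x in range(n))
        if d == 0.0:
            continue  # observation o is impossible when starting from s
        for s2 in range(n):
            bounds[s2] = max(bounds[s2], O_ao[s2] * T_a[s][s2] / d)
    return bounds


# Hypothetical progress structure: a process can only stay put or advance.
T_a = [[0.5, 0.5, 0.0],
       [0.0, 0.5, 0.5],
       [0.0, 0.0, 1.0]]
O_ao = [0.8, 0.5, 0.1]

bounds = max_next_belief(T_a, O_ao)
next_belief = belief_update([1 / 3, 1 / 3, 1 / 3], T_a, O_ao)
print(bounds)
print(next_belief)
```

In this toy model the bound for the first state is strictly below 1, so any belief that places more mass than that on the first state cannot be reached after this action and observation. A policy generator that only evaluates vectors over the bounded region can skip that part of the simplex, which is the kind of restriction the abstract describes.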
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Varakantham, P., Maheswaran, R., Tambe, M. (2006). Implementation Techniques for Solving POMDPs in Personal Assistant Agents. In: Bordini, R.H., Dastani, M.M., Dix, J., El Fallah Seghrouchni, A. (eds) Programming Multi-Agent Systems. ProMAS 2005. Lecture Notes in Computer Science, vol 3862. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11678823_5
DOI: https://doi.org/10.1007/11678823_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32616-8
Online ISBN: 978-3-540-32617-5
eBook Packages: Computer Science (R0)