Abstract
Motion planning under uncertainty is important for reliable robot operation in uncertain and dynamic environments. The Partially Observable Markov Decision Process (POMDP) is a general and systematic framework for motion planning under uncertainty. To cope well with a dynamic environment, we often need to modify the POMDP model during runtime. However, despite recent tremendous advances in POMDP planning, most solvers are not fast enough to generate a good solution when the POMDP model changes during runtime. Recent progress in online POMDP solvers has shown promising results. However, most online solvers are based on replanning, which recomputes a solution from scratch at each step, discarding any solution computed so far and hence wasting valuable computational resources. In this paper, we propose a new online POMDP solver, called Adaptive Belief Tree (ABT), that can reuse and improve its existing solution, updating the solution as needed whenever the POMDP model changes. Given enough time, ABT converges in probability to the optimal solution of the current POMDP model. Preliminary results on three distinct robotics tasks in dynamic environments are promising: in all test scenarios, ABT generates similar or better solutions faster than the fastest online POMDP solver available today, using an average of less than 50 ms of computation time per step.
Vinay Yadav: all work was done while the author was an intern at the University of Queensland.
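The abstract contrasts replanning solvers, which rebuild their search tree from scratch at every step, with ABT's reuse of the tree built so far. The following is a minimal Python sketch of that distinction only; all class and method names (BeliefNode, OnlineSolver, ToyModel, etc.) are hypothetical illustrations, not the paper's actual data structures or algorithm.

```python
# Hypothetical sketch of an online POMDP planning loop with belief-tree
# reuse. The real ABT maintains sampled episodes and repairs them when
# the POMDP model changes; here the rollout machinery is abstracted away.
import random

class BeliefNode:
    """One belief in the search tree; children indexed by (action, observation)."""
    def __init__(self):
        self.children = {}  # (action, observation) -> BeliefNode
        self.q = {}         # action -> running Q-value estimate
        self.n = {}         # action -> visit count

class OnlineSolver:
    def __init__(self, model):
        self.model = model
        self.root = BeliefNode()

    def improve(self, n_episodes):
        """Spend the per-step planning budget sampling episodes from the root."""
        for _ in range(n_episodes):
            a = random.choice(self.model.actions)  # stand-in for UCB1 selection
            r = self.model.sample_reward(a)        # stand-in for a full episode rollout
            n = self.root.n.get(a, 0) + 1          # incremental mean of returns
            self.root.n[a] = n
            self.root.q[a] = self.root.q.get(a, 0.0) + (r - self.root.q.get(a, 0.0)) / n

    def best_action(self):
        return max(self.root.q, key=self.root.q.get)

    def advance(self, action, observation):
        """ABT-style reuse: descend to the child belief reached by (action,
        observation), keeping its subtree and all episodes passing through it,
        instead of discarding the tree and replanning from scratch."""
        self.root = self.root.children.setdefault((action, observation), BeliefNode())

class ToyModel:
    """Trivial stand-in model so the sketch runs end to end."""
    actions = ["left", "right"]
    def sample_reward(self, action):
        return 1.0 if action == "right" else 0.0
    def sample_observation(self, action):
        return random.choice(["near", "far"])

if __name__ == "__main__":
    model = ToyModel()
    solver = OnlineSolver(model)
    for step in range(3):
        solver.improve(n_episodes=100)   # builds on whatever the tree already holds
        a = solver.best_action()
        o = model.sample_observation(a)  # the robot executes a and observes o
        solver.advance(a, o)
```

The key design point the sketch illustrates is that the tree (and its value estimates) survives across steps via advance(); a replanning solver would instead reset the root to a fresh node each iteration.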
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Kurniawati, H., Yadav, V. (2016). An Online POMDP Solver for Uncertainty Planning in Dynamic Environment. In: Inaba, M., Corke, P. (eds) Robotics Research. Springer Tracts in Advanced Robotics, vol 114. Springer, Cham. https://doi.org/10.1007/978-3-319-28872-7_35
DOI: https://doi.org/10.1007/978-3-319-28872-7_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28870-3
Online ISBN: 978-3-319-28872-7
eBook Packages: Engineering; Engineering (R0)