Abstract
Motion planning under uncertainty is important for reliable robot operation in uncertain and dynamic environments. The Partially Observable Markov Decision Process (POMDP) is a general and systematic framework for motion planning under uncertainty. To cope well with a dynamic environment, we often need to modify the POMDP model during runtime. However, despite recent tremendous advances in POMDP planning, most solvers are not fast enough to generate a good solution when the POMDP model changes during runtime. Recent progress in online POMDP solvers has shown promising results. However, most online solvers are based on replanning, which recomputes a solution from scratch at each step, discarding any solution computed so far and hence wasting valuable computational resources. In this paper, we propose a new online POMDP solver, called Adaptive Belief Tree (ABT), that can reuse and improve its existing solution, updating the solution as needed whenever the POMDP model changes. Given enough time, ABT converges in probability to the optimal solution of the current POMDP model. Preliminary results on three distinct robotics tasks in dynamic environments are promising: in all test scenarios, ABT generates similar or better solutions faster than the fastest online POMDP solver available today, using an average of less than 50 ms of computation time per step.
Vinay Yadav: all work was done while the author was an intern at the University of Queensland.
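The abstract contrasts replanning solvers, which rebuild their search tree from scratch at every step, with ABT's reuse of the tree built so far. The following is a minimal Python sketch of that distinction only; all class and method names (BeliefNode, OnlineSolver, ToyModel, etc.) are hypothetical illustrations, not the paper's actual data structures or algorithm.

```python
# Hypothetical sketch of an online POMDP planning loop with belief-tree
# reuse. The real ABT maintains sampled episodes and repairs them when
# the POMDP model changes; here the rollout machinery is abstracted away.
import random

class BeliefNode:
    """One belief in the search tree; children indexed by (action, observation)."""
    def __init__(self):
        self.children = {}  # (action, observation) -> BeliefNode
        self.q = {}         # action -> running Q-value estimate
        self.n = {}         # action -> visit count

class OnlineSolver:
    def __init__(self, model):
        self.model = model
        self.root = BeliefNode()

    def improve(self, n_episodes):
        """Spend the per-step planning budget sampling episodes from the root."""
        for _ in range(n_episodes):
            a = random.choice(self.model.actions)  # stand-in for UCB1 selection
            r = self.model.sample_reward(a)        # stand-in for a full episode rollout
            n = self.root.n.get(a, 0) + 1          # incremental mean of returns
            self.root.n[a] = n
            self.root.q[a] = self.root.q.get(a, 0.0) + (r - self.root.q.get(a, 0.0)) / n

    def best_action(self):
        return max(self.root.q, key=self.root.q.get)

    def advance(self, action, observation):
        """ABT-style reuse: descend to the child belief reached by (action,
        observation), keeping its subtree and all episodes passing through it,
        instead of discarding the tree and replanning from scratch."""
        self.root = self.root.children.setdefault((action, observation), BeliefNode())

class ToyModel:
    """Trivial stand-in model so the sketch runs end to end."""
    actions = ["left", "right"]
    def sample_reward(self, action):
        return 1.0 if action == "right" else 0.0
    def sample_observation(self, action):
        return random.choice(["near", "far"])

if __name__ == "__main__":
    model = ToyModel()
    solver = OnlineSolver(model)
    for step in range(3):
        solver.improve(n_episodes=100)   # builds on whatever the tree already holds
        a = solver.best_action()
        o = model.sample_observation(a)  # the robot executes a and observes o
        solver.advance(a, o)
```

The key design point the sketch illustrates is that the tree (and its value estimates) survives across steps via advance(); a replanning solver would instead reset the root to a fresh node each iteration.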
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Kurniawati, H., Yadav, V. (2016). An Online POMDP Solver for Uncertainty Planning in Dynamic Environment. In: Inaba, M., Corke, P. (eds) Robotics Research. Springer Tracts in Advanced Robotics, vol 114. Springer, Cham. https://doi.org/10.1007/978-3-319-28872-7_35
DOI: https://doi.org/10.1007/978-3-319-28872-7_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28870-3
Online ISBN: 978-3-319-28872-7
eBook Packages: Engineering; Engineering (R0)