Generalizing Over Uncertain Dynamics for Online Trajectory Generation

Kim, Beomjoon; Kim, Albert; Dai, Hongkai; Kaelbling, Leslie; Lozano-Perez, Tomas

doi:10.1007/978-3-319-60916-4_3

Part of the book series: Springer Proceedings in Advanced Robotics (SPAR, volume 3)

Abstract

We present an algorithm that learns an online trajectory generator that can generalize over varying and uncertain dynamics. When the dynamics is certain, the algorithm generalizes across model parameters. When the dynamics is partially observable, the algorithm generalizes across different observations. To do this, we employ recent advances in supervised imitation learning to learn a trajectory generator from a set of example trajectories computed by a trajectory optimizer. In experiments in two simulated domains, the learned generator finds solutions that are nearly as good as, and sometimes better than, those obtained by calling the trajectory optimizer online. The online execution time is dramatically decreased, and the offline training time is reasonable.
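
The general idea of learning a trajectory generator from optimizer-produced examples can be illustrated with a toy sketch. The code below is not the chapter's algorithm. It assumes a 1-D point mass whose mass varies between rollouts, uses a hand-tuned feedback law (`expert_control`) as a stand-in for a real trajectory optimizer, and uses a simple nearest-neighbor regressor with a data-aggregation loop as the imitation learner. All names and parameter values are hypothetical.

```python
import random


def expert_control(x, v, mass):
    """Stand-in for a trajectory optimizer (an assumption of this sketch):
    a feedback law that scales with the known mass."""
    return mass * (-4.0 * x - 4.0 * v)


def step(x, v, u, mass, dt=0.05):
    """One Euler step of a 1-D point mass: acceleration = force / mass."""
    return x + dt * v, v + dt * (u / mass)


class NearestNeighborPolicy:
    """1-nearest-neighbor regressor from (x, v, mass) to a control."""

    def __init__(self):
        self.inputs, self.controls = [], []

    def add(self, features, control):
        self.inputs.append(features)
        self.controls.append(control)

    def predict(self, features):
        def sq_dist(i):
            return sum((a - b) ** 2 for a, b in zip(self.inputs[i], features))
        return self.controls[min(range(len(self.inputs)), key=sq_dist)]


def rollout(controller, mass, x0, horizon=40):
    """Run a controller from (x0, 0); return visited states and final state."""
    x, v, states = x0, 0.0, []
    for _ in range(horizon):
        states.append((x, v))
        x, v = step(x, v, controller(x, v, mass), mass)
    return states, (x, v)


def train(num_iterations=3, rollouts_per_iteration=5, seed=0):
    """Aggregate expert labels on states visited by the current learner."""
    rng = random.Random(seed)
    learner = NearestNeighborPolicy()
    for iteration in range(num_iterations):
        for _ in range(rollouts_per_iteration):
            mass = rng.uniform(0.5, 2.0)   # the varying model parameter
            x0 = rng.uniform(-1.0, 1.0)
            if iteration == 0:
                controller = expert_control  # bootstrap from the expert
            else:
                controller = lambda x, v, m: learner.predict((x, v, m))
            states, _ = rollout(controller, mass, x0)
            # Label every visited state with the expert's control.
            for x, v in states:
                learner.add((x, v, mass), expert_control(x, v, mass))
    return learner
```

Because the mass is part of the learner's input, a single learned policy covers a range of model parameters, and at execution time it answers with a lookup instead of a fresh optimization. In the partially observable case described above, the mass would be replaced by observations, which this sketch does not attempt.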

Notes

1. The video of this can be found at: https://www.youtube.com/watch?v=r9o0pUIXV6w.
2. https://www.youtube.com/watch?v=r9o0pUIXV6w.

Acknowledgements

This work was supported in part by the NSF (grant 1420927). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. We also gratefully acknowledge support from the ONR (grant N00014-14-1-0486), from the AFOSR (grant FA23861014135), and from the ARO (grant W911NF1410433).

Author information

Authors and Affiliations

MIT, Cambridge, MA, USA
Beomjoon Kim, Albert Kim, Hongkai Dai, Leslie Kaelbling & Tomas Lozano-Perez

Corresponding author

Correspondence to Beomjoon Kim.

Editor information

Editors and Affiliations

Istituto Italiano di Tecnologia, Genova, Italy; University of Pisa, Pisa, Italy
Antonio Bicchi
Inst. für Informatik, Albert-Ludwigs-Universität Freiburg, Freiburg, Germany
Wolfram Burgard

About this chapter

Cite this chapter

Kim, B., Kim, A., Dai, H., Kaelbling, L., Lozano-Perez, T. (2018). Generalizing Over Uncertain Dynamics for Online Trajectory Generation. In: Bicchi, A., Burgard, W. (eds) Robotics Research. Springer Proceedings in Advanced Robotics, vol 3. Springer, Cham. https://doi.org/10.1007/978-3-319-60916-4_3

DOI: https://doi.org/10.1007/978-3-319-60916-4_3
Published: 25 July 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60915-7
Online ISBN: 978-3-319-60916-4
eBook Packages: Engineering, Engineering (R0)
