Generalizing Over Uncertain Dynamics for Online Trajectory Generation

Robotics Research

Part of the book series: Springer Proceedings in Advanced Robotics ((SPAR,volume 3))

Abstract

We present an algorithm that learns an online trajectory generator capable of generalizing over varying and uncertain dynamics. When the dynamics are certain, the algorithm generalizes across model parameters. When the dynamics are partially observable, it generalizes across different observations. To do this, we employ recent advances in supervised imitation learning to learn a trajectory generator from a set of example trajectories computed by a trajectory optimizer. In experiments in two simulated domains, the learned generator finds solutions that are nearly as good as, and sometimes better than, those obtained by calling the trajectory optimizer online. The online execution time is dramatically decreased, and the offline training time is reasonable.
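The chapter body is not reproduced on this page, so the following is only a toy sketch of the general idea the abstract describes: label states with a slow "expert" and fit a fast learner to those labels, aggregating data over rollouts of the learner. It is not the authors' algorithm. Every name and constant here is hypothetical: a mass-scaled PD law stands in for the trajectory optimizer, a 1-nearest-neighbor regressor stands in for the learned generator, and the domain is a 1-D point mass whose mass is the varying model parameter.

```python
import random

DT, HORIZON = 0.05, 60  # hypothetical step size and rollout length

def expert_control(x, v, m):
    """Stand-in for a slow trajectory optimizer: a mass-scaled PD law."""
    return m * (-4.0 * x - 4.0 * v)

def step(x, v, u, m):
    """One Euler step of a point mass with dynamics m * a = u."""
    return x + DT * v, v + DT * (u / m)

class NearestNeighborPolicy:
    """1-nearest-neighbor regressor from features (x, v, m) to control u."""
    def __init__(self):
        self.data = []  # list of ((x, v, m), u) pairs

    def __call__(self, x, v, m):
        q = (x, v, m)
        nearest = min(
            self.data,
            key=lambda d: sum((a - b) ** 2 for a, b in zip(d[0], q)),
        )
        return nearest[1]

def rollout(controller, m, x0):
    """Run a controller for HORIZON steps; return visited states and final |x|."""
    x, v, states = x0, 0.0, []
    for _ in range(HORIZON):
        states.append((x, v))
        x, v = step(x, v, controller(x, v, m), m)
    return states, abs(x)

def train(iterations=4, rollouts_per_iter=5, seed=0):
    """Dataset-aggregation loop: roll out, label with the expert, refit."""
    rng = random.Random(seed)
    policy = NearestNeighborPolicy()
    for it in range(iterations):
        batch = []
        for _ in range(rollouts_per_iter):
            m = rng.uniform(0.5, 2.0)    # sampled model parameter
            x0 = rng.uniform(-1.0, 1.0)  # sampled start position
            # First iteration follows the expert; later ones follow the learner.
            controller = expert_control if it == 0 else policy
            states, _ = rollout(controller, m, x0)
            # States visited by the controller are always labeled by the expert.
            batch.extend(((x, v, m), expert_control(x, v, m)) for x, v in states)
        policy.data.extend(batch)  # "refitting" 1-NN is just adding data
    return policy
```

Calling `policy = train()` and then `rollout(policy, 1.0, 0.8)` runs the learned controller on a mass of 1.0 without querying the expert online. How closely it tracks the expert depends on how densely the aggregated data covers the states it visits.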

Notes

  1. The video of this can be found at: https://www.youtube.com/watch?v=r9o0pUIXV6w.

  2. https://www.youtube.com/watch?v=r9o0pUIXV6w.

Acknowledgements

This work was supported in part by the NSF (grant 1420927). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. We also gratefully acknowledge support from the ONR (grant N00014-14-1-0486), from the AFOSR (grant FA23861014135), and from the ARO (grant W911NF1410433).

Author information

Correspondence to Beomjoon Kim.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Kim, B., Kim, A., Dai, H., Kaelbling, L., Lozano-Perez, T. (2018). Generalizing Over Uncertain Dynamics for Online Trajectory Generation. In: Bicchi, A., Burgard, W. (eds) Robotics Research. Springer Proceedings in Advanced Robotics, vol 3. Springer, Cham. https://doi.org/10.1007/978-3-319-60916-4_3

  • DOI: https://doi.org/10.1007/978-3-319-60916-4_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-60915-7

  • Online ISBN: 978-3-319-60916-4

  • eBook Packages: Engineering, Engineering (R0)