
Autonomous Helicopter Flight Using Reinforcement Learning

Adam Coates, Pieter Abbeel, and Andrew Y. Ng
Encyclopedia of Machine Learning and Data Mining (living reference work entry)

Definition

Helicopter flight is a highly challenging control problem. While it is possible to obtain controllers for simple maneuvers (like hovering) by traditional manual design procedures, this approach is tedious and typically requires many hours of adjustment and flight testing, even for an experienced control engineer. For complex maneuvers, such as aerobatic routines, this approach is likely infeasible. In contrast, reinforcement learning (RL) algorithms enable faster and more automated design of controllers. Model-based RL algorithms have been used successfully for autonomous helicopter flight: for hovering and forward flight, and, in combination with apprenticeship learning methods, for expert-level aerobatics. In model-based RL, one first builds a model of the helicopter dynamics and specifies the task using a reward function. Then, given the model and the reward function, the RL algorithm finds a controller that maximizes the expected sum of rewards accumulated over time.
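As a purely illustrative sketch of this recipe (not the controller design of the systems this entry describes), the Python fragment below carries out the three steps on a toy problem: it fits a linear dynamics model x_{t+1} ~ A x_t + B u_t to logged state/control data by least squares, encodes a hover-style task as a quadratic reward -(x'Qx + u'Ru), and finds the controller u = -Kx that maximizes the expected sum of rewards via the classical LQR Riccati iteration. All dimensions, data, and names here are hypothetical.

import numpy as np

def fit_linear_dynamics(X, U, X_next):
    """Step 1 -- model building: least-squares fit of
    x_{t+1} = A x_t + B u_t from logged flight data.
    X is (T, n) states, U is (T, m) controls, X_next is (T, n)."""
    Z = np.hstack([X, U])                         # regressors [x_t, u_t]
    W, *_ = np.linalg.lstsq(Z, X_next, rcond=None)
    n = X.shape[1]
    return W[:n].T, W[n:].T                       # A (n x n), B (n x m)

def lqr_gain(A, B, Q, R, iters=500):
    """Steps 2+3 -- task specification and controller search: with
    reward -(x'Qx + u'Ru), the reward-maximizing linear controller is
    u_t = -K x_t, obtained by iterating the discrete-time Riccati
    equation to a fixed point."""
    P = Q.copy()
    for _ in range(iters):
        BtP = B.T @ P
        K = np.linalg.solve(R + BtP @ B, BtP @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Toy usage: synthetic data from a random near-identity system stands
# in for real flight logs (hypothetical numbers throughout).
rng = np.random.default_rng(0)
n, m, T = 4, 2, 2000
A_true = np.eye(n) + 0.01 * rng.standard_normal((n, n))
B_true = 0.1 * rng.standard_normal((n, m))
X = rng.standard_normal((T, n))
U = rng.standard_normal((T, m))
X_next = X @ A_true.T + U @ B_true.T + 0.01 * rng.standard_normal((T, n))

A, B = fit_linear_dynamics(X, U, X_next)
K = lqr_gain(A, B, Q=np.eye(n), R=0.1 * np.eye(m))
print("hover feedback gain K:\n", K)

Real helicopter dynamics are nonlinear and stochastic, so a single linear model of this kind is only locally valid (e.g., near hover); the same model-reward-controller pipeline applies with richer learned dynamics models and more general controller-search methods.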

Motivation and...


