
Autonomous Helicopter Flight Using Reinforcement Learning

Adam Coates, Pieter Abbeel, and Andrew Y. Ng
Encyclopedia of Machine Learning and Data Mining (living reference work entry)

Definition

Helicopter flight is a highly challenging control problem. While it is possible to obtain controllers for simple maneuvers (like hovering) by traditional manual design procedures, this approach is tedious and typically requires many hours of adjustment and flight testing, even for an experienced control engineer. For complex maneuvers, such as aerobatic routines, this approach is likely infeasible. In contrast, reinforcement learning (RL) algorithms enable faster and more automated design of controllers. Model-based RL algorithms have been used successfully for autonomous helicopter flight: for hovering and forward flight, and, in combination with apprenticeship learning methods, for expert-level aerobatics. In model-based RL, one first builds a model of the helicopter dynamics and specifies the task using a reward function. Then, given the model and the reward function, the RL algorithm finds a controller that maximizes the expected sum of rewards accumulated over time.
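As a purely illustrative sketch of this recipe (not the controller design of the systems this entry describes), the Python fragment below carries out the three steps on a toy problem: it fits a linear dynamics model x_{t+1} ~ A x_t + B u_t to logged state/control data by least squares, encodes a hover-style task as a quadratic reward -(x'Qx + u'Ru), and finds the controller u = -Kx that maximizes the expected sum of rewards via the classical LQR Riccati iteration. All dimensions, data, and names here are hypothetical.

import numpy as np

def fit_linear_dynamics(X, U, X_next):
    """Step 1 -- model building: least-squares fit of
    x_{t+1} = A x_t + B u_t from logged flight data.
    X is (T, n) states, U is (T, m) controls, X_next is (T, n)."""
    Z = np.hstack([X, U])                         # regressors [x_t, u_t]
    W, *_ = np.linalg.lstsq(Z, X_next, rcond=None)
    n = X.shape[1]
    return W[:n].T, W[n:].T                       # A (n x n), B (n x m)

def lqr_gain(A, B, Q, R, iters=500):
    """Steps 2+3 -- task specification and controller search: with
    reward -(x'Qx + u'Ru), the reward-maximizing linear controller is
    u_t = -K x_t, obtained by iterating the discrete-time Riccati
    equation to a fixed point."""
    P = Q.copy()
    for _ in range(iters):
        BtP = B.T @ P
        K = np.linalg.solve(R + BtP @ B, BtP @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Toy usage: synthetic data from a random near-identity system stands
# in for real flight logs (hypothetical numbers throughout).
rng = np.random.default_rng(0)
n, m, T = 4, 2, 2000
A_true = np.eye(n) + 0.01 * rng.standard_normal((n, n))
B_true = 0.1 * rng.standard_normal((n, m))
X = rng.standard_normal((T, n))
U = rng.standard_normal((T, m))
X_next = X @ A_true.T + U @ B_true.T + 0.01 * rng.standard_normal((T, n))

A, B = fit_linear_dynamics(X, U, X_next)
K = lqr_gain(A, B, Q=np.eye(n), R=0.1 * np.eye(m))
print("hover feedback gain K:\n", K)

Real helicopter dynamics are nonlinear and stochastic, so a single linear model of this kind is only locally valid (e.g., near hover); the same model-reward-controller pipeline applies with richer learned dynamics models and more general controller-search methods.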

Motivation and...


