
Dyna-Q Algorithm for Path Planning of Quadrotor UAVs

  • Conference paper
  • First Online:

Methods and Applications for Modeling and Simulation of Complex Systems (AsiaSim 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 946)

Abstract

In this paper, the problem of path planning of quadrotor unmanned aerial vehicles (UAVs) is investigated within the framework of reinforcement learning. With the environment abstracted as a 2D grid world, the design procedure is presented using the Dyna-Q algorithm, a reinforcement learning method that combines model-based and model-free frameworks. In this process, an optimal or suboptimal safe flight trajectory is obtained by continual learning and by planning with simulated experience, so that the cumulative reward can be maximized efficiently. MATLAB is used for maze construction and computation, and the effectiveness of the proposed method is illustrated by two typical examples.

This work was supported by the National Natural Science Foundation of China under Grants 61773138 and 61203191.
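
To make the approach concrete, below is a minimal sketch of tabular Dyna-Q on a 2D grid-world maze. The paper's experiments are implemented in MATLAB; this Python sketch is for illustration only, and the maze layout, reward scheme, and hyperparameters (ALPHA, GAMMA, EPSILON, N_PLANNING) are assumptions, not values taken from the paper.

# Minimal tabular Dyna-Q sketch for a 2D grid-world maze.
# Grid layout, rewards, and hyperparameters are illustrative assumptions,
# not the paper's MATLAB implementation.
import random
from collections import defaultdict

GRID_W, GRID_H = 6, 4                         # maze size (assumed)
WALLS = {(2, 1), (2, 2), (4, 3)}              # blocked cells (assumed)
START, GOAL = (0, 0), (5, 3)
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # right, left, up, down

ALPHA, GAMMA, EPSILON, N_PLANNING = 0.1, 0.95, 0.1, 20

def step(state, action):
    """One move in the maze; reward 1 at the goal, 0 otherwise."""
    nx, ny = state[0] + action[0], state[1] + action[1]
    if not (0 <= nx < GRID_W and 0 <= ny < GRID_H) or (nx, ny) in WALLS:
        nx, ny = state                        # hit wall or boundary: stay put
    reward = 1.0 if (nx, ny) == GOAL else 0.0
    return (nx, ny), reward

Q = defaultdict(float)                        # Q[(state, action)]
model = {}                                    # model[(state, action)] = (reward, next_state)

def epsilon_greedy(state):
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

for episode in range(200):
    state = START
    while state != GOAL:
        action = epsilon_greedy(state)
        next_state, reward = step(state, action)
        # Direct RL: Q-learning update from real experience
        target = reward + GAMMA * max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (target - Q[(state, action)])
        # Model learning (assumes a deterministic environment)
        model[(state, action)] = (reward, next_state)
        # Planning: replay simulated experience drawn from the learned model
        for _ in range(N_PLANNING):
            (s, a), (r, s2) = random.choice(list(model.items()))
            t = r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
            Q[(s, a)] += ALPHA * (t - Q[(s, a)])
        state = next_state

The planning loop is what distinguishes Dyna-Q from plain Q-learning: every real transition is stored in the learned model and replayed N_PLANNING times per step, so the value table, and hence the planned trajectory, converges with far fewer real interactions with the environment.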



Author information

Corresponding author

Correspondence to Xin Huo.


Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Huo, X., Zhang, T., Wang, Y., Liu, W. (2018). Dyna-Q Algorithm for Path Planning of Quadrotor UAVs. In: Li, L., Hasegawa, K., Tanaka, S. (eds) Methods and Applications for Modeling and Simulation of Complex Systems. AsiaSim 2018. Communications in Computer and Information Science, vol 946. Springer, Singapore. https://doi.org/10.1007/978-981-13-2853-4_27


  • DOI: https://doi.org/10.1007/978-981-13-2853-4_27

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-2852-7

  • Online ISBN: 978-981-13-2853-4

  • eBook Packages: Computer Science, Computer Science (R0)
