Abstract
In this paper, the problem of path planning for quadrotor unmanned aerial vehicles (UAVs) is investigated within the framework of reinforcement learning. With the environment abstracted as a 2D grid world, the design procedure is presented using the Dyna-Q algorithm, a reinforcement learning method that combines the model-based and model-free frameworks. In this process, an optimal or suboptimal safe flight trajectory is obtained by continually learning and by planning with simulated experience, so that the cumulative reward can be maximized efficiently. MATLAB is used to construct the mazes and perform the computations, and the effectiveness of the proposed method is illustrated by two typical examples.
This work was supported by the National Natural Science Foundation of China under Grants 61773138 and 61203191.
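To make the abstract's description concrete, the following is a minimal sketch of tabular Dyna-Q on a small 2D grid world with obstacle cells, in the spirit of the setting described above. The grid size, obstacle layout, rewards, and hyperparameters are illustrative assumptions, not the paper's actual values; the paper's own experiments were implemented in MATLAB.

```python
import random

# Assumed toy environment: 4x4 grid, start at top-left, goal at bottom-right,
# two "no-fly" obstacle cells. All values here are illustrative.
ROWS, COLS = 4, 4
START, GOAL = (0, 0), (3, 3)
OBSTACLES = {(1, 1), (2, 2)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
PLANNING_STEPS = 20  # simulated-experience updates per real step (Dyna-Q)

def step(state, action):
    """Deterministic grid dynamics: blocked moves keep the agent in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < ROWS and 0 <= c < COLS) or (r, c) in OBSTACLES:
        r, c = state  # bounce back off walls and obstacles
    reward = 1.0 if (r, c) == GOAL else -0.04  # small per-step penalty
    return (r, c), reward

def dyna_q(episodes=200, seed=0):
    rng = random.Random(seed)
    Q = {((r, c), a): 0.0 for r in range(ROWS) for c in range(COLS)
         for a in range(len(ACTIONS))}
    model = {}  # learned model: (state, action) -> (reward, next_state)
    for _ in range(episodes):
        s = START
        while s != GOAL:
            # epsilon-greedy action selection
            if rng.random() < EPSILON:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda x: Q[(s, x)])
            s2, reward = step(s, ACTIONS[a])
            # direct reinforcement learning update (one-step Q-learning)
            best_next = max(Q[(s2, x)] for x in range(len(ACTIONS)))
            Q[(s, a)] += ALPHA * (reward + GAMMA * best_next - Q[(s, a)])
            # model learning: remember the observed transition
            model[(s, a)] = (reward, s2)
            # planning: replay simulated experience drawn from the model
            for _ in range(PLANNING_STEPS):
                (ps, pa), (pr, ps2) = rng.choice(list(model.items()))
                pbest = max(Q[(ps2, x)] for x in range(len(ACTIONS)))
                Q[(ps, pa)] += ALPHA * (pr + GAMMA * pbest - Q[(ps, pa)])
            s = s2
    return Q

def greedy_path(Q, limit=30):
    """Follow the greedy policy from START, capped at `limit` states."""
    s, path = START, [START]
    while s != GOAL and len(path) < limit:
        a = max(range(len(ACTIONS)), key=lambda x: Q[(s, x)])
        s, _ = step(s, ACTIONS[a])
        path.append(s)
    return path

if __name__ == "__main__":
    print(greedy_path(dyna_q()))
```

The planning loop is what distinguishes Dyna-Q from plain Q-learning: each real transition is followed by several value updates on transitions resampled from the learned model, which is how simulated experience accelerates convergence toward a safe trajectory.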
© 2018 Springer Nature Singapore Pte Ltd.
Huo, X., Zhang, T., Wang, Y., Liu, W. (2018). Dyna-Q Algorithm for Path Planning of Quadrotor UAVs. In: Li, L., Hasegawa, K., Tanaka, S. (eds) Methods and Applications for Modeling and Simulation of Complex Systems. AsiaSim 2018. Communications in Computer and Information Science, vol 946. Springer, Singapore. https://doi.org/10.1007/978-981-13-2853-4_27
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-2852-7
Online ISBN: 978-981-13-2853-4