Abstract
In this paper, the problem of path planning for quadrotor unmanned aerial vehicles (UAVs) is investigated within the framework of reinforcement learning. With the environment abstracted as a 2D grid world, the design procedure is presented using the Dyna-Q algorithm, a reinforcement learning method that combines the model-based and model-free frameworks. In this process, an optimal or suboptimal safe flight trajectory is obtained by continually learning and by planning with simulated experience, so that the cumulative reward can be maximized efficiently. MATLAB is used to construct the mazes and perform the computations, and the effectiveness of the proposed method is illustrated by two typical examples.
This work was supported by the National Natural Science Foundation of China under Grants 61773138 and 61203191.
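To make the abstract's description concrete, the following is a minimal sketch of tabular Dyna-Q on a small 2D grid world with obstacle cells, in the spirit of the setting described above. The grid size, obstacle layout, rewards, and hyperparameters are illustrative assumptions, not the paper's actual values; the paper's own experiments were implemented in MATLAB.

```python
import random

# Assumed toy environment: 4x4 grid, start at top-left, goal at bottom-right,
# two "no-fly" obstacle cells. All values here are illustrative.
ROWS, COLS = 4, 4
START, GOAL = (0, 0), (3, 3)
OBSTACLES = {(1, 1), (2, 2)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
PLANNING_STEPS = 20  # simulated-experience updates per real step (Dyna-Q)

def step(state, action):
    """Deterministic grid dynamics: blocked moves keep the agent in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < ROWS and 0 <= c < COLS) or (r, c) in OBSTACLES:
        r, c = state  # bounce back off walls and obstacles
    reward = 1.0 if (r, c) == GOAL else -0.04  # small per-step penalty
    return (r, c), reward

def dyna_q(episodes=200, seed=0):
    rng = random.Random(seed)
    Q = {((r, c), a): 0.0 for r in range(ROWS) for c in range(COLS)
         for a in range(len(ACTIONS))}
    model = {}  # learned model: (state, action) -> (reward, next_state)
    for _ in range(episodes):
        s = START
        while s != GOAL:
            # epsilon-greedy action selection
            if rng.random() < EPSILON:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda x: Q[(s, x)])
            s2, reward = step(s, ACTIONS[a])
            # direct reinforcement learning update (one-step Q-learning)
            best_next = max(Q[(s2, x)] for x in range(len(ACTIONS)))
            Q[(s, a)] += ALPHA * (reward + GAMMA * best_next - Q[(s, a)])
            # model learning: remember the observed transition
            model[(s, a)] = (reward, s2)
            # planning: replay simulated experience drawn from the model
            for _ in range(PLANNING_STEPS):
                (ps, pa), (pr, ps2) = rng.choice(list(model.items()))
                pbest = max(Q[(ps2, x)] for x in range(len(ACTIONS)))
                Q[(ps, pa)] += ALPHA * (pr + GAMMA * pbest - Q[(ps, pa)])
            s = s2
    return Q

def greedy_path(Q, limit=30):
    """Follow the greedy policy from START, capped at `limit` states."""
    s, path = START, [START]
    while s != GOAL and len(path) < limit:
        a = max(range(len(ACTIONS)), key=lambda x: Q[(s, x)])
        s, _ = step(s, ACTIONS[a])
        path.append(s)
    return path

if __name__ == "__main__":
    print(greedy_path(dyna_q()))
```

The planning loop is what distinguishes Dyna-Q from plain Q-learning: each real transition is followed by several value updates on transitions resampled from the learned model, which is how simulated experience accelerates convergence toward a safe trajectory.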
© 2018 Springer Nature Singapore Pte Ltd.
Huo, X., Zhang, T., Wang, Y., Liu, W. (2018). Dyna-Q Algorithm for Path Planning of Quadrotor UAVs. In: Li, L., Hasegawa, K., Tanaka, S. (eds) Methods and Applications for Modeling and Simulation of Complex Systems. AsiaSim 2018. Communications in Computer and Information Science, vol 946. Springer, Singapore. https://doi.org/10.1007/978-981-13-2853-4_27
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-2852-7
Online ISBN: 978-981-13-2853-4