Attitude control of underwater glider combined reinforcement learning with active disturbance rejection control

  • Zhi-qiang Su
  • Meng Zhou
  • Fang-fang Han
  • Yi-wu Zhu
  • Da-lei Song
  • Ting-ting Guo
Original article


Buoyancy-driven underwater gliders are highly efficient winged underwater vehicles that are driven by modifying their net buoyancy and internal shape. Their advantages, such as long cruise range, low power consumption, low noise, and no pollution, make the underwater glider an important platform for marine environment observation and ocean resource exploration. Because of this long cruise range, attitude control is a core technology for underwater gliders. In this paper, an underwater glider named OUC-III, developed for marine observation, is presented. To control its attitude, kinematic and dynamic models of the glider are derived by mathematical analysis. A novel attitude control algorithm is then proposed that combines reinforcement learning with Active Disturbance Rejection Control (ADRC). The algorithm is compared with classical ADRC in simulations based on the dynamic model of OUC-III. The simulation results indicate that the proposed algorithm compensates well for ocean current disturbances in the OUC-III attitude control mission and achieves high-precision, highly adaptive control.
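To make the disturbance-rejection idea concrete, the following is a minimal sketch of a plain linear ADRC loop, not the authors' reinforcement-learning variant and not the OUC-III model. It assumes a toy second-order pitch-like plant, a constant torque disturbance standing in for an ocean current, and bandwidth-style gains; the function name `simulate_ladrc`, the plant coefficients, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def simulate_ladrc(setpoint=0.2, b0=1.0, wc=4.0, wo=20.0, dt=0.001, t_end=10.0):
    """Track a pitch setpoint (rad) on a toy second-order plant with linear ADRC.

    All numbers are illustrative assumptions, not OUC-III parameters.
    Returns the list of simulated pitch angles, one per time step.
    """
    # Bandwidth parameterization: observer poles at -wo, controller poles at -wc.
    l1, l2, l3 = 3.0 * wo, 3.0 * wo**2, wo**3
    kp, kd = wc**2, 2.0 * wc

    theta, omega = 0.0, 0.0      # true plant state: pitch and pitch rate
    z1, z2, z3 = 0.0, 0.0, 0.0   # observer estimates: pitch, rate, total disturbance
    history = []

    for k in range(round(t_end / dt)):
        t = k * dt
        # Control law: PD on the estimates, minus the estimated total disturbance.
        u = (kp * (setpoint - z1) - kd * z2 - z3) / b0

        # Assumed disturbance: a constant torque switched on at t = 5 s.
        d = 0.3 if t >= 5.0 else 0.0

        # Toy plant, unknown to the controller except for b0 (forward Euler step).
        alpha = -0.8 * omega - 0.5 * theta + b0 * u + d
        theta_next = theta + dt * omega
        omega += dt * alpha

        # Extended state observer driven by the measured pitch.
        e = theta - z1
        z1_next = z1 + dt * (z2 + l1 * e)
        z2_next = z2 + dt * (z3 + b0 * u + l2 * e)
        z3 += dt * (l3 * e)
        z1, z2 = z1_next, z2_next

        theta = theta_next
        history.append(theta)

    return history


if __name__ == "__main__":
    traj = simulate_ladrc()
    print(f"final pitch error: {abs(traj[-1] - 0.2):.5f} rad")
```

The observer lumps the unmodelled plant dynamics and the external torque into a single estimated state, which the control law then cancels. In this sketch the gains `wc` and `wo` are fixed by hand; how the proposed algorithm uses reinforcement learning alongside ADRC is not detailed in the abstract.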


Underwater glider, Attitude control, Reinforcement learning, NAC-ADRC



This work was supported by the Underwater Glider Research Center of Ocean University of China, 863 Plan Acoustic Glider System Development Team (Grant Number: 2012AA091004). We also thank Shanghai Jiao Tong University and Ocean University of China for their support. Finally, we thank the three anonymous reviewers and the editors for their substantial contributions to improving the quality of our paper.



Copyright information

© JASNAOE 2018

Authors and Affiliations

  • Zhi-qiang Su (1, 2), corresponding author
  • Meng Zhou (2)
  • Fang-fang Han (3)
  • Yi-wu Zhu (2)
  • Da-lei Song (4)
  • Ting-ting Guo (4)
  1. School of Naval Architecture, Ocean and Civil Engineering, Shanghai Jiao Tong University, Shanghai, China
  2. Institute of Oceanography, Shanghai Jiao Tong University, Shanghai, China
  3. School of Economics, Ocean University of China, Qingdao City, China
  4. College of Engineering, Ocean University of China, Qingdao City, China
