Model-free Adaptive Dynamic Programming Based Near-optimal Decentralized Tracking Control of Reconfigurable Manipulators

  • Bo Zhao
  • Yuanchun Li
Regular Paper Control Theory and Applications

Abstract

In this paper, a model-free near-optimal decentralized tracking control (DTC) scheme is developed for reconfigurable manipulators via an adaptive dynamic programming algorithm. The proposed controller is divided into two parts: a local desired controller and a local tracking error controller. To remove the norm-boundedness assumption on the interconnections, the desired states of the coupled subsystems are substituted for their actual states. Using local input/output data, the unknown subsystem dynamics of the reconfigurable manipulators are identified by constructing local neural network (NN) identifiers. With the identified dynamics, the local desired control is derived directly from the corresponding desired states. Then, for the tracking error subsystems, the local tracking error control is obtained by approximating the improved local cost function with a local critic NN and using the identified input gain matrix. To overcome the overall error caused by the substitution, the identification, and the critic NN approximation, a robust compensation term is added to construct the improved local cost function, which reflects the overall error, regulation, and control simultaneously. The closed-loop tracking system is therefore guaranteed to be asymptotically stable via the Lyapunov stability theorem. Two 2-degree-of-freedom reconfigurable manipulators with different configurations are employed to demonstrate the effectiveness of the proposed model-free near-optimal DTC scheme.
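
The two-part structure described above can be illustrated with a generic adaptive dynamic programming formulation. The block below is a minimal sketch under stated assumptions, not the equations of the paper: the affine subsystem form, the symbols $\hat{f}_i$, $\hat{g}_i$, $Q_i$, $R_i$, and the placeholder term $\Gamma_i$ for the robust compensation are all introduced here for illustration only.

```latex
% Assumed form: subsystem i as identified by the local NN identifier (hypothetical notation)
\dot{x}_i = \hat{f}_i(x_i) + \hat{g}_i(x_i)\,u_i, \qquad e_i = x_i - x_{id}

% Controller split into a desired part and a tracking error part
u_i = u_{id} + u_{ie}

% Local desired control: computed from the identified dynamics at the desired states,
% with \hat{g}_i^{+} a generalized inverse of the identified input gain matrix
u_{id} = \hat{g}_i^{+}(x_{id})\,\bigl(\dot{x}_{id} - \hat{f}_i(x_{id})\bigr)

% Improved local cost function: \Gamma_i(e_i) \ge 0 is a placeholder for the
% term reflecting the overall error; Q_i, R_i are positive definite weights
J_i(e_i) = \int_t^{\infty} \Bigl( \Gamma_i(e_i) + e_i^{\top} Q_i e_i
           + u_{ie}^{\top} R_i u_{ie} \Bigr)\, d\tau

% Local tracking error control from the stationarity condition of the Hamiltonian,
% with \nabla \hat{J}_i the gradient of the critic NN approximation
u_{ie} = -\tfrac{1}{2}\, R_i^{-1}\, \hat{g}_i^{\top}(x_i)\, \nabla \hat{J}_i(e_i)
```

In this sketch the critic NN supplies only the gradient of the approximated cost, while the identifier supplies the input gain matrix, which matches the division of roles stated in the abstract.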

Keywords

Adaptive dynamic programming, decentralized tracking control, model-free, near-optimal, neural networks, reconfigurable manipulators

Copyright information

© Institute of Control, Robotics and Systems and The Korean Institute of Electrical Engineers and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  2. Department of Control Science and Engineering, Changchun University of Technology, Changchun, China
