Robot Motor Skill Acquisition with Learning in Two Spaces
Acquiring and refining motor skills is critical for robots to enter daily human life, because it gives them the ability to perform unfamiliar tasks autonomously. However, how a robot can autonomously accomplish a new motion task with a preassigned level of performance, starting from a demonstrated task, remains a challenge. In this paper, we propose a novel motor skill acquisition policy to address this problem, based on improved locally weighted regression (iLWR) and policy improvement with path integrals (PI\(^2\)). In addition, Gaussian mixture regression (GMR) guided self-reconstruction of the basis functions and the search for the weight coefficients in the policy expression are performed alternately, in basis function space and weight space respectively, to seek an optimal or suboptimal solution. In this way, the robot can gradually acquire movement skills, progressing from tasks similar to the demonstration to dissimilar tasks with different criteria. Finally, the classical via-point trajectory planning experiment is performed on a SCARA manipulator and a NAO humanoid robot to verify that the proposed method is effective and feasible.
Keywords: Alternate study in two spaces; GMR-PI\(^2\); Motor skill acquisition
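The weight-space search mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' GMR-PI\(^2\) method: it is a simplified, episode-level PI\(^2\)-style update (probability-weighted averaging of random weight perturbations) applied to a one-dimensional via-point task, with the basis functions held fixed. The function names, the cost function, the via-point at (0.5, 1.0), and all constants (`h`, `sigma`, number of basis functions) are assumptions chosen for illustration only.

```python
import numpy as np

def basis(t, centers, width):
    """Normalized Gaussian basis functions evaluated at times t (fixed here)."""
    phi = np.exp(-0.5 * ((t[:, None] - centers[None, :]) / width) ** 2)
    return phi / phi.sum(axis=1, keepdims=True)

def rollout_cost(w, phi, t, via_t, via_y):
    """Squared miss distance at the via-point plus a small smoothness penalty."""
    y = phi @ w                          # trajectory as a weighted sum of basis functions
    idx = np.argmin(np.abs(t - via_t))   # sample closest to the via-point time
    acc = np.diff(y, n=2)                # second differences as a smoothness proxy
    return (y[idx] - via_y) ** 2 + 1e-3 * np.sum(acc ** 2)

def pi2_style_update(w, costs, eps, h=10.0):
    """Probability-weighted averaging: low-cost perturbations get high weight."""
    c = (costs - costs.min()) / (costs.max() - costs.min() + 1e-12)
    p = np.exp(-h * c)
    p /= p.sum()
    return w + p @ eps

def learn(n_updates=50, n_rollouts=20, sigma=0.5, seed=0):
    """Search in weight space only; returns final weights and the cost history."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, 101)
    phi = basis(t, np.linspace(0.0, 1.0, 10), 0.08)
    w = np.zeros(10)
    via_t, via_y = 0.5, 1.0              # hypothetical via-point
    history = [rollout_cost(w, phi, t, via_t, via_y)]
    for _ in range(n_updates):
        eps = sigma * rng.standard_normal((n_rollouts, w.size))
        costs = np.array([rollout_cost(w + e, phi, t, via_t, via_y) for e in eps])
        w = pi2_style_update(w, costs, eps)
        sigma *= 0.95                    # slowly reduce exploration noise
        history.append(rollout_cost(w, phi, t, via_t, via_y))
    return w, history
```

Calling `learn()` returns the learned weights together with the cost after each update. In the scheme the abstract describes, a step of this kind would alternate with a GMR-guided reconstruction of the basis functions, which this sketch leaves out.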