Reinforcement Learning of Bimanual Robot Skills, pp. 109–146
Dimensionality Reduction with Movement Primitives
Abstract
As mentioned in Chap. 5, Movement Primitives (MPs) are nowadays widely used as a movement parametrization for learning robot trajectories, thanks to their linearity in the parameters, rescaling robustness, and continuity. However, learning a movement with MPs requires fitting a very large number of Gaussian approximations; summed over all joints, this yields too many parameters to explore with Reinforcement Learning (RL), demanding a prohibitive number of experiments or simulations to converge to a solution with a (locally or globally) optimal reward. In this chapter, we address the process of simultaneously learning an MP-characterized robot motion and its underlying joint couplings through linear Dimensionality Reduction (DR), which provides valuable qualitative information and leads to a reduced, intuitive algebraic description of the motion.
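The parameter-reduction argument above can be made concrete with a minimal sketch. The variable names, shapes, and the specific basis-function form below are illustrative assumptions, not the chapter's exact notation: each degree of freedom is written as a linear combination of Gaussian basis functions, and a linear coupling matrix projects a small set of latent weights onto the full joint space.

```python
import numpy as np

def gaussian_basis(t, n_basis=10):
    """Normalized Gaussian basis row vector Phi(t) for a phase t in [0, 1].
    (Illustrative choice of centers/widths, not the chapter's definition.)"""
    centers = np.linspace(0.0, 1.0, n_basis)
    width = 0.5 * (centers[1] - centers[0]) ** 2
    phi = np.exp(-(t - centers) ** 2 / (2.0 * width))
    return phi / phi.sum()

n_dof, n_basis, n_latent = 7, 10, 3   # full joint space vs. reduced latent space
rng = np.random.default_rng(0)

# Linear DR assumption: the full weight matrix factors through a
# joint-coupling matrix Omega, W_full = Omega @ W_latent.
Omega = rng.standard_normal((n_dof, n_latent))       # couplings to be learned
W_latent = rng.standard_normal((n_latent, n_basis))  # parameters RL explores

W_full = Omega @ W_latent                            # (n_dof, n_basis)
ts = np.linspace(0.0, 1.0, 100)
traj = np.array([W_full @ gaussian_basis(t, n_basis) for t in ts])  # (100, 7)

# RL now searches n_latent * n_basis = 30 parameters instead of
# n_dof * n_basis = 70, while still producing full joint trajectories.
print(traj.shape, W_latent.size, n_dof * n_basis)
```

The point of the sketch is purely the dimensionality count: the rollout still covers all seven joints, but exploration noise only needs to be injected into the 30 latent weights, which is what makes the RL search tractable.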