Abstract
Generalizing manipulation skills to new situations requires extracting invariant patterns from demonstrations. For example, the robot needs to understand the demonstrations at a higher level while remaining invariant to the appearance of the objects, to their geometric aspects such as position, size, and orientation, and to the viewpoint of the observer in the demonstrations. In this paper, we propose an algorithm that learns a joint probability density function of the demonstrations with invariant formulations of hidden semi-Markov models to extract invariant segments (also called sub-goals or options), and smoothly follows the generated sequence of states with a linear quadratic tracking controller. The algorithm takes as input the demonstrations observed with respect to different coordinate systems describing virtual landmarks or objects of interest, and systematically adapts the segments to changes in the environment. We present variants of this algorithm in latent space with low-rank covariance decompositions, semi-tied covariances, and non-parametric online estimation of model parameters under small variance asymptotics, which considerably lower the sample and model complexity of acquiring new manipulation skills. The algorithm allows a Baxter robot to learn a pick-and-place task while avoiding a movable obstacle from only four kinesthetic demonstrations.
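The tracking step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's controller: it assumes a scalar system x_{t+1} = a x_t + b u_t, a piecewise-constant reference standing in for the state means of a generated state sequence, and a quadratic cost with hypothetical weights `q` (tracking) and `rho` (control effort). The function names `lqt_gains` and `rollout` are illustrative only.

```python
def lqt_gains(ref, q=1.0, rho=5.0, a=1.0, b=1.0):
    """Backward Riccati recursion for finite-horizon scalar LQ tracking.

    Minimizes sum_t q*(x_t - ref[t])**2 + rho*u_t**2 subject to
    x_{t+1} = a*x_t + b*u_t. Returns one (K, k) pair per time step so
    that the control is u_t = -K*x_t + k.
    """
    T = len(ref) - 1
    p, s = q, q * ref[T]              # terminal value function terms
    gains = [None] * T
    for t in range(T - 1, -1, -1):
        denom = rho + p * b * b
        K = p * a * b / denom         # feedback gain
        k = b * s / denom             # feedforward term from future references
        gains[t] = (K, k)
        # Both updates use the values of p and s from step t + 1.
        p, s = q + a * p * (a - b * K), q * ref[t] + (a - b * K) * s
    return gains


def rollout(x0, gains, a=1.0, b=1.0):
    """Simulate the closed loop from x0 and return the visited states."""
    xs = [x0]
    for K, k in gains:
        u = -K * xs[-1] + k
        xs.append(a * xs[-1] + b * u)
    return xs


# Stepwise reference: three segments held for several time steps each.
ref = [0.0] * 10 + [1.0] * 10 + [0.5] * 11
xs = rollout(0.0, lqt_gains(ref))
```

Because the feedforward term looks ahead at upcoming references, the state starts moving before each switch, which is what smooths the stepwise sequence. Increasing `rho` gives a smoother but slower response.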
Notes
1. Setting \(d_i = 0\) by choosing \(\lambda_1 \gg 0\) gives the loss function formulation with isotropic Gaussian under small variance asymptotics [22].
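For intuition about the isotropic case, the sketch below shows online hard clustering in the style of DP-means (Kulis and Jordan), where the loss is the sum of squared Euclidean distances plus a penalty for each new cluster. It covers only the i.i.d. clustering setting, not the hidden Markov model formulation of [22] or the paper's algorithm. The threshold `lam` is a hypothetical parameter and is not the \(\lambda_1\) of the note above.

```python
def online_dp_means(points, lam):
    """Assign each point to its nearest mean, or open a new cluster
    when the squared Euclidean distance to every mean exceeds `lam`."""
    means, counts, labels = [], [], []
    for x in points:
        # Squared distance from x to every existing cluster mean.
        d2 = [sum((xi - mi) ** 2 for xi, mi in zip(x, m)) for m in means]
        if not d2 or min(d2) > lam:
            means.append(list(x))     # new cluster centred on x
            counts.append(1)
            labels.append(len(means) - 1)
        else:
            k = d2.index(min(d2))
            counts[k] += 1
            # Incremental update of the running mean of cluster k.
            means[k] = [mi + (xi - mi) / counts[k]
                        for xi, mi in zip(x, means[k])]
            labels.append(k)
    return means, labels


points = [(0.0, 0.1), (0.2, -0.1), (5.0, 5.1), (0.1, 0.0), (4.9, 5.0)]
means, labels = online_dp_means(points, lam=4.0)
```

A larger `lam` makes opening a cluster more expensive and therefore yields fewer clusters, so the number of clusters is set by the data and the penalty instead of being fixed in advance.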
References
Argall, B.D., Chernova, S., Veloso, M., Browning, B.: A survey of robot learning from demonstration. Robot. Auton. Syst. 57(5), 469–483 (2009)
Borrelli, F., Bemporad, A., Morari, M.: Predictive Control for Linear and Hybrid Systems. Cambridge University Press, Cambridge (2011)
Broderick, T., Kulis, B., Jordan, M.I.: MAD-Bayes: map-based asymptotic derivations from Bayes. In: International Conference on Machine Learning, ICML, pp. 226–234 (2013)
Calinon, S.: A tutorial on task-parameterized movement learning and retrieval. Intell. Serv. Robot. 9(1), 1–29 (2016)
Duan, Y., Andrychowicz, M., Stadie, B.C., Ho, J., Schneider, J., Sutskever, I., Abbeel, P., Zaremba, W.: One-shot imitation learning. CoRR, abs/1703.07326 (2017)
Figueroa, N., Billard, A.: Transform-invariant non-parametric clustering of covariance matrices and its application to unsupervised joint segmentation and action discovery. CoRR, abs/1710.10060 (2017)
Fox, R., Shin, R., Krishnan, S., Goldberg, K., Song, D., Stoica, I.: Parametrized hierarchical procedures for neural programming. In: The International Conference on Learning Representations, ICLR 2018 (2018)
Gales, M.J.F.: Semi-tied covariance matrices for hidden Markov models. IEEE Trans. Speech Audio Process. 7(3), 272–281 (1999)
Ho, J., Ermon, S.: Generative adversarial imitation learning. CoRR, abs/1606.03476 (2016)
Ijspeert, A., Nakanishi, J., Pastor, P., Hoffmann, H., Schaal, S.: Dynamical movement primitives: learning attractor models for motor behaviors. Neural Comput. 25, 328–373 (2013)
Krishnan, S., Fox, R., Stoica, I., Goldberg, K.: DDCO: discovery of deep continuous options for robot learning from demonstrations. CoRR (2017)
Kulic, D., Takano, W., Nakamura, Y.: Incremental learning, clustering and hierarchy formation of whole body motion patterns using adaptive hidden Markov chains. Int. J. Robot. Res. 27(7), 761–784 (2008)
Kulis, B., Jordan, M.I.: Revisiting k-means: new algorithms via Bayesian nonparametrics. In: International Conference on Machine Learning ICML, pp. 513–520 (2012)
Lee, D., Ott, C.: Incremental motion primitive learning by physical coaching using impedance control. In: Proceedings of the IEEE/RSJ Intl Conference on Intelligent Robots and Systems (IROS), Taipei, Taiwan, pp. 4133–4140, October 2010
McLachlan, G.J., Peel, D., Bean, R.W.: Modelling high-dimensional data by mixtures of factor analyzers. Comput. Stat. Data Anal. 41(3–4), 379–388 (2003)
Medina, J.R., Billard, A.: Learning stable task sequences from demonstration with linear parameter varying systems and hidden Markov models. In: Conference on Robot Learning (CoRL) (2017)
Nehaniv, C.L., Dautenhahn, K. (eds.): Imitation and Social Learning in Robots, Humans, and Animals: Behavioural, Social and Communicative Dimensions. Cambridge University Press, Cambridge (2004)
Niekum, S., Osentoski, S., Konidaris, G., Barto, A.G.: Learning and generalization of complex tasks from unstructured demonstrations. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5239–5246 (2012)
Osa, T., Pajarinen, J., Neumann, G., Bagnell, A., Abbeel, P., Peters, J.: An Algorithmic Perspective on Imitation Learning. Now Publishers Inc. (2018)
Paraschos, A., Daniel, C., Peters, J.R., Neumann, G.: Probabilistic movement primitives. In: Advances in Neural Information Processing Systems 26, pp. 2616–2624 (2013)
Rabiner, L.R.: A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77(2), 257–285 (1989)
Roychowdhury, A., Jiang, K., Kulis, B.: Small-variance asymptotics for hidden Markov models. In: Advances in Neural Information Processing Systems 26, pp. 2103–2111. Curran Associates, Inc. (2013)
Shiarlis, K., Wulfmeier, M., Salter, S., Whiteson, S., Posner, I.: Taco: learning task decomposition via temporal alignment for control. In: International Conference on Machine Learning, ICML 2018 (2018)
Tanwani, A.K.: Generative models for learning robot manipulation skills from humans. Ph.D. thesis, Ecole Polytechnique Federale de Lausanne, Switzerland (2018)
Tanwani, A.K., Calinon, S.: Learning robot manipulation tasks with task-parameterized semitied hidden semi-Markov model. IEEE Robot. Autom. Lett. 1(1), 235–242 (2016)
Tanwani, A.K., Calinon, S.: Small-variance asymptotics for non-parametric online robot learning. Int. J. Robot. Res. 38(1), 3–22 (2019)
Teh, Y.W., Jordan, M.I., Beal, M.J., Blei, D.M.: Hierarchical Dirichlet processes. J. Am. Stat. Assoc. 101(476), 1566–1581 (2006)
Tipping, M.E., Bishop, C.M.: Mixtures of probabilistic principal component analyzers. Neural Comput. 11(2), 443–482 (1999)
Wilson, A.D., Bobick, A.F.: Parametric hidden Markov models for gesture recognition. IEEE Trans. Pattern Anal. Mach. Intell. 21(9), 884–900 (1999)
Wolpert, D.M., Diedrichsen, J., Flanagan, J.R.: Principles of sensorimotor learning. Nat. Rev. Neurosci. 12, 739–751 (2011)
Xu, D., Nair, S., Zhu, Y., Gao, J., Garg, A., Fei-Fei, L., Savarese, S.: Neural task programming: learning to generalize across hierarchical tasks. CoRR, abs/1710.01813 (2017)
Yu, S.-Z.: Hidden semi-Markov models. Artif. Intell. 174, 215–243 (2010)
Acknowledgements
This work was, in large part, carried out at Idiap Research Institute and Ecole Polytechnique Federale de Lausanne (EPFL), Switzerland. This work was in part supported by the DexROV project through the EC Horizon 2020 program (Grant 635491), and the NSF National Robotics Initiative Award 1734633 on Scalable Collaborative Human-Robot Learning (SCHooL). The information, data, comments, and views detailed herein may not necessarily reflect the endorsements of the sponsors.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Tanwani, A.K. et al. (2020). Generalizing Robot Imitation Learning with Invariant Hidden Semi-Markov Models. In: Morales, M., Tapia, L., Sánchez-Ante, G., Hutchinson, S. (eds) Algorithmic Foundations of Robotics XIII. WAFR 2018. Springer Proceedings in Advanced Robotics, vol 14. Springer, Cham. https://doi.org/10.1007/978-3-030-44051-0_12
DOI: https://doi.org/10.1007/978-3-030-44051-0_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44050-3
Online ISBN: 978-3-030-44051-0
eBook Packages: Intelligent Technologies and Robotics (R0)