Abstract
We present an active learning architecture that allows a robot to actively learn which data collection strategy is most efficient for acquiring motor skills to achieve multiple outcomes, and generalise over its experience to achieve new outcomes. The robot explores its environment both via interactive learning and goal-babbling. It learns at the same time when, who and what to actively imitate from several available teachers, and learns when not to use social guidance but use active goal-oriented self-exploration. This is formalised in the framework of life-long strategic learning.
The proposed architecture, called Socially Guided Intrinsic Motivation with Active Choice of Teacher and Strategy (SGIM-ACTS), relies on hierarchical active decisions of what and how to learn driven by empirical evaluation of learning progress for each learning strategy. We illustrate with an experiment where a simulated robot learns to control its arm for realising two kinds of different outcomes. It has to choose actively and hierarchically at each learning episode: 1) what to learn: which outcome is most interesting to select as a goal to focus on for goal-directed exploration; 2) how to learn: which data collection strategy to use among self-exploration, mimicry and emulation; 3) once he has decided when and what to imitate by choosing mimicry or emulation, then he has to choose who to imitate, from a set of different teachers. We show that SGIM-ACTS learns significantly more effciently than using single learning strategies, and coherently selects the best strategy with respect to the chosen outcome, taking advantage of the available teachers (with different levels of skills).
Similar content being viewed by others
References
Brenna D. Argall, B. Browning, and Manuela Veloso. Learning robot motion control with demonstration and advice-operators. In In Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 399–404. IEEE, September 2008.
Brenna D. Argall, B. Browning, and Manuela Veloso. Teacher feedback to scaffold and refine demonstrated motion primitives on a mobile robot. Robotics and Autonomous Systems, 59(3–4):243–255, 2011.
Brenna D. Argall, Sonia Chernova, Manuela Veloso, and Brett Browning. A survey of robot learning from demonstration. Robotics and Autonomous Systems, 57(5):469–483, 2009.
C.G. Atkeson, Moore Andrew, and Schaal Stefan. Locally weighted learning. AI Review, 11:11–73, April 1997.
Y. Baram, R. El-Yaniv, and K. Luz. Online choice of active learning algorithms. The Journal of Machine Learning Research,, 5:255–291, 2004.
Adrien Baranes and Pierre-Yves Oudeyer. Active learning of inverse models with intrinsically motivated goal exploration in robots. Robotics and Autonomous Systems, 61(1):49–73, 2013.
Andrew G. Barto, S. Singh, and N Chentanez. Intrinsically motivated learning of hierarchical collections of skills. In ICDL International Conference on Developmental Learning, pages 112–119, 2004.
Aude Billard, Sylvain Calinon, Ruediger Dillmann, and Stefan Schaal. Handbook of Robotics, chapter Robot Programming by Demonstration. Number 59. MIT Press, 2007.
Cynthia Breazeal and B. Scassellati. Robots that imitate humans. Trends in Cognitive Sciences, 6(11):481–487, 2002.
Maya Cakmak, C. Chao, and Andrea L. Thomaz. Designing interactions for robot active learners. Autonomous Mental Development, IEEE Transactions on, 2(2):108–118, 2010.
Maya Cakmak, Nick DePalma, Andrea L. Thomaz, and Rosa Arriaga. Effects of social exploration mechanisms on robot learning. In The 18th IEEE International Symposium on Robot and Human Interactive Communication, 2009. RO-MAN 2009., pages 128–134. IEEE, 2009.
J. Call and M. Carpenter. Imitation in animals and artifacts, chapter Three sources of information in social learning, pages 211–228. Cambridge, MA: MIT Press., 2002.
Sonia Chernova and Manuela Veloso. Interactive policy learning through confidence-based autonomy. Journal of Artificial Intelligence Research, 34, 2009.
Gergely Csibra. Teleological and referential understanding of action in infancy. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 358(1431):447, 2003.
B.C. da Silva, G. Konidaris, and Andrew G. Barto. Learning parameterized skills. In 29th International Conference on Machine Learning (ICML 2012), 2012.
Kerstin Dautenhahn and Chrystopher L. Nehaniv. Imitation in Animals and Artifacts. MIT Press, 2002.
Daniel H Grollman and Odest Chadwicke” Jenkins. Incremental learning of subtasks from unsegmented demonstration. In Intelligent Robots and Systems IROS 2010 IEEERSJ International Conference on, pages 261–266, 2010.
Jens Kober, Andreas Wilhelm, Erhan Oztop, and Jan Peters. Reinforcement learning to adjust parametrized motor primitives to new situations. Autonomous Robots, pages 1–19, 2012. 10.1007/s10514-012-9290-3.
N Koenig, L Takayama, and M Matari¢. Communication and knowledge sharing in human-robot interaction and learning from demonstration. Neural Netw, 23(8–9):1104–1112, Oct–Nov 2010.
Petar Kormushev, Sylvain Calinon, and Darwin G. Caldwell. Robot motor skill coordination with EM-based reinforcement learning. In Proc. IEEE/RSJ Intl Conf. on Intelligent Robots and Systems (IROS), pages 3232–3237, Taipei, Taiwan, October 2010.
Petar Kormushev, Sylvain Calinon, and Darwin G. Caldwell. Imitation learning of positional and force skills demonstrated via kinesthetic teaching and haptic input. Advanced Robotics, 25(5):581–603, 2011.
J. C. Lagarias, J. A. Reeds, M. H. Wright, and P. E. Wright. Convergence properties of the nelder-mead simplex method in low dimensions. SIAM Journal of Optimization, 9(1):112–147, 1998.
Manuel Lopes, Thomas Cederborg, and Pierre-Yves Oudeyer. Simultaneous acquisition of task and feedback models. Development and Learning (ICDL), 2011 IEEE International Conference on, pages 1–7, 2011.
Manuel Lopes, Tobias Lang, Marc Toussaint, Pierre-Yves Oudeyer, et al. Exploration in model-based reinforcement learning by empirically estimating learning progress. In Neural Information Processing Systems (NIPS), 2012.
Manuel Lopes, Francisco Melo, and Luis Montesano. Active learning for reward estimation in inverse reinforcement learning. Machine Learning and Knowledge Discovery in Databases, pages 31–46, 2009.
Manuel Lopes, Francisco Melo, Luis Montesano, and Jose Santos-Victor. From Motor to Interaction Learning in Robots, chapter Abstraction Levels for Robotic Imitation: Overview and Computational Approaches. Springer, 2009.
Manuel Lopes and Pierre-Yves Oudeyer. The Strategic Student Approach for Life-Long Exploration and Learning. In IEEE Conference on Development and Learning / EpiRob, San Diego, États-Unis, November 2012.
Chrystopher L Nehaniv and Kerstin Dautenhahn. Imitation and Social Learning in Robots, Humans and Animals: Behavioural, Social and Communicative Dimensions. Cambridge Univ. Press, Cambridge, March 2007.
Sao Mai Nguyen, Adrien Baranes, and Pierre-Yves Oudeyer. Bootstrapping intrinsically motivated learning with human demonstrations. In IEEE International Conference on Development and Learning, Frankfurt, Germany, 2011.
Sao Mai Nguyen and Pierre-Yves Oudeyer. Properties for efficient demonstrations to a socially guided intrinsically motivated learner. In 21st IEEE International Symposium on Robot and Human Interactive Communication, 2012.
M.N. Nicolescu and M.J. Mataric. Natural methods for robot task learning: Instructive demonstrations, generalization and practice. In Proceedings of the second international joint conference on Autonomous agents and multiagent systems, pages 241–248. ACM, 2003.
Pierre-Yves Oudeyer and Frederic Kaplan. What is intrinsic motivation? a typology of computational approaches. Frontiers in Neurorobotics, 2007.
Pierre-Yves Oudeyer, Frederic Kaplan, and Verena Hafner. Intrinsic motivation systems for autonomous mental development. IEEE Transactions on Evolutionary Computation, 11(2):265–286, 2007.
Jan Peters and Stefan Schaal. Reinforcement learning of motor skills with policy gradients. Neural Networks, 21(4):682–697, 2008.
G. Qi, X. Hua, Y. Rui, J. Tang, and H. Zhang. Two-dimensional active learning for image classification. In Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pages 1–8. IEEE, 2008.
Roi Reichart, Katrin Tomanek, Udo Hahn, and Ari Rappoport. Multi-task active learning for linguistic annotations. In Annual Meeting of the Association for Computational Linguistics (ACL). Citeseer, 2008.
Matthias Rolf and Jochen J Steil. Goal babbling: a new concept for early sensorimotor exploration. pages 40–43, Osaka, 11/2012 2012. IEEE.
Richard M. Ryan and Edward L. Deci. Intrinsic and extrinsic motivations: Classic definitions and new directions. Contemporary Educational Psychology, 25(1):54–67, 2000.
Stefan Schaal, A Ijspeert, and Aude Billard. Computational approaches to motor learning by imitation. Philosophical transactions of the Royal Society of London. Series B, Biological sciences, 358(1431), 03 2003.
Andrea L. Thomaz. Socially Guided Machine Learning. PhD thesis, MIT, 5 2006.
Andrea L. Thomaz and Cynthia Breazeal. Experiments in socially guided exploration: Lessons learned in building robots that learn with and without human teachers. Connection Science, 20Special Issue on Social Learning in Embodied Agents (2–3):91–110, 2008.
M. Tomasello and M. Carpenter. Shared intentionality. Developmental Science, 10(1):121–125, 2007.
Andrew Whiten. Primate culture and social learning. Cognitive Science, 24(3):477–508, 2000.
Author information
Authors and Affiliations
Corresponding author
About this article
Cite this article
Nguyen, S.M., Oudeyer, PY. Active choice of teachers, learning strategies and goals for a socially guided intrinsic motivation learner. Paladyn 3, 136–146 (2012). https://doi.org/10.2478/s13230-013-0110-z
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.2478/s13230-013-0110-z