Abstract
Learning by observation agents learn to perform a behaviour by watching an expert perform that behaviour. The ability of the agents to learn correctly is therefore related to the quality and coverage of the observations. This article presents two novel approaches for observation acquisition, mixed-initiative observation acquisition and delayed observation acquisition, that allow learning agents to identify problems they are having difficulty solving and ask the expert for assistance solving them. The observation approaches are presented in the context of a case-based learning by observation agent and empirically compared to traditional passive observation in the domain of Tetris. Our results show that not only do the mixed-initiative and delayed observation acquisition approaches result in observations that cannot be obtained in a passive manner, but they also improve the learning performance of an agent.
Similar content being viewed by others
Notes
Although the work could easily be extended to include multiple experts.
References
Aamodt A, Plaza E (1994) Case-based reasoning: foundational issues, methodological variations, and system approaches. AI Commun 7(1):39–59
Ang MH, Lin W, Lim SY (1999) A walk-through programmed robot for welding in shipyards. Industrial Robot: An International Journal 26(5):377–388
Argall BD, Chernova S, Veloso M, Browning B (2009) A survey of robot learning from demonstration. Robot Auton Syst 57(5):469–483
Asiimwe S, Craw S, Taylor B, Wiratunga N (2007) Case authoring: from textual reports to knowledge-rich cases. In: 7th international conference on case-based reasoning, pp 179–193
Chernova S, Veloso M (2008) Multi-thresholded approach to demonstration selection for interactive robot learning. In: 3rd ACM/IEEE international conference on human-robot interaction, pp 225–232
Coates A, Abbeel P, Ng AY (2008) Learning for control from multiple demonstrations. In: 25th international conference on machine learning, pp 144–151
Deisenroth MP, Krishnan KK (1999) On-line programming. In: Nof SY (ed) Handbook of industrial robotics. New York, Wiley, pp 337–352
Dinerstein J, Egbert PK, Ventura D, Goodrich M (2008) Demonstration-based behavior programming for embodied virtual agents. Comput Intell 24(4):235–256
Flinter S, Keane MT (1995) On the automatic generation of case libraries by chunking chess games. In: 1st international conference on case-based reasoning, pp 421–430
Floyd MW, Bicakci MV, Esfandiari B (2012) Case-based learning by observation in robotics using a dynamic case representation. In: 25th international Florida artificial intelligence research society conference, pp 323–328
Floyd MW, Davoust A, Esfandiari B (2008) Considerations for real-time spatially-aware case-based reasoning: a case study in robotic soccer imitation. In: 9th European conference on case-based reasoning. Springer, Berlin, pp 195–209
Floyd MW, Esfandiari B (2009) An active approach to automatic case generation. In: 8th international conference on case-based reasoning, pp 150–164
Floyd MW, Esfandiari B (2011) A case-based reasoning framework for developing agents using learning by observation. In: 23rd IEEE international conference on tools with artificial intelligence. IEEE Computer Society Press, pp 531–538
Floyd MW, Esfandiari B (2011) Learning state-based behaviour using temporally related cases. In: 16th UK workshop on case-based reasoning, pp 34–45
Floyd MW, Esfandiari B (2011) Supplemental case acquisition using mixed-initiative control. In: 24th international Florida artificial intelligence research society conference, pp 395–400
Floyd MW, Esfandiari B, Lam K (2008) A case-based reasoning approach to imitating RoboCup players. In: 21st international Florida artificial intelligence research society conference, pp 251–256
Floyd MW, Turner J, Aha DW (2017) Using deep learning to automate feature modeling in learning by observation. In: 30th international Florida artificial intelligence research society conference. AAAI Press, pp 50–55
Grollman DH, Jenkins OC (2007) Dogged learning for robots. In: 24th IEEE international conference on robotics and automation, pp 2483–2488. https://doi.org/10.1109/robot.2007.363692
Grollman DH, Jenkins OC (2007) Learning robot soccer skills from demonstration. In: 6th IEEE international conference on development and learning
Hearst MA (1999) Trends & controversies: mixed-initiative interaction. IEEE Intell Syst 14(5):14–23
Hu R, Delany SJ, Namee BM (2010) EGAL: exploration guided active learning for TCBR. In: 18th international conference on case-based reasoning, pp 156–170
Massie S, Craw S, Wiratunga N (2005) Complexity-guided case discovery for case based reasoning. In: 20th national conference on artificial intelligence, pp 216–221
McSherry D (2000) Automating case selection in the construction of a case library. Knowl-Based Syst 13 (2-3):133–140
Meriçli C, Veloso M, Akin H (2012) Improving biped walk stability with complementary corrective demonstration. Auton Robot 32(4):419–432
Meriçli C, Veloso M, Akin HL (2012) Multi-resolution corrective demonstration for efficient task execution and refinement. Int J Soc Robot 4(4):423–435
Michalski RS, Carbonell JG, Mitchell TM (1983) Machine learning: an artificial intelligence approach. Springer, Berlin
Ontañón S (2012) Case acquisition strategies for case-based reasoning in real-time strategy games. In: 25th international Florida artificial intelligence research society conference, pp 335–340
Ontañón S, Mishra K, Sugandh N, Ram A (2007) Case-based planning and execution for real-time strategy games. In: 7th international conference on case-based reasoning, pp 164–178
Packard B, Ontañón S (2017) Policies for active learning from demonstration. In: AAAI spring symposium on learning from observation of humans. AAAI Press, pp 513–519
Powell J, Molineaux M, Aha DW (2011) Active and interactive discovery of goal selection knowledge. In: 24th international Florida artificial intelligence research society conference, pp 413–418
Powell JH, Hastings JD (2006) An empirical evaluation of automated knowledge discovery in a complex domain. In: Workshop on heuristic search, memory based heuristics and their applications: 21st national conference on artificial intelligence
Powell JH, Hauff BM, Hastings JD (2005) Evaluating the effectiveness of exploration and accumulated experience in automatic case elicitation. In: 6th international conference on case-based reasoning, pp 397–407
Romdhane H, Lamontagne L (2008) Forgetting reinforced cases. In: 9th European conference on case-based reasoning, pp 474– 486
Romdhane H, Lamontagne L (2008) Reinforcement of local pattern cases for playing Tetris. In: 21st international Florida artificial intelligence research society conference, pp 263– 268
Ross S, Bagnell D (2010) Efficient reductions for imitation learning. In: 13th international conference on artificial intelligence and statistics, pp 661–668
Ross S, Gordon GJ, Bagnell D (2011) A reduction of imitation learning and structured prediction to no-regret online learning. In: 14th international conference on artificial intelligence and statistics, pp 627–635
Rubin J, Watson I (2010) Similarity-based retrieval and solution re-use policies in the game of Texas Hold’em. In: 18th international conference on case-based reasoning, pp 465– 479
Rubin J, Watson I (2011) Implicit opponent modelling via dynamic case-base selection. In: Workshop on case-based reasoning for computer games at the 19th international conference on case-based reasoning, pp 63–71
Rubin J, Watson I (2011) On combining decisions from multiple expert imitators for performance. In: 22nd international joint conference on artificial intelligence, pp 344–349
Rubin J, Watson ID (2011) Successful performance via decision generalisation in No Limit Texas Hold’em. In: 19th international conference on case-based reasoning. Springer, pp 467–481
Thurau C, Bauckhage C (2003) Combining self organizing maps and multilayer perceptrons to learn bot-behavior for a commercial game. In: 4th international conference on intelligent games and simulation, pp 119–126
Yang C, Farley B, Orchard R (2008) Automated case creation and management for diagnostic CBR systems. Appl Intell 28(1):17– 28
Zhang J, Cho K (2017) Query-efficient imitation learning for end-to-end simulated driving. In: 31st AAAI conference on artificial intelligence. AAAI Press, pp 2891–2897
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Floyd, M.W., Esfandiari, B. Supplemental observation acquisition for learning by observation agents. Appl Intell 48, 4338–4354 (2018). https://doi.org/10.1007/s10489-018-1191-5
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10489-018-1191-5