Skip to main content
Log in

Supplemental observation acquisition for learning by observation agents

  • Published:
Applied Intelligence Aims and scope Submit manuscript

Abstract

Learning by observation agents learn to perform a behaviour by watching an expert perform that behaviour. The ability of the agents to learn correctly is therefore related to the quality and coverage of the observations. This article presents two novel approaches for observation acquisition, mixed-initiative observation acquisition and delayed observation acquisition, that allow learning agents to identify problems they are having difficulty solving and ask the expert for assistance solving them. The observation approaches are presented in the context of a case-based learning by observation agent and empirically compared to traditional passive observation in the domain of Tetris. Our results show that not only do the mixed-initiative and delayed observation acquisition approaches result in observations that cannot be obtained in a passive manner, but they also improve the learning performance of an agent.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7

Similar content being viewed by others

Notes

  1. Although the work could easily be extended to include multiple experts.

References

  1. Aamodt A, Plaza E (1994) Case-based reasoning: foundational issues, methodological variations, and system approaches. AI Commun 7(1):39–59

    Google Scholar 

  2. Ang MH, Lin W, Lim SY (1999) A walk-through programmed robot for welding in shipyards. Industrial Robot: An International Journal 26(5):377–388

    Article  Google Scholar 

  3. Argall BD, Chernova S, Veloso M, Browning B (2009) A survey of robot learning from demonstration. Robot Auton Syst 57(5):469–483

    Article  Google Scholar 

  4. Asiimwe S, Craw S, Taylor B, Wiratunga N (2007) Case authoring: from textual reports to knowledge-rich cases. In: 7th international conference on case-based reasoning, pp 179–193

  5. Chernova S, Veloso M (2008) Multi-thresholded approach to demonstration selection for interactive robot learning. In: 3rd ACM/IEEE international conference on human-robot interaction, pp 225–232

  6. Coates A, Abbeel P, Ng AY (2008) Learning for control from multiple demonstrations. In: 25th international conference on machine learning, pp 144–151

  7. Deisenroth MP, Krishnan KK (1999) On-line programming. In: Nof SY (ed) Handbook of industrial robotics. New York, Wiley, pp 337–352

  8. Dinerstein J, Egbert PK, Ventura D, Goodrich M (2008) Demonstration-based behavior programming for embodied virtual agents. Comput Intell 24(4):235–256

    Article  MathSciNet  Google Scholar 

  9. Flinter S, Keane MT (1995) On the automatic generation of case libraries by chunking chess games. In: 1st international conference on case-based reasoning, pp 421–430

  10. Floyd MW, Bicakci MV, Esfandiari B (2012) Case-based learning by observation in robotics using a dynamic case representation. In: 25th international Florida artificial intelligence research society conference, pp 323–328

  11. Floyd MW, Davoust A, Esfandiari B (2008) Considerations for real-time spatially-aware case-based reasoning: a case study in robotic soccer imitation. In: 9th European conference on case-based reasoning. Springer, Berlin, pp 195–209

  12. Floyd MW, Esfandiari B (2009) An active approach to automatic case generation. In: 8th international conference on case-based reasoning, pp 150–164

  13. Floyd MW, Esfandiari B (2011) A case-based reasoning framework for developing agents using learning by observation. In: 23rd IEEE international conference on tools with artificial intelligence. IEEE Computer Society Press, pp 531–538

  14. Floyd MW, Esfandiari B (2011) Learning state-based behaviour using temporally related cases. In: 16th UK workshop on case-based reasoning, pp 34–45

  15. Floyd MW, Esfandiari B (2011) Supplemental case acquisition using mixed-initiative control. In: 24th international Florida artificial intelligence research society conference, pp 395–400

  16. Floyd MW, Esfandiari B, Lam K (2008) A case-based reasoning approach to imitating RoboCup players. In: 21st international Florida artificial intelligence research society conference, pp 251–256

  17. Floyd MW, Turner J, Aha DW (2017) Using deep learning to automate feature modeling in learning by observation. In: 30th international Florida artificial intelligence research society conference. AAAI Press, pp 50–55

  18. Grollman DH, Jenkins OC (2007) Dogged learning for robots. In: 24th IEEE international conference on robotics and automation, pp 2483–2488. https://doi.org/10.1109/robot.2007.363692

  19. Grollman DH, Jenkins OC (2007) Learning robot soccer skills from demonstration. In: 6th IEEE international conference on development and learning

  20. Hearst MA (1999) Trends & controversies: mixed-initiative interaction. IEEE Intell Syst 14(5):14–23

    Article  Google Scholar 

  21. Hu R, Delany SJ, Namee BM (2010) EGAL: exploration guided active learning for TCBR. In: 18th international conference on case-based reasoning, pp 156–170

  22. Massie S, Craw S, Wiratunga N (2005) Complexity-guided case discovery for case based reasoning. In: 20th national conference on artificial intelligence, pp 216–221

  23. McSherry D (2000) Automating case selection in the construction of a case library. Knowl-Based Syst 13 (2-3):133–140

    Article  Google Scholar 

  24. Meriçli C, Veloso M, Akin H (2012) Improving biped walk stability with complementary corrective demonstration. Auton Robot 32(4):419–432

    Article  Google Scholar 

  25. Meriçli C, Veloso M, Akin HL (2012) Multi-resolution corrective demonstration for efficient task execution and refinement. Int J Soc Robot 4(4):423–435

    Article  Google Scholar 

  26. Michalski RS, Carbonell JG, Mitchell TM (1983) Machine learning: an artificial intelligence approach. Springer, Berlin

    Book  Google Scholar 

  27. Ontañón S (2012) Case acquisition strategies for case-based reasoning in real-time strategy games. In: 25th international Florida artificial intelligence research society conference, pp 335–340

  28. Ontañón S, Mishra K, Sugandh N, Ram A (2007) Case-based planning and execution for real-time strategy games. In: 7th international conference on case-based reasoning, pp 164–178

  29. Packard B, Ontañón S (2017) Policies for active learning from demonstration. In: AAAI spring symposium on learning from observation of humans. AAAI Press, pp 513–519

  30. Powell J, Molineaux M, Aha DW (2011) Active and interactive discovery of goal selection knowledge. In: 24th international Florida artificial intelligence research society conference, pp 413–418

  31. Powell JH, Hastings JD (2006) An empirical evaluation of automated knowledge discovery in a complex domain. In: Workshop on heuristic search, memory based heuristics and their applications: 21st national conference on artificial intelligence

  32. Powell JH, Hauff BM, Hastings JD (2005) Evaluating the effectiveness of exploration and accumulated experience in automatic case elicitation. In: 6th international conference on case-based reasoning, pp 397–407

  33. Romdhane H, Lamontagne L (2008) Forgetting reinforced cases. In: 9th European conference on case-based reasoning, pp 474– 486

  34. Romdhane H, Lamontagne L (2008) Reinforcement of local pattern cases for playing Tetris. In: 21st international Florida artificial intelligence research society conference, pp 263– 268

  35. Ross S, Bagnell D (2010) Efficient reductions for imitation learning. In: 13th international conference on artificial intelligence and statistics, pp 661–668

  36. Ross S, Gordon GJ, Bagnell D (2011) A reduction of imitation learning and structured prediction to no-regret online learning. In: 14th international conference on artificial intelligence and statistics, pp 627–635

  37. Rubin J, Watson I (2010) Similarity-based retrieval and solution re-use policies in the game of Texas Hold’em. In: 18th international conference on case-based reasoning, pp 465– 479

  38. Rubin J, Watson I (2011) Implicit opponent modelling via dynamic case-base selection. In: Workshop on case-based reasoning for computer games at the 19th international conference on case-based reasoning, pp 63–71

  39. Rubin J, Watson I (2011) On combining decisions from multiple expert imitators for performance. In: 22nd international joint conference on artificial intelligence, pp 344–349

  40. Rubin J, Watson ID (2011) Successful performance via decision generalisation in No Limit Texas Hold’em. In: 19th international conference on case-based reasoning. Springer, pp 467–481

  41. Thurau C, Bauckhage C (2003) Combining self organizing maps and multilayer perceptrons to learn bot-behavior for a commercial game. In: 4th international conference on intelligent games and simulation, pp 119–126

  42. Yang C, Farley B, Orchard R (2008) Automated case creation and management for diagnostic CBR systems. Appl Intell 28(1):17– 28

    Article  Google Scholar 

  43. Zhang J, Cho K (2017) Query-efficient imitation learning for end-to-end simulated driving. In: 31st AAAI conference on artificial intelligence. AAAI Press, pp 2891–2897

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Michael W. Floyd.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Floyd, M.W., Esfandiari, B. Supplemental observation acquisition for learning by observation agents. Appl Intell 48, 4338–4354 (2018). https://doi.org/10.1007/s10489-018-1191-5

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10489-018-1191-5

Keywords

Navigation