
Applied Intelligence, Volume 48, Issue 11, pp 4338–4354

Supplemental observation acquisition for learning by observation agents

  • Michael W. Floyd
  • Babak Esfandiari
Article

Abstract

Learning by observation agents learn to perform a behaviour by watching an expert perform that behaviour. The agents' ability to learn correctly therefore depends on the quality and coverage of the observations. This article presents two novel approaches for observation acquisition, mixed-initiative observation acquisition and delayed observation acquisition, that allow learning agents to identify problems they have difficulty solving and to ask the expert for assistance in solving them. Both approaches are presented in the context of a case-based learning by observation agent and empirically compared to traditional passive observation in the domain of Tetris. Our results show that the mixed-initiative and delayed observation acquisition approaches not only yield observations that cannot be obtained passively, but also improve the agent's learning performance.
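As a rough illustration of the two acquisition strategies (a sketch, not the authors' implementation), the code below assumes a simple nearest-neighbour case base; the similarity function, the ask_expert callback and the threshold TAU are hypothetical placeholders. In the mixed-initiative variant the agent queries the expert as soon as retrieval confidence drops below the threshold, whereas the delayed variant records the difficult problems and queries the expert about them later, in a batch.

```python
# Minimal sketch of supplemental observation acquisition for a
# case-based learning by observation agent.  A case is a
# (problem, action) pair; all names below are illustrative assumptions.

TAU = 0.8  # assumed retrieval-confidence threshold for asking the expert

def similarity(p1, p2):
    # Hypothetical domain-specific similarity in [0, 1].
    return 1.0 if p1 == p2 else 0.0

def retrieve(case_base, problem):
    # 1-NN retrieval: return the stored case most similar to the problem.
    return max(case_base, key=lambda c: similarity(c[0], problem), default=None)

def act_mixed_initiative(case_base, problem, ask_expert):
    """Ask the expert immediately when retrieval confidence is low."""
    best = retrieve(case_base, problem)
    if best is None or similarity(best[0], problem) < TAU:
        action = ask_expert(problem)          # expert supplies the solution now
        case_base.append((problem, action))   # store the supplemental observation
        return action
    return best[1]

def act_delayed(case_base, problem, pending):
    """Record hard problems and ask the expert about them later."""
    best = retrieve(case_base, problem)
    if best is None or similarity(best[0], problem) < TAU:
        pending.append(problem)               # defer the expert query
    return best[1] if best else None

def flush_pending(case_base, pending, ask_expert):
    # Later (e.g. between games), the expert answers the deferred queries
    # and the resulting observations are added to the case base.
    for problem in pending:
        case_base.append((problem, ask_expert(problem)))
    pending.clear()
```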

Keywords

Learning by observation · Observation acquisition · Case-based reasoning · Tetris


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Systems and Computer Engineering, Carleton University, Ottawa, Canada
