On Diversity, Teaming, and Hierarchical Policies: Observations from the Keepaway Soccer Task

  • Stephen Kelly
  • Malcolm I. Heywood
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8599)

Abstract

The 3-versus-2 Keepaway soccer task is a widely used benchmark for evaluating approaches to reinforcement learning, multi-agent systems, and evolutionary robotics. To date, most research on this task has been described in terms of developments to reinforcement learning with function approximation or frameworks for neuro-evolution. This work performs an initial study using a recently proposed algorithm for evolving teams of programs hierarchically in two phases of evolution: one to build a library of candidate meta policies and a second to learn how to deploy the library consistently. Particular attention is paid to diversity maintenance, which has been demonstrated to be a critical component in neuro-evolutionary approaches. A new formulation of fitness sharing appropriate to the Keepaway task is proposed. The resulting policies are observed to benefit from the use of diversity and perform significantly better than previously reported. Moreover, champion individuals evolved and selected under one field size generalize to multiple field sizes without any additional training.

Keywords

Policy search; Keepaway soccer; Symbiosis; Fitness sharing; Diversity maintenance

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Stephen Kelly (1)
  • Malcolm I. Heywood (1)
  1. Dalhousie University, Halifax, Canada
