Reducing the Learning Time of Tetris in Evolution Strategies

  • Amine Boumaza
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7401)

Abstract

Designing artificial players for the game of Tetris is a challenging problem that many authors have addressed using different methods. High-performing implementations based on evolution strategies have also been proposed. However, one drawback of using evolution strategies for this problem is the cost of evaluation due to the stochastic nature of the fitness function. This paper describes the use of racing algorithms to reduce the number of fitness-function evaluations, and thereby the learning time. Different experiments illustrate the benefits and the limitations of racing in evolution strategies for this problem. Among the benefits is the design of artificial players at the level of the top-ranked players at a third of the cost.
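The central idea, using a race to stop spending evaluations on clearly inferior candidates of a noisy fitness function, can be sketched independently of the paper. The following is a minimal Hoeffding race in the spirit of Maron and Moore [13] and Heidrich-Meisner and Igel [12], not the paper's exact procedure; the function name, the confidence parameter `delta`, the per-candidate budget `max_evals`, and the assumption that fitness values lie in `[0, bound]` are all illustrative choices, not taken from the source.

```python
import math
import random

def hoeffding_race(candidates, evaluate, bound, n_select,
                   delta=0.05, max_evals=200):
    """Race candidates on a noisy fitness taking values in [0, bound].

    Surviving candidates are sampled one evaluation per round.  After
    each round, any candidate whose Hoeffding upper confidence bound
    falls below the lower bound of the current n_select-th best is
    discarded.  The race stops when only n_select candidates survive
    or the per-candidate budget max_evals is spent.
    Returns (survivors sorted by empirical mean, total evaluations).
    """
    alive = list(candidates)
    sums = {c: 0.0 for c in alive}
    counts = {c: 0 for c in alive}
    total = 0
    for t in range(1, max_evals + 1):
        for c in alive:
            sums[c] += evaluate(c)
            counts[c] += 1
            total += 1
        # Hoeffding confidence radius after t samples of each survivor
        radius = bound * math.sqrt(math.log(2.0 / delta) / (2 * t))
        means = {c: sums[c] / counts[c] for c in alive}
        # lower confidence bound of the current n_select-th best
        lowers = sorted((means[c] - radius for c in alive), reverse=True)
        threshold = lowers[n_select - 1]
        # keep only candidates whose upper bound still reaches it
        alive = [c for c in alive if means[c] + radius >= threshold]
        if len(alive) <= n_select:
            break
    alive.sort(key=lambda c: sums[c] / counts[c], reverse=True)
    return alive[:n_select], total

# Illustrative use: five candidates with distinct true means and
# bounded noise; weak candidates are dropped long before the budget.
random.seed(0)
true_means = [0.9, 0.7, 0.5, 0.3, 0.1]
noisy = lambda i: true_means[i] + random.uniform(-0.05, 0.05)
winners, total = hoeffding_race(range(5), noisy, bound=1.0, n_select=1)
```

In an evolution-strategy setting, the candidates would be the offspring of one generation and `n_select` the number of parents to retain; the evaluations saved on discarded offspring are where the reduction in learning time comes from.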

Keywords

Evolution Strategy, Search Point, Learn Time, Game State, Game Board

References

  1. Audibert, J.-Y., Munos, R., Szepesvári, C.: Tuning Bandit Algorithms in Stochastic Environments. In: Hutter, M., Servedio, R.A., Takimoto, E. (eds.) ALT 2007. LNCS (LNAI), vol. 4754, pp. 150–165. Springer, Heidelberg (2007)
  2. Bertsekas, D., Tsitsiklis, J.: Neuro-Dynamic Programming. Athena Scientific (1996)
  3. de Boer, P., Kroese, D., Mannor, S., Rubinstein, R.: A tutorial on the cross-entropy method. Annals of Operations Research 1(134), 19–67 (2004)
  4. Böhm, N., Kókai, G., Mandl, S.: An Evolutionary Approach to Tetris. In: Proc. of the 6th Metaheuristics International Conference, CD-ROM. University of Vienna (2005)
  5. Boumaza, A.: On the evolution of artificial Tetris players. In: Proc. of the IEEE Symp. on Comp. Intel. and Games, CIG 2009, pp. 387–393. IEEE (June 2009)
  6. Burgiel, H.: How to lose at Tetris. Mathematical Gazette 81, 194–200 (1997)
  7. Demaine, E.D., Hohenberger, S., Liben-Nowell, D.: Tetris is Hard, Even to Approximate. In: Warnow, T.J., Zhu, B. (eds.) COCOON 2003. LNCS, vol. 2697, pp. 351–363. Springer, Heidelberg (2003)
  8. Fahey, C.P.: Tetris AI, Computer plays Tetris (2003), http://colinfahey.com/tetris/tetris_en.html
  9. Farias, V., van Roy, B.: Tetris: A study of randomized constraint sampling. Springer (2006)
  10. Hansen, N., Müller, S., Koumoutsakos, P.: Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES). Evolutionary Computation 11(1), 1–18 (2003)
  11. Hansen, N., Niederberger, S., Guzzella, L., Koumoutsakos, P.: A method for handling uncertainty in evolutionary optimization with an application to feedback control of combustion. IEEE Trans. Evol. Comp. 13(1), 180–197 (2009)
  12. Heidrich-Meisner, V., Igel, C.: Hoeffding and Bernstein races for selecting policies in evolutionary direct policy search. In: Proc. of the 26th ICML, pp. 401–408. ACM, New York (2009)
  13. Maron, O., Moore, A.W.: Hoeffding races: Accelerating model selection search for classification and function approximation. In: Proc. Advances in Neural Information Processing Systems, pp. 59–66. Morgan Kaufmann (1994)
  14. Ostermeier, A., Gawelczyk, A., Hansen, N.: A derandomized approach to self-adaptation of evolution strategies. Evolutionary Computation 2(4), 369–380 (1994)
  15. Schmidt, C., Branke, J., Chick, S.E.: Integrating Techniques from Statistical Ranking into Evolutionary Algorithms. In: Rothlauf, F., Branke, J., Cagnoni, S., Costa, E., Cotta, C., Drechsler, R., Lutton, E., Machado, P., Moore, J.H., Romero, J., Smith, G.D., Squillero, G., Takagi, H. (eds.) EvoWorkshops 2006. LNCS, vol. 3907, pp. 752–763. Springer, Heidelberg (2006)
  16. Siegel, E.V., Chaffee, A.D.: Genetically optimizing the speed of programs evolved to play Tetris. In: Angeline, P.J., Kinnear Jr., K.E. (eds.) Advances in Genetic Programming 2, pp. 279–298. MIT Press, Cambridge (1996)
  17. Stagge, P.: Averaging Efficiently in the Presence of Noise. In: Eiben, A.E., Bäck, T., Schoenauer, M., Schwefel, H.-P. (eds.) PPSN 1998. LNCS, vol. 1498, pp. 188–197. Springer, Heidelberg (1998)
  18. Szita, I., Lörincz, A.: Learning Tetris using the noisy cross-entropy method. Neural Comput. 18(12), 2936–2941 (2006)
  19. Thiery, C., Scherrer, B.: Building Controllers for Tetris. International Computer Games Association Journal 32, 3–11 (2009)
  20. Thiery, C., Scherrer, B.: Least-Squares λ Policy Iteration: Bias-Variance Trade-off in Control Problems. In: Proc. ICML, Haifa (2010)
  21. Tsitsiklis, J.N., van Roy, B.: Feature-based methods for large scale dynamic programming. Machine Learning 22, 59–94 (1996)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Amine Boumaza
  1. Univ. Lille Nord de France, Lille, France
  2. ULCO, LISIC, Calais, France
