Abstract
In this paper we demonstrate how genetic algorithms can be used to reverse engineer an evaluation function’s parameters for computer chess. Our results show that using an appropriate expert (or mentor), we can evolve a program that is on par with top tournament-playing chess programs, outperforming a two-time World Computer Chess Champion. This performance gain is achieved by evolving a program that mimics the behavior of a superior expert. The resulting evaluation function of the evolved program consists of a much smaller number of parameters than the expert’s. The extended experimental results provided in this paper include a report on our successful participation in the 2008 World Computer Chess Championship. In principle, our expert-driven approach could be used in a wide range of problems for which appropriate experts are available.
Notes
An evaluation unit in chess programs is commonly called a centipawn, i.e., 1/100th of the value of a pawn. Traditionally, a pawn is assigned a value of 100, and all other parameters are assigned relative values. However, the value of a pawn itself need not be exactly 100, so a unit of evaluation may no longer be exactly 1/100th of a pawn. Despite this inconsistency, the term centipawn is still used to denote the smallest evaluation unit.
Note that Evol* and RandOrg (including the sets of parameters of their evaluation function) are essentially the same, except for the actual values assigned to these parameters.
Our genetically evolved program participated under the name Falcon, which is the original name we had used in previous championships. Even though a name reflecting evolution (such as FalconGA) might have been more appropriate, it is customary that the participants use the same program name every year, even when using a substantially different version.
References
S.G. Akl, M.M. Newborn, The principal continuation and the killer heuristic. in Proceedings of the 5th Annual ACM Computer Science Conference (ACM Press, Seattle, WA, 1977), pp. 466–473
P. Aksenov, Genetic Algorithms for Optimising Chess Position Scoring. Master’s Thesis, University of Joensuu, Finland (2004)
T.S. Anantharaman, Extension heuristics. ICCA J. 14(2), 47–65 (1991)
J. Baxter, A. Tridgell, L. Weaver, Learning to play chess using temporal-differences. Mach. Learn. 40(3), 243–263 (2000)
D.F. Beal, Experiments with the null move. in Advances in Computer Chess 5, ed. by D.F. Beal (Elsevier Science, Amsterdam, 1989), pp. 65–79
D.F. Beal, M.C. Smith, Quantification of search extension benefits. ICCA J. 18(4), 205–218 (1995)
Y. Björnsson, T.A. Marsland, Multi-cut pruning in alpha-beta search. in Proceedings of the First International Conference on Computers and Games, Tsukuba, Japan (1998), pp. 15–24
Y. Björnsson, T.A. Marsland, Multi-cut alpha-beta-pruning in game-tree search. Theor. Comput. Sci. 252(1–2), 177–196 (2001)
M. Block, M. Bader, E. Tapia, M. Ramirez, K. Gunnarsson, E. Cuevas, D. Zaldivar, R. Rojas, Using reinforcement learning in chess engines. Res. Comput. Sci. 35, 31–40 (2008)
M.S. Campbell, T.A. Marsland, A comparison of minimax tree search algorithms. Artif. Intell. 20(4), 347–367 (1983)
S. Chinchalkar, An upper bound for the number of reachable positions. ICCA J. 19(3), 181–183 (1996)
O. David-Tabibi, A. Felner, N.S. Netanyahu, Blockage detection in pawn endings. in Proceedings of the 2004 International Conference on Computers and Games, eds. by H.J. van den Herik, Y. Björnsson, N.S. Netanyahu (Springer (LNCS 3846), Ramat-Gan, Israel, 2006), pp. 187–201
O. David-Tabibi, M. Koppel, N.S. Netanyahu, Genetic algorithms for mentor-assisted evaluation function optimization. in Proceedings of the Genetic and Evolutionary Computation Conference (Atlanta, GA, 2008), pp. 1469–1476
O. David-Tabibi, N.S. Netanyahu, Extended null-move reductions. in Proceedings of the 2008 International Conference on Computers and Games, eds. by H.J. van den Herik, X. Xu, Z. Ma, M.H.M. Winands (Springer (LNCS 5131), Beijing, China, 2008), pp. 205–216
C. Donninger, Null move and deep search: Selective search heuristics for obtuse chess programs. ICCA J. 16(3), 137–143 (1993)
J.J. Gillogly, The technology chess program. Artif. Intell. 3(1–3), 145–163 (1972)
R. Gross, K. Albrecht, W. Kantschik, W. Banzhaf, Evolving chess playing programs. in Proceedings of the Genetic and Evolutionary Computation Conference (New York, NY, 2002), pp. 740–747
A. Hauptman, M. Sipper, Using genetic programming to evolve chess endgame players. in Proceedings of the 2005 European Conference on Genetic Programming (Springer, Lausanne, Switzerland, 2005), pp. 120–131
A. Hauptman, M. Sipper, Evolution of an efficient search algorithm for the Mate-in-N problem in chess. in Proceedings of the 2007 European Conference on Genetic Programming (Springer, Valencia, Spain, 2007), pp. 78–89
E.A. Heinz, Extended futility pruning. ICCA J. 21(2), 75–83 (1998)
R.M. Hyatt, A.E. Gower, H.L. Nelson, Cray Blitz. in Computers, chess, and cognition, eds. by T.A. Marsland, J. Schaeffer (Springer, New York, 1990), pp. 227–237
G. Kendall, G. Whitwell, An evolutionary approach for the tuning of a chess evaluation function using population dynamics. in Proceedings of the 2001 Congress on Evolutionary Computation (IEEE Press, World Trade Center, Seoul, Korea, 2001), pp. 995–1002
J. McCarthy, Chess as the Drosophila of AI. in Computers, chess, and cognition, eds. by T.A. Marsland, J. Schaeffer (Springer, New York, 1990), pp. 227–237
H.L. Nelson, Hash tables in Cray Blitz. ICCA J. 8(1), 3–13 (1985)
A. Reinefeld, An improvement to the Scout tree-search algorithm. ICCA J. 6(4), 4–14 (1983)
J. Schaeffer, The history heuristic. ICCA J. 6(3), 16–19 (1983)
J. Schaeffer, The history heuristic and alpha-beta search enhancements in practice. IEEE Trans. Pattern. Anal. Mach. Intell. 11(11), 1203–1212 (1989)
J. Schaeffer, M. Hlynka, V. Jussila, Temporal difference learning applied to a high-performance game-playing program. in Proceedings of the 2001 International Joint Conference on Artificial Intelligence (Seattle, WA, 2001), pp. 529–534
J.J. Scott, A chess-playing program. in Machine intelligence 4, eds. by B. Meltzer, D. Michie (Edinburgh University Press, Edinburgh, 1969), pp. 255–265
D.J. Slate, L.R. Atkin, Chess 4.5—The Northwestern University chess program. Chess skill in man and machine, ed. by P.W. Frey (Springer, New York, 2nd ed, 1983), pp. 82–118
R.S. Sutton, A.G. Barto, Reinforcement learning: an introduction (MIT Press, Cambridge, MA, 1998)
G. Tesauro, Practical issues in temporal difference learning. Mach. Learn. 8(3–4), 257–277 (1992)
W. Tunstall-Pedoe, Genetic algorithms optimising evaluation functions. ICCA J. 14(3), 119–128 (1991)
M.A. Wiering, TD Learning of Game Evaluation Functions with Hierarchical Neural Architectures. Master’s Thesis, University of Amsterdam (1995)
Additional information
A preliminary version of this paper appeared in Proceedings of the 2008 Genetic and Evolutionary Computation Conference [13] and received the Best Paper Award in the conference’s Real-World Applications track.
Appendix
A. Experimental setup
Our experimental setup consisted of the following resources:
- The Falcon chess engine running under the UCI protocol, and Crafty 19, Junior 9, Fritz 8, and Hiarcs 8 running as native ChessBase engines.
- The Encyclopedia of Chess Middlegames (ECM) test suite, consisting of 879 positions.
- The Fritz 8 interface for running matches automatically. The Fritz opening book was used for all games.
- An AMD Athlon 64 3200+ with 1 GB RAM, running the Windows XP operating system.
B. Elo rating system
The Elo rating system, developed by Arpad Elo, is the official system for calculating the relative skill levels of players in chess. The following statistics from the January 2009 FIDE rating list provide a general impression of the meaning of the Elo rating system:
- 21079 players have a rating above 2200 Elo.
- 2886 players have a rating between 2400 and 2499, most of whom have either the title of International Master (IM) or Grandmaster (GM).
- 876 players have a rating between 2500 and 2599, most of whom have the title of GM.
- 188 players have a rating between 2600 and 2699, all of whom have the title of GM.
- 32 players have a rating above 2700.
Only four players have ever had a rating of 2800 or above. A novice player is generally associated with rating values below 1400 Elo. Given the rating difference (RD) between player A and player B, the expected winning rate w (0 ≤ w ≤ 1) of player A is given by
\[ w = \frac{1}{10^{-RD/400} + 1}. \tag{1} \]
Given the winning rate of player A against player B (as is the case in our experiments), the expected rating difference between the two players can be derived from the above formula, i.e.,
\[ RD = -400 \log_{10}\left(\frac{1}{w} - 1\right). \tag{2} \]
In addition, given the results of a series of N matches between two players, we can derive confidence intervals for their rating difference. Without loss of generality, let W, D, and L denote, respectively, the number of wins, draws, and losses of the first player. The mean score and its standard deviation are given, respectively, by
\[ \overline{x} = \frac{W + D/2}{N} \]
and
\[ s = \sqrt{\frac{W(1-\overline{x})^2 + D\left(\tfrac{1}{2}-\overline{x}\right)^2 + L(0-\overline{x})^2}{N(N-1)}}. \]
Note that \(\overline{x}\) is essentially an estimate of the expected winning rate. Now, suppose that we are interested in computing, for example, the 95% confidence interval (which corresponds to ± two standard deviations) of the rating difference. For this we compute the lower and upper ends of the winning rate, i.e., \(w_{lo} = \overline{x} - 2s\) and \(w_{hi} = \overline{x} + 2s\). Substituting \(w_{lo}\) and \(w_{hi}\) in Eq. 2, we obtain the corresponding lower and upper ends of the 95% confidence interval of the rating difference. For any other confidence level, the corresponding RD confidence interval can be computed in the same way.
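The procedure described in this appendix can be sketched in a few lines of Python. This is only an illustration, not the code used in the experiments. It uses the standard Elo relation between rating difference and expected score, and it assumes that s is the standard deviation of the mean score (the sample variance of the per-game scores divided by N). The function names and the clamping constant `eps` are hypothetical and were introduced for this sketch.

```python
import math


def expected_score(rd: float) -> float:
    """Expected winning rate of player A for a rating difference rd (Eq. 1)."""
    return 1.0 / (10.0 ** (-rd / 400.0) + 1.0)


def rating_difference(w: float) -> float:
    """Rating difference implied by a winning rate w, 0 < w < 1 (Eq. 2)."""
    if not 0.0 < w < 1.0:
        raise ValueError("w must lie strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / w - 1.0)


def rd_confidence_interval(wins: int, draws: int, losses: int,
                           k: float = 2.0, eps: float = 1e-9):
    """Return (RD estimate, lower end, upper end) for a match result.

    k is the number of standard deviations (k = 2 for roughly 95%).
    ASSUMPTION: s is the standard deviation of the mean score, i.e. the
    sum of squared deviations of the per-game scores divided by N(N - 1).
    """
    n = wins + draws + losses
    if n < 2:
        raise ValueError("at least two games are required")
    mean = (wins + 0.5 * draws) / n
    if not 0.0 < mean < 1.0:
        raise ValueError("RD is undefined for a 0% or 100% score")
    squared_dev = (wins * (1.0 - mean) ** 2
                   + draws * (0.5 - mean) ** 2
                   + losses * (0.0 - mean) ** 2)
    s = math.sqrt(squared_dev / (n * (n - 1)))
    # Clamp the interval ends into (0, 1) so that Eq. 2 stays defined.
    w_lo = max(mean - k * s, eps)
    w_hi = min(mean + k * s, 1.0 - eps)
    return rating_difference(mean), rating_difference(w_lo), rating_difference(w_hi)


if __name__ == "__main__":
    # Made-up match result, for illustration only.
    rd, lo, hi = rd_confidence_interval(wins=150, draws=100, losses=50)
    print(f"RD = {rd:.1f}, interval = [{lo:.1f}, {hi:.1f}]")
```

The clamping step is a design choice of the sketch: for very lopsided or very short matches, \(\overline{x} \pm 2s\) can fall outside the open interval (0, 1), where the rating difference is not defined.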
Cite this article
David-Tabibi, O., Koppel, M. & Netanyahu, N.S. Expert-driven genetic algorithms for simulating evaluation functions. Genet Program Evolvable Mach 12, 5–22 (2011). https://doi.org/10.1007/s10710-010-9103-4