The Global Landscape of Objective Functions for the Optimization of Shogi Piece Values with a Game-Tree Search

Hoki, Kunihito; Kaneko, Tomoyuki

doi:10.1007/978-3-642-31866-5_16


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7168)

Included in the following conference series:

Advances in Computer Games


Abstract

The landscape of an objective function for supervised learning of evaluation functions is investigated numerically for a limited number of feature variables. Despite the importance of such learning methods, the properties of the objective function are still not well understood because it depends in a complicated way on millions of tree-search values. This paper shows that the objective function has multiple local minima and that the global minimum point corresponds to reasonable feature values. Moreover, the function is continuous to within a practically computable numerical accuracy. However, it has points on the critical boundaries at which it is not partially differentiable. An existing iterative method is shown to minimize the function from random initial values with great stability, but it may end up at an unreasonable local minimum point if the initial random values are far from the desired values. Furthermore, the obtained minimum points are shown to form a funnel structure.
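
To illustrate the continuity and non-differentiability claims, here is a minimal sketch. It is not the paper's experiment. It assumes a sigmoid-of-differences comparison objective (a common form in comparison training, not confirmed by this abstract), a linear evaluation in two hypothetical feature weights, and a hand-built two-ply tree. All names (`evaluate`, `minimax`, `objective`, `positions`) are invented for the example. The point is that a minimax value is piecewise linear in the weights, so an objective built from it can be continuous yet have a kink wherever the selected leaf changes.

```python
import math

# Hypothetical toy setup: a leaf is a tuple of feature values seen from the
# root player's side; an internal node is a list of child nodes.

def evaluate(leaf, w):
    """Linear evaluation: dot product of leaf features and weights."""
    return sum(f * wi for f, wi in zip(leaf, w))

def minimax(node, w, maximizing):
    """Plain minimax over the toy tree, scored for the root player."""
    if isinstance(node, tuple):
        return evaluate(node, w)
    values = [minimax(child, w, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def objective(positions, w):
    """Assumed comparison objective: penalize every non-expert move whose
    search value exceeds the search value of the expert's move."""
    total = 0.0
    for expert_child, other_children in positions:
        # After the root player's move, the opponent (minimizer) is to move.
        v_expert = minimax(expert_child, w, maximizing=False)
        for child in other_children:
            total += sigmoid(minimax(child, w, maximizing=False) - v_expert)
    return total

# One training position: the expert move leads to a node where the opponent
# picks min(w0, w1); the alternative move leads to a single leaf.
positions = [([(1.0, 0.0), (0.0, 1.0)], [[(0.5, 0.5)]])]

# One-sided finite differences in w0 at w = (1, 1), where the opponent's
# preferred leaf switches.
h = 1e-6
j0 = objective(positions, (1.0, 1.0))
right = (objective(positions, (1.0 + h, 1.0)) - j0) / h
left = (j0 - objective(positions, (1.0 - h, 1.0))) / h
print(j0, left, right)  # left slope is negative, right slope is positive
```

In this toy the objective reduces to a sigmoid of 0.5·|w0 − 1| along the line w1 = 1, so the values on either side of w0 = 1 agree while the one-sided slopes differ in sign. That is the kind of boundary the abstract calls critical.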



Author information

Authors and Affiliations

Department of Communication Engineering and Informatics, The University of Electro-Communications, Tokyo, 182-8585, Japan
Kunihito Hoki
Department of Graphics and Computer Sciences, The University of Tokyo, Tokyo, Japan
Tomoyuki Kaneko


Editor information

Editors and Affiliations

Tilburg Institute of Cognition and Communication, Tilburg University, Warandelaan 2, 5037 AB, Tilburg, The Netherlands
H. Jaap van den Herik & Aske Plaat

About this paper

Cite this paper

Hoki, K., Kaneko, T. (2012). The Global Landscape of Objective Functions for the Optimization of Shogi Piece Values with a Game-Tree Search. In: van den Herik, H.J., Plaat, A. (eds) Advances in Computer Games. ACG 2011. Lecture Notes in Computer Science, vol 7168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31866-5_16


DOI: https://doi.org/10.1007/978-3-642-31866-5_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31865-8
Online ISBN: 978-3-642-31866-5
eBook Packages: Computer Science, Computer Science (R0)
