Abstract
This paper presents an experimental comparison of supervised and reinforcement learning algorithms for the game of Othello. Motivated by these results, a new learning algorithm, Mouse(μ) (MOnte-Carlo learning Using heuriStic Error reduction), has been developed. Mouse uses a heuristic model of past experience to improve generalization and to reduce noisy estimations. The algorithm was able to tune the parameter vector of a large linear evaluation function with about 1.5 million parameters and finished fourth in a recent GGS Othello tournament, a significant result for a self-teaching algorithm. Besides the theoretical aspects of the learning methods used, experimental results and comparisons are presented and discussed. These results demonstrate the advantages and drawbacks of existing learning approaches in strategy games, as well as the potential of the new algorithm.
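To make the abstract concrete, the following is a minimal illustrative sketch (not the paper's actual implementation) of the general setting it describes: a linear evaluation function over sparse binary features, whose weights are tuned toward observed Monte-Carlo game outcomes. The class name, the per-weight visit counters, and the shrinking step size standing in for "heuristic error reduction" are all assumptions made for illustration only.

```python
class LinearEvaluator:
    """Illustrative linear position evaluator with Monte-Carlo updates.

    Assumption: a position is represented by the indices of its active
    binary features; the evaluation is the sum of the corresponding weights.
    """

    def __init__(self, num_features):
        self.weights = [0.0] * num_features
        # Per-weight visit counts; used to shrink the step size on
        # frequently seen features and so damp noisy outcome estimates.
        self.visits = [0] * num_features

    def value(self, active_features):
        # Linear evaluation: sum of the weights of the active features.
        return sum(self.weights[i] for i in active_features)

    def mc_update(self, active_features, outcome):
        # Move each active weight toward the final game outcome.
        error = outcome - self.value(active_features)
        for i in active_features:
            self.visits[i] += 1
            alpha = 1.0 / self.visits[i]  # decaying learning rate
            self.weights[i] += alpha * error / len(active_features)
```

In this sketch the decaying per-feature learning rate plays the role of averaging past outcomes, which is one simple way to reduce the variance of Monte-Carlo targets; the paper's actual heuristic error-reduction scheme is more elaborate.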
© 2003 Springer-Verlag Berlin Heidelberg
Tournavitis, K. (2003). MOUSE(μ): A Self-teaching Algorithm that Achieved Master-Strength at Othello. In: Schaeffer, J., Müller, M., Björnsson, Y. (eds) Computers and Games. CG 2002. Lecture Notes in Computer Science, vol 2883. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-40031-8_2
Print ISBN: 978-3-540-20545-6
Online ISBN: 978-3-540-40031-8