
Reinforcement Learning with N-tuples on the Game Connect-4

  • Conference paper
Parallel Problem Solving from Nature - PPSN XII (PPSN 2012)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7491)


Abstract

Learning evaluation functions for complex games remains a difficult task. We apply temporal difference learning (TDL), a well-known variant of the reinforcement learning approach, in combination with n-tuple networks to the game Connect-4. Our agent is trained solely by self-play. For the first time, such an agent consistently beats the optimally playing Minimax agent (in game situations where a win is possible). The n-tuple network induces a powerful feature space: instead of hand-designing specific features, the agent learns to select the relevant ones. We believe that the n-tuple network is an important ingredient of this success and identify several aspects that are relevant for achieving high-quality results. The architecture is sufficiently general to be applied to similar reinforcement learning tasks as well.
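The combination described above can be sketched in a few lines. The following is a minimal illustration of TD(0) learning with an n-tuple value function, not the authors' implementation: a toy 1-D "board" stands in for the real 6x7 Connect-4 grid, and the tuple layout, sizes, and learning rate are invented for the example.

```python
# Minimal sketch of TD(0) with an n-tuple value function (illustrative
# only: toy 1-D board, invented tuples and parameters).

N_STATES = 3                                 # per cell: 0 empty, 1 us, 2 them
TUPLES = [(0, 1, 2), (3, 4, 5), (1, 3, 5)]   # sampled cell-index tuples

# One lookup table (LUT) of weights per n-tuple; the cell contents covered
# by a tuple index directly into its LUT, so no features are hand-designed.
luts = [[0.0] * (N_STATES ** len(t)) for t in TUPLES]

def lut_index(board, tup):
    """Encode the cells covered by one n-tuple as a base-3 number."""
    idx = 0
    for cell in tup:
        idx = idx * N_STATES + board[cell]
    return idx

def value(board):
    """Value estimate: sum of the LUT entries each tuple selects."""
    return sum(lut[lut_index(board, t)] for lut, t in zip(luts, TUPLES))

def td_update(board, next_board, reward, alpha=0.1):
    """TD(0) step: move V(board) toward reward + V(next_board).
    The network is linear, so each active weight's gradient is 1."""
    delta = reward + value(next_board) - value(board)
    for lut, t in zip(luts, TUPLES):
        lut[lut_index(board, t)] += alpha * delta
```

In self-play training, `board` and `next_board` would be successive positions of a game played by the agent against itself, with `reward` nonzero only at the end of the game; repeated updates drive `value(board)` toward `reward + value(next_board)`.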





Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Thill, M., Koch, P., Konen, W. (2012). Reinforcement Learning with N-tuples on the Game Connect-4. In: Coello, C.A.C., Cutello, V., Deb, K., Forrest, S., Nicosia, G., Pavone, M. (eds) Parallel Problem Solving from Nature - PPSN XII. PPSN 2012. Lecture Notes in Computer Science, vol 7491. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32937-1_19


  • DOI: https://doi.org/10.1007/978-3-642-32937-1_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32936-4

  • Online ISBN: 978-3-642-32937-1

