ACE-RL-Checkers: decision-making adaptability through integration of automatic case elicitation, reinforcement learning, and sequential pattern mining

Neto, Henrique Castro; Julia, Rita Maria Silva

doi:10.1007/s10115-018-1175-0

Regular Paper
Published: 26 February 2018

Volume 57, pages 603–634, (2018)

Knowledge and Information Systems

Henrique Castro Neto¹ & Rita Maria Silva Julia¹


Abstract

In environments where decision-making must account not only for the environment but also for the minimizing actions of an opponent (as in games), an agent needs the ability to progressively trace the profile of its adversaries, so that this profile can guide the selection of appropriate actions. However, an agent whose decision-making relies only on such a profile would lack its “own identity,” which would leave it at the mercy of its opponent. This study therefore proposes an automatic Checkers player, called ACE-RL-Checkers, equipped with a dynamic decision-making module that adapts to the profile of the opponent over the course of the game. In this system, action selection is carried out by a combination of a multilayer perceptron neural network and a case library. The neural network represents the “identity” of the agent, i.e., it is an already trained, static decision-making module. The case library, in turn, represents the dynamic decision-making module of the agent and is generated by the Automatic Case Elicitation technique. This technique has a pseudo-random exploratory behavior, which allows the agent’s dynamic decision-making to be directed either by the opponent’s game profile or randomly. To avoid frequent pseudo-random decisions in the initial phases of the game, when the agent has very little information about its opponent, this work proposes a new module based on sequential pattern mining that generates a base of experience rules extracted from human experts’ game records. This module improves the agent’s move selection in the initial phases of the game. Experiments carried out in tournaments involving ACE-RL-Checkers and other agents related to this work confirm the superiority of the dynamic architecture proposed herein.
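
The composition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `select_move`, the data layouts (a dictionary of opening rules keyed by the moves played so far, a case library mapping states to scored moves), the `explore_prob` parameter, and the order in which the components are consulted are all assumptions made for illustration only.

```python
import random
from typing import Callable, Dict, Sequence, Tuple

Move = str   # hypothetical move encoding, e.g. "11-15"
State = str  # hypothetical board encoding


def select_move(
    state: State,
    legal_moves: Sequence[Move],
    history: Sequence[Move],
    opening_rules: Dict[Tuple[Move, ...], Move],
    case_library: Dict[State, Dict[Move, float]],
    static_eval: Callable[[State, Move], float],
    rng: random.Random,
    explore_prob: float = 0.1,
) -> Tuple[Move, str]:
    """Return (move, source) for one decision. Illustrative only."""
    # 1. Experience rules (assumed to be mined offline from expert game
    #    records): look up the sequence of moves played so far.
    rule_move = opening_rules.get(tuple(history))
    if rule_move in legal_moves:
        return rule_move, "rule"

    # 2. Pseudo-random exploration, standing in for the exploratory
    #    behavior attributed to Automatic Case Elicitation.
    if rng.random() < explore_prob:
        return rng.choice(list(legal_moves)), "explore"

    # 3. Case library: prefer the best-rated move previously recorded
    #    for this state, restricted to moves that are legal now.
    cases = {m: s for m, s in case_library.get(state, {}).items()
             if m in legal_moves}
    if cases:
        return max(cases, key=cases.get), "case"

    # 4. Static evaluator (a stand-in for the trained neural network).
    return max(legal_moves, key=lambda m: static_eval(state, m)), "static"


if __name__ == "__main__":
    rng = random.Random(0)
    rules = {(): "11-15"}                    # toy rule: first move of the game
    library = {"s1": {"a": 0.2, "b": 0.9}}   # toy case scores
    toy_eval = lambda state, move: float(len(move))  # toy evaluator
    print(select_move("s0", ["11-15", "9-14"], [], rules, library,
                      toy_eval, rng, explore_prob=0.0))
```

Consulting the rule base first reflects the abstract's stated goal of reducing pseudo-random decisions early in the game, when the case library holds little information about the opponent; how the actual system arbitrates between its modules is not specified in the abstract.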

Acknowledgements

The authors thank FAPEMIG (Brazil) for fellowships and financial support.

Author information

Authors and Affiliations

Computer Sciences Department, Federal University of Uberlandia, Campus Santa Monica, Av. Joao Naves de Avila, 2121, Block 1B, Room 1B143, Uberlandia, CEP 38400-902, Brazil
Henrique Castro Neto & Rita Maria Silva Julia

Corresponding author

Correspondence to Henrique Castro Neto.

About this article

Cite this article

Neto, H.C., Julia, R.M.S. ACE-RL-Checkers: decision-making adaptability through integration of automatic case elicitation, reinforcement learning, and sequential pattern mining. Knowl Inf Syst 57, 603–634 (2018). https://doi.org/10.1007/s10115-018-1175-0

Received: 22 November 2016
Revised: 27 June 2017
Accepted: 16 September 2017
Published: 26 February 2018
Issue Date: December 2018
DOI: https://doi.org/10.1007/s10115-018-1175-0
