Abstract
This paper presents a comparative analysis of three reinforcement learning algorithms (Q-learning, Q(\(\lambda \))-learning, and QS-learning) and their heuristically-accelerated variants (HAQL, HAQ(\(\lambda \)), and HAQS), in which heuristics bias action selection and thereby speed up learning. The experiments were performed in a simulated robot soccer environment that reproduces the conditions of a real competition league. The results clearly demonstrate that the use of heuristics substantially improves the performance of the learning algorithms.
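The heuristic acceleration mentioned above works by adding a heuristic bonus to the value estimate during action selection, leaving the learned Q-values themselves untouched. A minimal sketch of this idea, assuming a tabular Q-table and heuristic table H keyed by (state, action) pairs (the function name, weight `xi`, and \(\epsilon\)-greedy exploration scheme are illustrative, not the paper's exact formulation):

```python
import random

def heuristic_action(Q, H, state, actions, xi=1.0, epsilon=0.1, rng=random):
    """Epsilon-greedy action selection biased by a heuristic, HAQL-style.

    With probability epsilon a random action is explored; otherwise the
    action maximizing Q(s, a) + xi * H(s, a) is chosen. The heuristic H
    only steers exploration -- the Q-values are updated as usual, so the
    convergence properties of the underlying algorithm are preserved.
    """
    if rng.random() < epsilon:
        return rng.choice(actions)  # explore uniformly at random
    # exploit: greedy over the heuristically-biased value
    return max(actions, key=lambda a: Q[(state, a)] + xi * H[(state, a)])
```

With `xi = 0` this reduces to ordinary \(\epsilon\)-greedy Q-learning action selection; a well-chosen H simply makes the promising actions win ties early in learning, before the Q-values are informative.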
The author acknowledges the current support of FAPESP, project no. 2012/12640-1.
© 2014 Springer-Verlag Berlin Heidelberg
Cite this paper
Martins, M.F., Bianchi, R.A.C. (2014). Heuristically-Accelerated Reinforcement Learning: A Comparative Analysis of Performance. In: Natraj, A., Cameron, S., Melhuish, C., Witkowski, M. (eds) Towards Autonomous Robotic Systems. TAROS 2013. Lecture Notes in Computer Science(), vol 8069. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43645-5_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-43644-8
Online ISBN: 978-3-662-43645-5