Abstract
A substantial concern of researchers in machine learning is designing an artificial agent that behaves autonomously in a complex environment. In this paper, we consider a learning problem with multiple critics. Each critic has a different importance to the agent, and the agent's attention to the critics varies over its lifetime. Inspired by neurological studies, we propose a distributed learning approach for this problem that remains flexible under variable attention. In this approach, a distinct learner is assigned to each critic, and an algorithm is introduced that aggregates their knowledge by combining model-free and model-based learning methods. We show that this aggregation method can provide the optimal policy for this problem.
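The abstract's core idea (one learner per critic, with per-critic knowledge aggregated under time-varying attention weights) can be illustrated with a minimal tabular sketch. This is not the paper's algorithm: the number of critics, the linear attention-weighted aggregation of Q-values, and all names below are illustrative assumptions.

```python
import numpy as np

n_states, n_actions, n_critics = 4, 2, 3
alpha, gamma = 0.1, 0.9  # learning rate and discount factor (assumed values)

# One independent tabular Q-function per critic.
Q = np.zeros((n_critics, n_states, n_actions))

def update(state, action, rewards, next_state):
    """Model-free TD update, applied separately for each critic's reward."""
    for c in range(n_critics):
        td_target = rewards[c] + gamma * Q[c, next_state].max()
        Q[c, state, action] += alpha * (td_target - Q[c, state, action])

def act(state, attention):
    """Aggregate per-critic values with the current attention weights,
    then act greedily on the combined value (an illustrative choice)."""
    combined = np.tensordot(attention, Q[:, state, :], axes=1)
    return int(np.argmax(combined))
```

Because each critic keeps its own value table, a change in the attention vector only alters the aggregation step, not the learned per-critic knowledge, which is the flexibility the abstract points to.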
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Tavakol, M., Ahmadabadi, M.N., Mirian, M., Asadpour, M. (2012). A Distributed Q-Learning Approach for Variable Attention to Multiple Critics. In: Huang, T., Zeng, Z., Li, C., Leung, C.S. (eds) Neural Information Processing. ICONIP 2012. Lecture Notes in Computer Science, vol 7665. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34487-9_30
Print ISBN: 978-3-642-34486-2
Online ISBN: 978-3-642-34487-9