A Distributed Q-Learning Approach for Variable Attention to Multiple Critics

  • Conference paper
Neural Information Processing (ICONIP 2012)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 7665)

Abstract

A central concern in machine learning research is the design of an artificial agent that behaves autonomously in a complex environment. In this paper, we consider a learning problem with multiple critics. Each critic has a different importance to the agent, and the agent's attention to the critics varies over its lifetime. Inspired by neurological studies, we propose a distributed learning approach for this problem that is flexible with respect to this variable attention. In this approach, a distinct learner is assigned to each critic, and an algorithm is introduced for aggregating their knowledge based on a combination of model-free and model-based learning methods. We show that this aggregation method can provide the optimal policy for this problem.
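The core idea of the abstract, one Q-learner per critic whose values are combined under time-varying attention weights, can be sketched as follows. This is an illustrative sketch only, not the authors' exact algorithm (the paper's aggregation additionally involves a model-based component); the class name, tabular representation, and epsilon-greedy selection are assumptions made for the example.

```python
import numpy as np

class MultiCriticQLearner:
    """Sketch: a distinct tabular Q-learner per critic; action selection
    uses an attention-weighted sum of the per-critic Q-values."""

    def __init__(self, n_states, n_actions, n_critics, alpha=0.1, gamma=0.9):
        # One Q-table per critic, all learned in parallel.
        self.Q = np.zeros((n_critics, n_states, n_actions))
        self.alpha, self.gamma = alpha, gamma

    def select_action(self, state, attention, epsilon=0.1):
        # attention: a weight vector over critics, which may change
        # during the agent's life without retraining the learners.
        if np.random.rand() < epsilon:
            return int(np.random.randint(self.Q.shape[2]))
        combined = np.tensordot(attention, self.Q[:, state, :], axes=1)
        return int(np.argmax(combined))

    def update(self, state, action, rewards, next_state):
        # rewards: one reward signal per critic; each learner applies
        # the standard model-free Q-learning update independently.
        for i, r in enumerate(rewards):
            best_next = self.Q[i, next_state].max()
            td_error = r + self.gamma * best_next - self.Q[i, state, action]
            self.Q[i, state, action] += self.alpha * td_error
```

Because each learner is tied to a single critic, shifting the attention vector re-weights the aggregate policy immediately, which is the flexibility the abstract highlights.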





Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tavakol, M., Ahmadabadi, M.N., Mirian, M., Asadpour, M. (2012). A Distributed Q-Learning Approach for Variable Attention to Multiple Critics. In: Huang, T., Zeng, Z., Li, C., Leung, C.S. (eds) Neural Information Processing. ICONIP 2012. Lecture Notes in Computer Science, vol 7665. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34487-9_30

  • DOI: https://doi.org/10.1007/978-3-642-34487-9_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34486-2

  • Online ISBN: 978-3-642-34487-9

  • eBook Packages: Computer Science, Computer Science (R0)
