Closing the Sensory-Motor Loop on Dopamine Signalled Reinforcement Learning

Chorley, Paul; Seth, Anil K.

doi:10.1007/978-3-540-69134-1_28

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5040)

Included in the following conference series:

International Conference on Simulation of Adaptive Behavior

Abstract

It has been shown recently that dopamine-signalled modulation of spike-timing-dependent synaptic plasticity (DA-STDP) can enable reinforcement learning of delayed stimulus-reward associations when both stimulus and reward are delivered at precisely timed intervals. Here, we test whether a similar model can support learning in an embodied context, in which the timing of both sensory input and reward delivery depends on the agent’s behaviour. We show that effective reinforcement learning is indeed possible, but only when stimuli are gated so as to occur as near-synchronous patterns of neural activity and when neuroanatomical constraints are imposed which predispose agents to exploratory behaviours. Extinction of learned responses in this model is subsequently shown to result from agent-environment interactions and not directly from any specific neural mechanism.
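
The abstract gives no equations, so the following is only a minimal sketch of one common formulation of dopamine-modulated STDP, not the authors' implementation. It assumes that a pre/post spike pairing sets a slowly decaying eligibility trace on the synapse, and that the weight changes only while that trace overlaps with a decaying dopamine signal triggered by a later reward. All function names, time constants, and amplitudes (stdp_window, simulate_synapse, tau_c_ms, tau_d_ms, reward) are illustrative assumptions.

```python
import math


def stdp_window(dt_ms, a_plus=1.0, a_minus=1.5, tau_ms=20.0):
    """Exponential STDP kernel; dt_ms = t_post - t_pre (assumed constants)."""
    if dt_ms >= 0:
        return a_plus * math.exp(-dt_ms / tau_ms)   # pre before post: potentiate
    return -a_minus * math.exp(dt_ms / tau_ms)      # post before pre: depress


def simulate_synapse(dt_pair_ms, reward_delay_ms=None, duration_ms=5000,
                     tau_c_ms=1000.0, tau_d_ms=200.0, reward=0.5, w0=1.0):
    """Euler-integrate one synapse in 1 ms steps and return its final weight.

    A single pre/post pairing occurs at t = 0. If reward_delay_ms is given,
    a dopamine pulse arrives that many milliseconds later.
    """
    w, c, d = w0, 0.0, 0.0          # weight, eligibility trace, dopamine level
    for t in range(duration_ms):
        if t == 0:
            c += stdp_window(dt_pair_ms)   # the pairing tags the synapse
        if reward_delay_ms is not None and t == reward_delay_ms:
            d += reward                    # delayed reward raises dopamine
        w += c * d * 1e-3                  # change requires trace AND dopamine
        c -= c / tau_c_ms                  # eligibility trace decays (1 ms step)
        d -= d / tau_d_ms                  # dopamine decays (1 ms step)
    return w


if __name__ == "__main__":
    print("rewarded after 1 s:", simulate_synapse(10, reward_delay_ms=1000))
    print("never rewarded:    ", simulate_synapse(10))
```

Under these assumptions, a pairing followed by reward strengthens the synapse, an unrewarded pairing leaves it unchanged, and the effect shrinks as the reward is delayed. That sensitivity to timing is why reward timing becomes an issue once stimulus and reward depend on the agent's own behaviour.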

Author information

Authors and Affiliations

Dept of Informatics, University of Sussex, Brighton, BN1 9QJ, UK
Paul Chorley & Anil K. Seth

Editor information

Minoru Asada, John C. T. Hallam, Jean-Arcady Meyer, Jun Tani

Cite this paper

Chorley, P., Seth, A.K. (2008). Closing the Sensory-Motor Loop on Dopamine Signalled Reinforcement Learning. In: Asada, M., Hallam, J.C.T., Meyer, J.-A., Tani, J. (eds) From Animals to Animats 10. SAB 2008. Lecture Notes in Computer Science, vol 5040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69134-1_28

DOI: https://doi.org/10.1007/978-3-540-69134-1_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69133-4
Online ISBN: 978-3-540-69134-1
eBook Packages: Computer Science, Computer Science (R0)
