Reward responses of dopamine neurons: A biological reinforcement signal

Schultz, Wolfram

doi:10.1007/BFb0020125

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1327)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

A class of reinforcement models termed Temporal Difference (TD) models has been developed on theoretical grounds as effective algorithms for various learning situations. Based on the observation that learning depends on the unpredictability of primary motivating events, these models use errors in the prediction of reinforcing events as teaching signals. Independently of the theoretical work, neurophysiological experiments have revealed that neurons in the mammalian midbrain that use the neurotransmitter dopamine process information about rewards and reward-predicting stimuli in a manner very similar to the teaching signal of TD models.
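
The idea of a prediction error serving as a teaching signal can be illustrated with a minimal TD(0) sketch of a single cue-reward pairing. This is not the chapter's model: the stimulus representation (one weight per time step after cue onset), the trial layout, and the names `run_trial`, `CS_STEP`, `REWARD_STEP`, `alpha` and `gamma` are all assumptions made for illustration. The error at step `t` is computed as `delta = r(t) + gamma * V(t) - V(t - 1)`, where `V(t)` is the predicted reward still to come after step `t`.

```python
def run_trial(w, cs_step, reward_step, n_steps,
              rewarded=True, alpha=0.2, gamma=1.0, learn=True):
    """Run one trial and return the TD error at every time step.

    w holds one weight per time step from cue onset onward (an assumed
    representation). Before the cue the prediction is fixed at zero,
    because the cue itself arrives unpredictably.
    """
    def value(t):
        # Predicted future reward at step t; zero outside the cue period.
        return w[t - cs_step] if cs_step <= t < n_steps else 0.0

    deltas = [0.0] * n_steps
    for t in range(1, n_steps):
        r = 1.0 if (rewarded and t == reward_step) else 0.0
        # Prediction error: reward received plus new prediction,
        # minus the prediction made one step earlier.
        delta = r + gamma * value(t) - value(t - 1)
        deltas[t] = delta
        # The error acts as the teaching signal for the earlier prediction.
        if learn and t - 1 >= cs_step:
            w[t - 1 - cs_step] += alpha * delta
    return deltas


N_STEPS, CS_STEP, REWARD_STEP = 12, 3, 8   # hypothetical trial layout
weights = [0.0] * (N_STEPS - CS_STEP)

# Error profile before any learning: the reward is fully unpredicted.
naive = run_trial(weights, CS_STEP, REWARD_STEP, N_STEPS, learn=False)

for _ in range(500):                        # repeated cue-reward pairings
    run_trial(weights, CS_STEP, REWARD_STEP, N_STEPS)

# Error profiles after learning, with the reward delivered and withheld.
trained = run_trial(weights, CS_STEP, REWARD_STEP, N_STEPS, learn=False)
omitted = run_trial(weights, CS_STEP, REWARD_STEP, N_STEPS,
                    rewarded=False, learn=False)

for name, d in (("naive", naive), ("trained", trained), ("omitted", omitted)):
    print(name, "cue:", round(d[CS_STEP], 3), "reward:", round(d[REWARD_STEP], 3))
```

Under these assumptions the error initially occurs at the time of the reward; after repeated pairings it occurs at the cue instead, and it turns negative at the expected reward time when the reward is withheld. The sketch is only meant to make the phrase "errors in the prediction of reinforcing events" concrete.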

Author information

Authors and Affiliations

Institute of Physiology, University of Fribourg, CH-1700, Fribourg, Switzerland
Wolfram Schultz

Editor information

Wulfram Gerstner, Alain Germond, Martin Hasler, Jean-Daniel Nicoud

About this paper

Cite this paper

Schultz, W. (1997). Reward responses of dopamine neurons: A biological reinforcement signal. In: Gerstner, W., Germond, A., Hasler, M., Nicoud, JD. (eds) Artificial Neural Networks — ICANN'97. ICANN 1997. Lecture Notes in Computer Science, vol 1327. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0020125

DOI: https://doi.org/10.1007/BFb0020125
Published: 09 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63631-1
Online ISBN: 978-3-540-69620-9
eBook Packages: Springer Book Archive