A Computational Model of Integration between Reinforcement Learning and Task Monitoring in the Prefrontal Cortex

Khamassi, Mehdi; Quilodran, René; Enel, Pierre; Procyk, Emmanuel; Dominey, Peter F.

doi:10.1007/978-3-642-15193-4_40

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6226)

Included in the following conference series:

International Conference on Simulation of Adaptive Behavior

Abstract

Taking inspiration from neural principles of decision-making is of particular interest to help improve the adaptivity of artificial systems. Research at the crossroads of neuroscience and artificial intelligence in the last decade has helped us understand how the brain organizes reinforcement learning (RL) processes (the adaptation of decisions based on feedback from the environment). The current challenge is to understand how the brain flexibly regulates parameters of RL, such as the exploration rate, based on the task structure, which is called meta-learning (Doya, 2002 [1]). Here, we propose a computational mechanism of exploration regulation based on real neurophysiological and behavioral data recorded in monkey prefrontal cortex during a visuo-motor task involving a clear distinction between exploratory and exploitative actions. We first fit trial-by-trial choices made by the monkeys with an analytical reinforcement learning model. We find that the model which has the highest likelihood of predicting monkeys’ choices reveals different exploration rates at different task phases. In addition, the optimized model has a very high learning rate, and a reset of action values associated with a cue used in the task to signal condition changes. Beyond classical RL mechanisms, these results suggest that the monkey brain extracted task regularities to tune learning parameters in a task-appropriate way. We finally use these principles to develop a neural network model extending a previous cortico-striatal loop model. In our prefrontal cortex component, prediction error signals are extracted to produce feedback categorization signals. The latter are used to boost exploration after errors, and to attenuate it during exploitation, ensuring a lock on the currently rewarded choice. This model performs the task like monkeys, and provides a set of experimental predictions to be tested by future neurophysiological recordings.
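
The model-fitting step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the model equations, so everything below is an assumption made for illustration: a delta-rule value update with learning rate alpha, a softmax choice rule with a separate inverse temperature for each task phase (beta_search, beta_repeat), and a reset of all action values whenever the condition-change cue is shown. The phase labels, the function names, the trial tuple layout, and the default number of actions are hypothetical and are not taken from the chapter.

```python
import math


def softmax(q_values, beta):
    """Softmax choice probabilities.

    beta is an inverse temperature: higher beta means less exploration.
    The maximum is subtracted before exponentiating for numerical stability.
    """
    m = max(q_values)
    exps = [math.exp(beta * (q - m)) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def log_likelihood(trials, alpha, beta_search, beta_repeat,
                   n_actions=4, q_init=0.0):
    """Log-likelihood of observed choices under the assumed model.

    trials: list of (phase, choice, reward, cue) tuples, where
      phase  is "search" or "repeat" (hypothetical phase labels),
      choice is the index of the chosen action,
      reward is 0 or 1,
      cue    is True when the condition-change cue precedes the trial.
    n_actions and q_init are illustrative defaults, not values from the chapter.
    """
    q = [q_init] * n_actions
    ll = 0.0
    for phase, choice, reward, cue in trials:
        if cue:
            # Assumed reset: the cue wipes out previously learned values.
            q = [q_init] * n_actions
        # Phase-specific exploration rate.
        beta = beta_search if phase == "search" else beta_repeat
        probs = softmax(q, beta)
        ll += math.log(probs[choice])
        # Delta-rule update of the chosen action's value.
        q[choice] += alpha * (reward - q[choice])
    return ll


# Toy session: one exploratory trial followed by two repetitions.
session = [
    ("search", 2, 1, True),
    ("repeat", 2, 1, False),
    ("repeat", 2, 1, False),
]
ll_phase_specific = log_likelihood(session, alpha=0.9,
                                   beta_search=1.0, beta_repeat=5.0)
ll_single_beta = log_likelihood(session, alpha=0.9,
                                beta_search=1.0, beta_repeat=1.0)
```

In practice the parameters would be chosen to maximize this log-likelihood over the recorded sessions, for example with a grid search or a general-purpose optimizer. On the toy session above, the variant with a larger beta_repeat assigns a higher likelihood to the repeated rewarded choice, which is the kind of comparison that would favor phase-specific exploration rates.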

References

1. Doya, K.: Metalearning and neuromodulation. Neural Netw. 15(4-6), 495–506 (2002)
2. Barraclough, D., Conroy, M., Lee, D.: Prefrontal cortex and decision making in a mixed-strategy game. Nat. Neurosci. 7(4), 404–410 (2004)
3. Procyk, E., Tanaka, Y., Joseph, J.: Anterior cingulate activity during routine and non-routine sequential behaviors in macaques. Nat. Neurosci. 3(5), 502–508 (2000)
4. Aston-Jones, G., Cohen, J.: Adaptive gain and the role of the locus coeruleus-norepinephrine system in optimal performance. J. Comp. Neurol. 493(1), 99–110 (2005)
5. Brown, J., Braver, T.: Learned predictions of error likelihood in the anterior cingulate cortex. Science 307, 1118–1121 (2005)
6. Dosenbach, N.U., Visscher, K.M., Palmer, E.D., Miezin, F.M., Wenger, K.K., Kang, H.C., Burgund, E.D., Grimes, A.L., Schlaggar, B.L., Petersen, S.E.: A core system for the implementation of task sets. Neuron 50, 799–812 (2006)
7. Matsumoto, M., Matsumoto, K., Abe, H., Tanaka, K.: Medial prefrontal cell activity signaling prediction errors of action values. Nat. Neurosci. 10, 647–656 (2007)
8. Quilodran, R., Rothe, M., Procyk, E.: Behavioral shifts and action valuation in the anterior cingulate cortex. Neuron 57(2), 314–325 (2008)
9. Sutton, R., Barto, A.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
10. Dominey, P., Arbib, M., Joseph, J.: A model of corticostriatal plasticity for learning oculomotor associations and sequences. Journal of Cognitive Neuroscience 7(3), 311–336 (1995)
11. Khamassi, M., Martinet, L., Guillot, A.: Combining self-organizing maps with mixture of experts: Application to an Actor-Critic model of reinforcement learning in the basal ganglia. In: Proceedings of the 9th International Conference on the Simulation of Adaptive Behavior (SAB), Rome, Italy, pp. 394–405. Springer, Heidelberg (2006)
12. Schultz, W., Dayan, P., Montague, P.: A neural substrate of prediction and reward. Science 275(5306), 1593–1599 (1997)
13. Gurney, K., Prescott, T., Redgrave, P.: A computational model of action selection in the basal ganglia. I. A new functional anatomy. Biol. Cybern. 84(6), 401–410 (2001)
14. Girard, B., Cuzin, V., Guillot, A., Gurney, K., Prescott, T.: A basal ganglia inspired model of action selection evaluated in a robotic survival task. Journal of Integrative Neuroscience 2(2), 179–200 (2003)
15. Procyk, E., Goldman-Rakic, P.: Modulation of dorsolateral prefrontal delay activity during self-organized behavior. J. Neurosci. 26(44), 11313–11323 (2006)
16. Dehaene, S., Changeux, J.: A neuronal model of a global workspace in effortful cognitive tasks. Proc. Natl. Acad. Sci. USA 95, 14529–14534 (1998)
17. Cohen, J., Aston-Jones, G., Gilzenrat, M.S.: A systems-level perspective on attention and cognitive control. In: Posner, M. (ed.) Cognitive Neuroscience of Attention, pp. 71–90. Guilford Publications, New York (2004)

Author information

Authors and Affiliations

INSERM U846 SBRI, Bron, France
Mehdi Khamassi, René Quilodran, Pierre Enel, Emmanuel Procyk & Peter F. Dominey

Editor information

Editors and Affiliations

ISIR, Université Pierre et Marie Curie-Paris 6, 4 Place Jussieu, 75252, Paris cedex 05, France
Stéphane Doncieux, Benoît Girard, Agnès Guillot, Jean-Arcady Meyer & Jean-Baptiste Mouret
The Mærsk Mc-Kinney Møller Institute, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark
John Hallam

About this paper

Cite this paper

Khamassi, M., Quilodran, R., Enel, P., Procyk, E., Dominey, P.F. (2010). A Computational Model of Integration between Reinforcement Learning and Task Monitoring in the Prefrontal Cortex. In: Doncieux, S., Girard, B., Guillot, A., Hallam, J., Meyer, JA., Mouret, JB. (eds) From Animals to Animats 11. SAB 2010. Lecture Notes in Computer Science, vol 6226. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15193-4_40

DOI: https://doi.org/10.1007/978-3-642-15193-4_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15192-7
Online ISBN: 978-3-642-15193-4
eBook Packages: Computer Science, Computer Science (R0)
