Abstract
Recent experimental findings on hippocampal representational dynamics, such as route replay and sweeps, match intuitive notions from reinforcement learning, including the transient representation of potential trajectories and reward locations. We explore these intuitions within a formal reinforcement learning framework and examine how these representational dynamics might be integrated with reinforcement learning algorithms. We suggest that hippocampal representational dynamics are best integrated within a model-based reinforcement learning framework, and we show how this framework can be used to generate specific quantitative predictions about the control processes that direct and use hippocampal representations.
Copyright information
© 2015 Springer Science+Business Media New York
About this chapter
Cite this chapter
Johnson, A., Venditto, S. (2015). Reinforcement Learning and Hippocampal Dynamics. In: Tatsuno, M. (eds) Analysis and Modeling of Coordinated Multi-neuronal Activity. Springer Series in Computational Neuroscience, vol 12. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-1969-7_14
DOI: https://doi.org/10.1007/978-1-4939-1969-7_14
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-1968-0
Online ISBN: 978-1-4939-1969-7
eBook Packages: Biomedical and Life Sciences (R0)