Basal Ganglia Models for Autonomous Behavior Learning

Tsujino, Hiroshi; Takeuchi, Johane; Shouno, Osamu

doi:10.1007/978-3-642-00616-6_16

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5436)

Abstract

We propose two basal ganglia (BG) models for autonomous behavior learning: the BG system model and the BG spiking neural network model. Both models were developed on the basis of reinforcement learning (RL) theories and neuroscience principles of behavioral learning. The BG system model addresses two problems in RL: input selection and reward setting. It assumes that parallel BG modules receive a variety of inputs, and we also propose a method for automatically setting the internal reward for this model. The BG spiking neural network model addresses problems of biological neural network architecture, ambiguous inputs, and the mechanism of timing. It accounts for the neurophysiological characteristics of neurons and for the differential functions of the direct and indirect pathways. We demonstrate that the BG system model achieves goals in fewer trials by learning the internal state representation, whereas the BG spiking neural network model has the capacity for probabilistic selection of action. Our results suggest that these two models are a step toward developing an autonomous learning system.
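
For readers unfamiliar with the RL background the abstract refers to, the following is a minimal generic sketch of temporal-difference (Q-learning) value updates combined with probabilistic (softmax) action selection. It is not the BG system model or the BG spiking neural network model described in the chapter. The chain task, the function names (softmax_probs, select_action, train_chain), and all parameter values are assumptions introduced purely for illustration.

```python
import math
import random


def softmax_probs(q_values, temperature=1.0):
    """Boltzmann (softmax) selection probabilities for a list of action values."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(q_values, temperature, rng):
    """Sample an action index with probability given by the softmax of its value."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def train_chain(n_states=5, episodes=300, alpha=0.1, gamma=0.9,
                temperature=0.5, seed=0):
    """Q-learning on a hypothetical 1-D chain: start at state 0, reward at the last state.

    Actions: 0 = move left (stays in place at state 0), 1 = move right.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap the episode length
            a = select_action(q[s], temperature, rng)
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # TD target: reward plus discounted best value of the next state
            # (no bootstrap from the terminal goal state).
            target = r if s_next == goal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if s == goal:
                break
    return q


if __name__ == "__main__":
    q_table = train_chain()
    for state, values in enumerate(q_table[:-1]):
        print(state, [round(v, 3) for v in values],
              [round(p, 3) for p in softmax_probs(values, 0.5)])
```

In this sketch the temperature parameter controls how strongly selection favors the higher-valued action: lower values make the choice nearly deterministic, higher values make it closer to uniform.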

Author information

Authors and Affiliations

Honda Research Institute Japan Co., Ltd., 8-1 Honcho, Wako-shi, Saitama, 351-0188, Japan
Hiroshi Tsujino, Johane Takeuchi & Osamu Shouno

Editor information

Editors and Affiliations

Honda Research Institute Europe GmbH, 63073 Offenbach/Main, Germany
Bernhard Sendhoff
Honda Research Institute Europe GmbH, Carl-Legien-Strasse 30, 63073, Offenbach/Main, Germany
Edgar Körner
Dept. of Psychological and Brain Sciences, Indiana University, IN 47405, Bloomington, USA
Olaf Sporns
Faculty of Technology, Neuroinformatics Group, Bielefeld University, Universitätsstr. 25, 33615, Bielefeld, Germany
Helge Ritter
Okinawa Institute of Science and Technology, Neural Computation Unit, 12-22 Suzaki, Uruma, 904-2234, Okinawa, Japan
Kenji Doya

About this chapter

Cite this chapter

Tsujino, H., Takeuchi, J., Shouno, O. (2009). Basal Ganglia Models for Autonomous Behavior Learning. In: Sendhoff, B., Körner, E., Sporns, O., Ritter, H., Doya, K. (eds) Creating Brain-Like Intelligence. Lecture Notes in Computer Science, vol 5436. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00616-6_16

DOI: https://doi.org/10.1007/978-3-642-00616-6_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00615-9
Online ISBN: 978-3-642-00616-6
eBook Packages: Computer Science (R0)
