Basal Ganglia Models for Autonomous Behavior Learning

  • Chapter
Creating Brain-Like Intelligence

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5436)

Abstract

We propose two basal ganglia (BG) models for autonomous behavior learning: the BG system model and the BG spiking neural network model. Both models were developed on the basis of reinforcement learning (RL) theory and neuroscientific principles of behavioral learning. The BG system model addresses the problems of input selection and reward setting in RL. It assumes that parallel BG modules receive a variety of inputs, and we also propose a method for automatically setting the internal reward for this model. The BG spiking neural network model addresses problems concerning biological neural network architecture, ambiguous inputs, and the mechanism of timing. It accounts for the neurophysiological characteristics of neurons and the differential functions of the direct and indirect pathways. We demonstrate that the BG system model achieves goals in fewer trials by learning the internal state representation, whereas the BG spiking neural network model has the capacity for probabilistic action selection. Our results suggest that these two models are a step toward developing an autonomous learning system.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Tsujino, H., Takeuchi, J., Shouno, O. (2009). Basal Ganglia Models for Autonomous Behavior Learning. In: Sendhoff, B., Körner, E., Sporns, O., Ritter, H., Doya, K. (eds) Creating Brain-Like Intelligence. Lecture Notes in Computer Science, vol 5436. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00616-6_16

  • DOI: https://doi.org/10.1007/978-3-642-00616-6_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00615-9

  • Online ISBN: 978-3-642-00616-6

  • eBook Packages: Computer Science, Computer Science (R0)