Combining Self-organizing Maps with Mixtures of Experts: Application to an Actor-Critic Model of Reinforcement Learning in the Basal Ganglia

Khamassi, Mehdi; Martinet, Louis-Emmanuel; Guillot, Agnès

doi:10.1007/11840541_33

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4095)

Included in the following conference series:

International Conference on Simulation of Adaptive Behavior

Abstract

In previous work on a reward-seeking task performed in a continuous environment, we compared several Actor-Critic (AC) architectures implementing dopamine-like reinforcement learning mechanisms in the rat’s basal ganglia. The complexity of the task requires the coordination of several AC submodules, each module being an expert trained on a particular subset of the task. We showed that the classical method, in which the choice of the expert to train at a given time depends on each expert’s performance, suffers from strong limitations. We instead proposed clustering the continuous state space with an ad hoc method, but that method lacked autonomy and generalization abilities. In the present work, we combine the mixture of experts with self-organizing maps in order to cluster the experts’ responsibility space autonomously. On the one hand, we find that classical Kohonen maps give highly variable results: some task decompositions yield very good and stable reinforcement learning performance, whereas others are ill-suited to the task. Moreover, they require the number of experts to be set a priori. On the other hand, algorithms such as Growing Neural Gas or Growing When Required can choose the number of experts to train autonomously and incrementally. They lead to good performance, although it remains below that of our hand-tuned task decomposition and of the best Kohonen maps we obtained. Finally, we discuss proposals about what information to add to these algorithms, such as knowledge of current behavior, in order to make the task decomposition appropriate to the reinforcement learning process.
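
The gating idea in the abstract, where a self-organizing map partitions the continuous state space and each map unit designates the expert responsible for a region, can be illustrated with a minimal sketch. This is not the authors' implementation: the class `KohonenGate`, the helpers `train_gate` and `make_states`, the 1-D map topology, the decay schedule, and the toy 2-D states are all assumptions chosen for illustration, and the actor-critic experts themselves are omitted.

```python
import math
import random


class KohonenGate:
    """1-D Kohonen map whose units each stand for one expert (illustrative)."""

    def __init__(self, n_experts, dim, seed=0):
        rng = random.Random(seed)
        # One prototype vector per expert, initialised at random in [0, 1)^dim.
        self.prototypes = [[rng.random() for _ in range(dim)]
                           for _ in range(n_experts)]

    def winner(self, state):
        """Index of the expert whose prototype is closest to `state`."""
        dists = [sum((p - s) ** 2 for p, s in zip(proto, state))
                 for proto in self.prototypes]
        return dists.index(min(dists))

    def update(self, state, lr=0.1, sigma=1.0):
        """Move the winner and its map neighbours toward `state`."""
        win = self.winner(state)
        for i, proto in enumerate(self.prototypes):
            # Gaussian neighbourhood on the map index, not in state space.
            h = math.exp(-((i - win) ** 2) / (2.0 * sigma ** 2))
            for d in range(len(proto)):
                proto[d] += lr * h * (state[d] - proto[d])
        return win


def train_gate(gate, states, epochs=20, lr0=0.5, sigma0=1.0):
    """Fit the map with a linearly decaying learning rate and neighbourhood."""
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        for s in states:
            gate.update(s, lr=lr0 * frac, sigma=max(sigma0 * frac, 0.05))
    return gate


def make_states(n=300, seed=1):
    """Toy 2-D states drawn uniformly from the unit square (hypothetical data)."""
    rng = random.Random(seed)
    return [[rng.random(), rng.random()] for _ in range(n)]


if __name__ == "__main__":
    states = make_states()
    gate = train_gate(KohonenGate(n_experts=4, dim=2), states)
    counts = [0] * 4
    for s in states:
        # The winning unit is the expert that would be trained on this state.
        counts[gate.winner(s)] += 1
    print("states per expert:", counts)
```

In this sketch the number of experts is fixed when the map is created, which mirrors the limitation of classical Kohonen maps noted in the abstract. A growing variant would instead add units during training.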

Author information

Authors and Affiliations

AnimatLab – LIP6, F-75005 Paris, France; CNRS, UMR7606, Université Pierre et Marie Curie – Paris 6, F-75005 Paris, France
Mehdi Khamassi, Louis-Emmanuel Martinet & Agnès Guillot
Laboratoire de Physiologie de la Perception et de l’Action, UMR7152 CNRS, Collège de France, F-75005, Paris, France
Mehdi Khamassi

Editor information

Editors and Affiliations

Institute of Cognitive Sciences and Technologies, LARRAL, Via S. Martino della Battaglia 44, 00185, Roma, Italy
Stefano Nolfi
Institute of Cognitive Science and Technology, (ISTC-CNR), Via San Martino della Battaglia 44, 00185, Rome, Italy
Gianluca Baldassarre
Institute of Cognitive Science and Technology, ISTC-CNR, Via S. Martino della Battaglia 44, 00185, Rome, Italy
Raffaele Calabretta & Davide Marocco
The Mærsk Mc-Kinney Møller Institute, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark
John C. T. Hallam
UPMC Univ Paris 6, FRE2507, ISIR, F-75016, Paris, France
Jean-Arcady Meyer
Laboratory of Autonomous Robotics and Artificial Life, Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
Orazio Miglino
Institute of Cognitive Sciences and Technologies, National Research Council, Via San Martino della Battaglia 44, 00185, Rome, Italy
Domenico Parisi

About this paper

Cite this paper

Khamassi, M., Martinet, L.-E., Guillot, A. (2006). Combining Self-organizing Maps with Mixtures of Experts: Application to an Actor-Critic Model of Reinforcement Learning in the Basal Ganglia. In: Nolfi, S., et al. From Animals to Animats 9. SAB 2006. Lecture Notes in Computer Science, vol 4095. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11840541_33

DOI: https://doi.org/10.1007/11840541_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38608-7
Online ISBN: 978-3-540-38615-5
eBook Packages: Computer Science, Computer Science (R0)
