Combining Supervised, Unsupervised, and Reinforcement Learning in a Network of Spiking Neurons

  • Conference paper
Advances in Cognitive Neurodynamics (II)

Abstract

The human brain constantly learns via multiple different learning strategies. It can learn simply from stimuli presented to its sensory organs, which is considered unsupervised learning. In addition, it can learn associations between inputs and outputs when a teacher provides the output, which is considered supervised learning. Most importantly, it can learn very efficiently if correct behaviour is followed by reward and/or incorrect behaviour is followed by punishment, which is considered reinforcement learning. So far, most artificial neural architectures implement only one of the three learning mechanisms, even though the brain integrates all three. Here, we have implemented unsupervised, supervised, and reinforcement learning within a network of spiking neurons. To achieve this ambitious goal, the existing learning rule called spike-timing-dependent plasticity had to be extended such that it is modulated by the reward signal dopamine.
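
The abstract does not state the equations of the extended learning rule. Purely as an illustration of what "spike-timing-dependent plasticity modulated by dopamine" can mean, the following minimal sketch assumes an eligibility-trace formulation in the style of Izhikevich (2007): spike pairings change a slowly decaying trace, and the synaptic weight changes only while dopamine is present. The function name, all parameter names, and all parameter values are hypothetical and are not taken from the paper.

```python
import math


def simulate_da_stdp(pre_spikes, post_spikes, reward_times,
                     t_end=1000.0, dt=1.0,
                     a_plus=0.1, a_minus=0.12, tau_stdp=20.0,
                     tau_c=1000.0, tau_d=200.0,
                     da_pulse=0.5, lr=0.01, w0=0.5):
    """Final weight of ONE synapse under dopamine-gated STDP.

    Illustrative assumption, not the authors' implementation.
    All times are in milliseconds.
    """
    # Map event times onto integer simulation steps.
    pre_steps = {int(round(t / dt)) for t in pre_spikes}
    post_steps = {int(round(t / dt)) for t in post_spikes}
    reward_steps = {int(round(t / dt)) for t in reward_times}

    x_pre = 0.0   # trace of recent presynaptic spikes
    x_post = 0.0  # trace of recent postsynaptic spikes
    c = 0.0       # eligibility trace (the "tag" left by STDP)
    d = 0.0       # extracellular dopamine level
    w = w0        # synaptic weight, kept within [0, 1]

    decay_stdp = math.exp(-dt / tau_stdp)
    decay_c = math.exp(-dt / tau_c)
    decay_d = math.exp(-dt / tau_d)

    for step in range(int(round(t_end / dt))):
        # Exponential decay of all state variables.
        x_pre *= decay_stdp
        x_post *= decay_stdp
        c *= decay_c
        d *= decay_d

        if step in pre_steps:
            c -= a_minus * x_post  # post-before-pre pairing: negative tag
            x_pre += 1.0
        if step in post_steps:
            c += a_plus * x_pre    # pre-before-post pairing: positive tag
            x_post += 1.0
        if step in reward_steps:
            d += da_pulse          # reward delivers a dopamine pulse

        # The weight changes only when tag and dopamine coincide.
        w += lr * c * d * dt
        w = min(1.0, max(0.0, w))

    return w
```

In this sketch a pre-before-post pairing followed by a delayed reward strengthens the synapse, the same pairing without reward leaves the weight unchanged, and a post-before-pre pairing followed by reward weakens it. The slow decay of the eligibility trace is what lets a reward arriving hundreds of milliseconds later still act on the synapse that was active earlier.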

Acknowledgments

This work was supported by the DFG HE 3353/6-1 Forschungsverbund “Electrophysiological Correlates of Memory and their Generation”, SFB 779, and by BMBF Bernstein-group “Components of cognition: small networks to flexible rules”.

Author information

Corresponding author

Correspondence to Sebastian Handrich.

Copyright information

© 2011 Springer Science+Business Media B.V.

About this paper

Cite this paper

Handrich, S., Herzog, A., Wolf, A., Herrmann, C.S. (2011). Combining Supervised, Unsupervised, and Reinforcement Learning in a Network of Spiking Neurons. In: Wang, R., Gu, F. (eds) Advances in Cognitive Neurodynamics (II). Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9695-1_26