Abstract
My aim in this chapter is to give a concise summary of what I consider the most important ideas in modern machine learning, and to relate different approaches to one another, such as support vector machines and Bayesian networks, or reinforcement learning and temporal supervised learning. I begin with general comments on organizational mechanisms, then focus on unsupervised, supervised, and reinforcement learning. I point out the links between these concepts and brain processes such as synaptic plasticity and models of the basal ganglia. Examples of each of the three main learning paradigms are included to allow experimenting with these concepts.
Available at http://projects.cs.dal.ca/hallab/MLreview2013.
Notes
- 1.
Available at http://code.google.com/p/bnt/, and used to implement Fig. 13; file at www.cs.dal.ca/~tt/repository/MLintro2012/PearlBurglary.m.
- 2.
Markov models are often a simplification or abstraction of a real world. In this section, however, we discuss a “toy world” in which state transitions were designed to fulfill the Markov condition.
- 3.
\(V^\pi (s)\) is usually called the state value function and \(Q^\pi (s, a)\) the state-action value function. Note, however, that the value depends in both cases on the states and the actions taken.
- 4.
This formulation of the Bellman equation for an MDP [36–38] is slightly different from the formulation of Sutton and Barto in [39], as these authors define the value function to be the cumulative reward starting from the next state, not the current state. In their case, the Bellman equation reads \(V^\pi (s) = \sum _{s'} T(s'|s, a) (r(s') + \gamma \, V^\pi (s'))\). This is only a matter of convention about when we consider the prediction: just before getting the current reward or after taking the next step.
- 5.
The same function name is used on both sides of this equation, but these are distinguished by the inclusion of parameters. The value functions all refer to the parametric model, which should be clear from the context.
- 6.
Julian Miller made this point nicely at the aforementioned workshop.
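The two conventions for the Bellman equation discussed in note 4 can be checked numerically. The following sketch is my own illustration, not code from the chapter (the chapter's examples are MATLAB files); the transition matrix `T`, reward vector `r`, and discount `gamma` are made-up toy values for a fixed policy already folded into `T`. It solves both forms of the equation by direct linear algebra and shows that they differ only in when the current reward is counted:

```python
import numpy as np

# Toy Markov chain for a fixed policy: T[s, s'] is the transition
# probability, r[s] the reward attached to state s (illustrative values).
T = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [1.0, 0.0, 0.0]])
r = np.array([0.0, 0.0, 1.0])
gamma = 0.9

# Convention of this chapter: the value includes the current reward,
#   V(s) = r(s) + gamma * sum_s' T(s, s') V(s')
# i.e. (I - gamma*T) V = r, solved directly.
V_here = np.linalg.solve(np.eye(3) - gamma * T, r)

# Sutton & Barto's convention: the prediction starts at the next state,
#   V(s) = sum_s' T(s, s') (r(s') + gamma * V(s'))
# i.e. (I - gamma*T) V = T r.
V_sb = np.linalg.solve(np.eye(3) - gamma * T, T @ r)

# The two solutions are related by V_here = r + gamma * V_sb:
# shifting the convention just moves one reward term across the sum.
print(np.allclose(V_here, r + gamma * V_sb))  # True
```

Either convention gives a consistent learning rule; only the bookkeeping of when the reward enters the prediction changes, which is worth keeping in mind when comparing temporal difference formulas across textbooks.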
References
S. Geman, E. Bienenstock, R. Doursat, Neural networks and the bias/variance dilemma. Neural Comput. 4(1), 1–58 (1992)
P. Smolensky, Information Processing in Dynamical Systems: Foundations of Harmony Theory, in Parallel Distributed Processing: Volume 1: Foundations, ed. by D.E. Rumelhart, J.L. McClelland (MIT Press, Cambridge, MA, 1986), pp. 194–281
G. Hinton, Training products of experts by minimizing contrastive divergence. Neural Comput. 14, 1771–1800 (2002)
G. Hinton, A Practical Guide to Training Restricted Boltzmann Machines. University of Toronto Technical Report UTML TR 2010–003, 2010
A. Graps, An Introduction to Wavelets. http://www.amara.com/IEEEwave/IEEEwavelet.html
N. Huang et al., The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. Lond. A 454, 903–995 (1998)
H. Barlow, Possible principles underlying the transformation of sensory messages. Sens. Commun. 217–234 (1961)
P. Földiák, Forming sparse representations by local anti-Hebbian learning. Biol. Cybern. 64, 165–170 (1990)
P. Földiák, D. Endres, Sparse coding. Scholarpedia 3, 2984 (2008)
B. Olshausen, D. Field, Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381, 607–609 (1996)
H. Lee, C. Ekanadham, A. Ng, Sparse deep belief net model for visual area V2, in NIPS 2007
C. von der Malsburg, Self-organization of orientation sensitive cells in the striate cortex. Kybernetik 14, 85–100 (1973)
S. Grossberg, Adaptive pattern classification and universal recoding, I: Parallel development and coding of neural feature detectors. Biol. Cybern. 23, 121–134 (1976)
T. Kohonen, Self-Organizing Maps (Springer, Berlin, 1994)
P. Hollensen, P. Hartono, T. Trappenberg, Topographic RBM as robot controller, in JNNS 2011 (2011)
S. Grossberg, Adaptive resonance theory: how a brain learns to consciously attend, learn, and recognize a changing world. Neural Netw. 37, 1–47 (2012)
T. Trappenberg, P. Hartono, D. Rasmusson, in Top-Down Control of Learning in Biological Self-Organizing Maps, ed. by J. Principe, R. Miikkulainen. Lecture Notes in Computer Science 5629, WSOM 2009 (Springer, 2009), pp. 316–324
K. Tanaka, H. Saito, Y. Fukada, M. Moriya, Coding visual images of objects in the inferotemporal cortex of the macaque monkey. J. Neurophysiol. 66, 170–189 (1991)
S. Chatterjee, A. Hadi, Sensitivity Analysis in Linear Regression (John Wiley & Sons, New York, 1988)
Judea Pearl, Causality: Models, Reasoning and Inference (Cambridge University Press, Cambridge, 2009)
D. Cireşan, U. Meier, J. Masci, J. Schmidhuber, Multi-column deep neural network for traffic sign classification. Neural Netw. 32, 333–338 (2012)
D. Rumelhart, G. Hinton, R. Williams, Learning representations by back-propagating errors. Nature 323(6088), 533–536 (1986)
K. Hornik, Approximation capabilities of multilayer feedforward networks. Neural Netw. 4(2), 251–257 (1991)
A. Weigend, D. Rumelhart, Generalization through minimal networks with application to forecasting, in Computing Science and Statistics (23rd Symposium INTERFACE’91, Seattle, WA), ed. by E.M. Keramidas (1991), pp. 362–370
R. Caruana, S. Lawrence, C.L. Giles, Overfitting in neural nets: backpropagation, conjugate gradient, and early stopping, in Proceedings of Neural Information Processing Systems Conference, 2000. pp. 402–408
D.J.C. MacKay, A practical Bayesian framework for backpropagation networks. Neural Comput. 4(3), 448–472 (1992)
D. Silver, K. Bennett, Guest editor’s introduction: special issue on inductive transfer learning. Mach. Learn. 73(3), 215–220 (2008)
S. Pan, Q. Yang, A survey on transfer learning. IEEE Trans. Knowl. Data Eng. (IEEE TKDE) 22(10), 1345–1359 (2010)
B.E. Boser, I.M. Guyon, V. Vapnik, A training algorithm for optimal margin classifiers, in Proceedings of the Fifth Annual Workshop on Computational Learning Theory, (ACM, 1992), pp. 144–152
V. Vapnik, The Nature of Statistical Learning Theory (Springer, Berlin, 1995)
C. Cortes, V. Vapnik, Support-vector networks. Mach. Learn. 20, 273–297 (1995)
C. Burges, A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Disc. 2(2), 121–167 (1998)
A. Smola, B. Schölkopf, A tutorial on support vector regression. Stat. Comput. 14(3), 199–222 (2004)
C.-C. Chang, C.-J. Lin, LibSVM: a library for support vector machines (2001), http://www.csie.ntu.edu.tw/cjlin/libsvm
M. Boardman, T. Trappenberg, A heuristic for free parameter optimization with support vector machines, WCCI 2006, pp. 1337–1344, (2006). http://www.cs.dal.ca/boardman/wcci
E. Alpaydin, Introduction to Machine Learning, 2nd edn. (MIT Press, Cambridge, 2010)
S. Thrun, W. Burgard, D. Fox, Probabilistic Robotics (MIT Press, Cambridge, 2005)
S. Russell, P. Norvig, Artificial Intelligence: A Modern Approach, 3rd edn. (Prentice Hall, New York, 2010)
R.S. Sutton, A.G. Barto, Reinforcement Learning: An Introduction (MIT Press, Cambridge, 1998)
C.J.C.H. Watkins, Learning from Delayed Rewards. Ph.D. thesis, Cambridge University, Cambridge, England, 1989
H. van Hasselt, Reinforcement learning in continuous state and action spaces. Reinforcement Learn.: Adapt. Learn. Optim. 12, 207–251 (2012).
R. Sutton, Learning to predict by the methods of temporal differences. Mach. Learn. 3, 9–44 (erratum p. 377) (1988)
B. Sallans, G. Hinton, Reinforcement learning with factored states and actions. J. Mach. Learn. Res. 5, 1063–1088 (2004)
D.O. Hebb, The Organization of Behaviour (John Wiley & Sons, New York, 1949)
E.R. Caianiello, Outline of a theory of thought-processes and thinking machines. J. Theor. Biol. 1, 204–235 (1961)
T. Trappenberg, Fundamentals of Computational Neuroscience, 2nd edn. (Oxford University Press, Oxford, 2010)
R. Enoki, Y.L. Hu, D. Hamilton, A. Fine, Expression of long-term plasticity at individual synapses in hippocampus is graded, bidirectional, and mainly presynaptic: optical quantal analysis. Neuron 62(2), 242–253 (2009)
T. Bliss, T. Lømo, Long-lasting potentiation of synaptic transmission in the dentate area of the anaesthetized rabbit following stimulation of the perforant path. J. Physiol. 232(2), 331–56 (1973)
D. Heinke, E. Mavritsaki (eds.), Computational Modelling in Behavioural Neuroscience: Closing the gap between neurophysiology and behaviour (Psychology Press, London, 2008)
R. Rescorla, A. Wagner, A Theory of Pavlovian Conditioning: Variations in the Effectiveness of Reinforcement and Nonreinforcement, in Classical Conditioning II: Current Research and Theory, ed. by A.H. Black, W.F. Prokasy (Appleton-Century-Crofts, New York, 1972), pp. 64–99
W. Schultz, Predictive reward signal of dopamine neurons. J. Neurophysiol. 80(1), 1–27 (1998)
J. Houk, J. Adams, A. Barto, A Model of How the Basal Ganglia Generate and Use Neural Signals that Predict Reinforcement, in Models of Information Processing in the Basal Ganglia, ed. by J.C. Houk, J.L. Davis, D.G. Beiser (MIT Press, Cambridge, 1995)
P. Connor, T. Trappenberg, in Characterizing a Brain-Based Value-Function Approximator, ed. by E. Stroulia, S. Matwin, Advances in Artificial Intelligence LNAI 2056, (Springer, Berlin, 2011), pp. 92–103
J. Reynolds, J. Wickens, Dopamine-dependent plasticity of corticostriatal synapses. Neural Netw. 15(4–6), 507–521 (2002)
P. Connor, V. LoLordo, T. Trappenberg, An elemental model of retrospective revaluation without within-compound associations. Anim. Learn. 42(1), 22–38 (2012)
T. Maia, M. Frank, From reinforcement learning models to psychiatric and neurological disorders. Nat. Neurosci. 14, 154–162 (2011)
Y. Bengio, Learning deep architectures for AI. Found. Trends Mach. Learn. 2, 1–127 (2009)
J. Hawkins, On Intelligence (Times Books, New York, 2004)
G. Gigerenzer, P. Todd and the ABC Research Group, Simple Heuristics that Make Us Smart (Oxford University Press, Oxford, 1999)
Acknowledgments
I would like to express my thanks to René Doursat for careful edits, to Christian Albers, Igor Farkas, and Stephen Grossberg for useful comments on an earlier draft, and to all the colleagues who have provided me with encouraging comments.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Trappenberg, T.P. (2014). A Brief Introduction to Probabilistic Machine Learning and Its Relation to Neuroscience. In: Kowaliw, T., Bredeche, N., Doursat, R. (eds) Growing Adaptive Machines. Studies in Computational Intelligence, vol 557. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55337-0_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-55336-3
Online ISBN: 978-3-642-55337-0
eBook Packages: Engineering (R0)