Abstract
This review provides a brief outline of some uses of Bayesian methods in artificial intelligence, specifically in the area of neural computation.
© 2001 Springer Science+Business Media Dordrecht
Williams, P.M. (2001). Probabilistic Learning Models. In: Corfield, D., Williamson, J. (eds) Foundations of Bayesianism. Applied Logic Series, vol 24. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-1586-7_5
Print ISBN: 978-90-481-5920-8
Online ISBN: 978-94-017-1586-7