
Analysis of Some Methods for Reduced Rank Gaussian Process Regression

Chapter in: Switching and Learning in Feedback Systems

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3355)

Abstract

While there is strong motivation for using Gaussian Processes (GPs) due to their excellent performance in regression and classification problems, their computational complexity makes them impractical when the size of the training set exceeds a few thousand cases. This has motivated the recent proliferation of cost-effective approximations to GPs, both for classification and for regression. In this paper we analyze one popular approximation to GPs for regression: the reduced rank approximation. While GPs are in general equivalent to infinite linear models, we show that Reduced Rank Gaussian Processes (RRGPs) are equivalent to finite sparse linear models. We also introduce the concept of degenerate GPs and show that they correspond to inappropriate priors. We show how to modify the RRGP to prevent it from being degenerate at test time. Training RRGPs consists of learning both the covariance function hyperparameters and the support set. We propose a method for learning hyperparameters for a given support set. We also review the Sparse Greedy GP (SGGP) approximation (Smola and Bartlett, 2001), which is a way of learning the support set for given hyperparameters based on approximating the posterior. We propose an alternative method to the SGGP that has better generalization capabilities. Finally, we perform experiments comparing the different ways of training an RRGP. We provide some Matlab code for learning RRGPs.
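To make the abstract's central point concrete (the reduced rank approximation turns the GP into a finite linear model with one basis function per support-set point), the following minimal sketch computes a subset-of-regressors style predictive mean. This is not the authors' Matlab code: the squared-exponential covariance, the fixed hyperparameter values, and all function names here are assumptions made purely for illustration.

```python
import numpy as np

def sq_exp(A, B, lengthscale=1.0, signal_var=1.0):
    # Squared-exponential covariance between the rows of A and B
    # (an assumed kernel choice, for illustration only).
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return signal_var * np.exp(-0.5 * d2 / lengthscale**2)

def rrgp_predictive_mean(X, y, Xs, Xstar, noise_var=0.1):
    # Reduced rank GP regression viewed as a finite linear model:
    # f(x) is approximated by k(x, Xs) @ alpha, with m = len(Xs) weights.
    Kns = sq_exp(X, Xs)                      # n x m covariances to the support set
    Kss = sq_exp(Xs, Xs)                     # m x m support-set covariance
    A = noise_var * Kss + Kns.T @ Kns
    alpha = np.linalg.solve(A, Kns.T @ y)    # posterior mean of the weights
    return sq_exp(Xstar, Xs) @ alpha         # predictive mean at test inputs

# Tiny usage example on synthetic 1-D data.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
Xs = X[:10]                                  # a naive support-set choice
Xstar = np.linspace(-3, 3, 5)[:, None]
print(rrgp_predictive_mean(X, y, Xs, Xstar))
```

The cost scales as O(nm^2) in the number of support points m rather than O(n^3) in the training set size; how to learn the hyperparameters and the support set itself, and how to avoid degeneracy at test time, are the subject of the chapter and are not shown here.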

References

  • Cressie, N.A.C.: Statistics for Spatial Data. John Wiley and Sons, New Jersey (1993)

  • Csató, L.: Gaussian Processes – Iterative Sparse Approximation. PhD thesis, Aston University, Birmingham, United Kingdom (2002)

  • Csató, L., Opper, M.: Sparse online Gaussian processes. Neural Computation 14(3), 641–669 (2002)

  • Gibbs, M., MacKay, D.J.C.: Efficient implementation of Gaussian processes. Technical report, Cavendish Laboratory, Cambridge University, Cambridge, United Kingdom (1997)

  • Lawrence, N., Seeger, M., Herbrich, R.: Fast sparse Gaussian process methods: The informative vector machine. In: Becker, S., Thrun, S., Obermayer, K. (eds.) Neural Information Processing Systems, vol. 15, pp. 609–616. MIT Press, Cambridge (2003)

  • MacKay, D.J.C.: Bayesian non-linear modelling for the energy prediction competition. ASHRAE Transactions 100(2), 1053–1062 (1994)

  • Neal, R.M.: Bayesian Learning for Neural Networks. Lecture Notes in Statistics, vol. 118. Springer, Heidelberg (1996)

  • Press, W., Flannery, B., Teukolsky, S.A., Vetterling, W.T.: Numerical Recipes in C, 2nd edn. Cambridge University Press, Cambridge (1992)

  • Rasmussen, C.E.: Evaluation of Gaussian Processes and Other Methods for Non-linear Regression. PhD thesis, Department of Computer Science, University of Toronto, Toronto, Ontario (1996)

  • Rasmussen, C.E.: Reduced rank Gaussian process learning. Unpublished manuscript (2002)

  • Schölkopf, B., Smola, A.J.: Learning with Kernels. MIT Press, Cambridge (2002)

  • Schwaighofer, A., Tresp, V.: Transductive and inductive methods for approximate Gaussian process regression. In: Becker, S., Thrun, S., Obermayer, K. (eds.) Advances in Neural Information Processing Systems, vol. 15, pp. 953–960. MIT Press, Cambridge (2003)

  • Seeger, M.: Bayesian Gaussian Process Models: PAC-Bayesian Generalisation Error Bounds and Sparse Approximations. PhD thesis, University of Edinburgh, Edinburgh, Scotland (2003)

  • Seeger, M., Williams, C., Lawrence, N.: Fast forward selection to speed up sparse Gaussian process regression. In: Bishop, C.M., Frey, B.J. (eds.) Ninth International Workshop on Artificial Intelligence and Statistics. Society for Artificial Intelligence and Statistics (2003)

  • Smola, A.J., Bartlett, P.L.: Sparse greedy Gaussian process regression. In: Leen, T.K., Dietterich, T.G., Tresp, V. (eds.) Advances in Neural Information Processing Systems, vol. 13, pp. 619–625. MIT Press, Cambridge (2001)

  • Smola, A.J., Schölkopf, B.: Sparse greedy matrix approximation for machine learning. In: Langley, P. (ed.) International Conference on Machine Learning, vol. 17, pp. 911–918. Morgan Kaufmann, San Francisco (2000)

  • Tipping, M.E.: Sparse Bayesian learning and the relevance vector machine. Journal of Machine Learning Research 1, 211–244 (2001)

  • Tresp, V.: A Bayesian committee machine. Neural Computation 12(11), 2719–2741 (2000)

  • Wahba, G., Lin, X., Gao, F., Xiang, D., Klein, R., Klein, B.: The bias-variance tradeoff and the randomized GACV. In: Kearns, M.S., Solla, S.A., Cohn, D.A. (eds.) Advances in Neural Information Processing Systems, vol. 11, pp. 620–626. MIT Press, Cambridge (1999)

  • Williams, C.: Computation with infinite neural networks. In: Mozer, M.C., Jordan, M.I., Petsche, T. (eds.) Advances in Neural Information Processing Systems, vol. 9, pp. 295–301. MIT Press, Cambridge (1997a)

  • Williams, C.: Prediction with Gaussian processes: From linear regression to linear prediction and beyond. Technical Report NCRG/97/012, Dept of Computer Science and Applied Mathematics, Aston University, Birmingham, United Kingdom (1997b)

  • Williams, C., Rasmussen, C.E., Schwaighofer, A., Tresp, V.: Observations on the Nyström method for Gaussian process prediction. Technical report, University of Edinburgh, Edinburgh, Scotland (2002)

  • Williams, C., Seeger, M.: Using the Nyström method to speed up kernel machines. In: Leen, T.K., Dietterich, T.G., Tresp, V. (eds.) Advances in Neural Information Processing Systems, vol. 13, pp. 682–688. MIT Press, Cambridge (2001)


Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Quiñonero-Candela, J., Rasmussen, C.E. (2005). Analysis of Some Methods for Reduced Rank Gaussian Process Regression. In: Murray-Smith, R., Shorten, R. (eds) Switching and Learning in Feedback Systems. Lecture Notes in Computer Science, vol 3355. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30560-6_4


  • DOI: https://doi.org/10.1007/978-3-540-30560-6_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24457-8

  • Online ISBN: 978-3-540-30560-6

  • eBook Packages: Computer Science (R0)
