
Learning Convex Combinations of Continuously Parameterized Basic Kernels

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3559)

Abstract

We study the problem of learning a kernel which minimizes a regularization error functional such as that used in regularization networks or support vector machines. We consider this problem when the kernel is in the convex hull of basic kernels, for example, Gaussian kernels which are continuously parameterized by a compact set. We show that there always exists an optimal kernel which is the convex combination of at most m+1 basic kernels, where m is the sample size, and provide a necessary and sufficient condition for a kernel to be optimal. The proof of our results is constructive and leads to a greedy algorithm for learning the kernel. We discuss the properties of this algorithm and present some preliminary numerical simulations.

This work was supported by EPSRC Grant GR/T18707/01, NSF Grant ITR-0312113 and the PASCAL European Network of Excellence.
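To make the greedy procedure described in the abstract concrete, here is a minimal sketch, not the authors' algorithm: it assumes kernel ridge regression as the regularization functional, Gaussian basic kernels K_sigma(x, y) = exp(-sigma * ||x - y||^2) with sigma drawn from a finite grid standing in for the compact parameter set, and a crude grid line search over the mixing parameter. The names gaussian_kernel, greedy_kernel_learning, mu, and n_steps are all hypothetical.

    import numpy as np

    def gaussian_kernel(X, sigma):
        # Gram matrix of the basic kernel K_sigma(x, y) = exp(-sigma * ||x - y||^2).
        sq = np.sum(X ** 2, axis=1)
        d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
        return np.exp(-sigma * d2)

    def objective(K, y, mu):
        # Regularized empirical error for kernel ridge regression:
        # min_c ||y - K c||^2 + mu * c' K c  =  mu * y' (K + mu I)^{-1} y.
        m = len(y)
        return mu * y @ np.linalg.solve(K + mu * np.eye(m), y)

    def greedy_kernel_learning(X, y, sigma_grid, mu=0.1, n_steps=20):
        # Greedily build a convex combination of Gaussian basic kernels.
        m = len(y)
        weights = {sigma_grid[0]: 1.0}           # start from a single basic kernel
        K = gaussian_kernel(X, sigma_grid[0])
        for _ in range(n_steps):
            # Solve the regularization problem for the current kernel.
            c = np.linalg.solve(K + mu * np.eye(m), y)
            # The gradient of the objective with respect to K is -mu * c c',
            # so the steepest-descent basic kernel maximizes c' K_sigma c.
            s_best = max(sigma_grid, key=lambda s: c @ gaussian_kernel(X, s) @ c)
            K_best = gaussian_kernel(X, s_best)
            # Line search over the mixing parameter lambda in [0, 1].
            lams = np.linspace(0.0, 1.0, 51)
            vals = [objective((1 - lam) * K + lam * K_best, y, mu) for lam in lams]
            lam = lams[int(np.argmin(vals))]
            if lam == 0.0:
                break   # no basic kernel on the grid improves the objective
            K = (1 - lam) * K + lam * K_best
            weights = {s: (1 - lam) * w for s, w in weights.items()}
            weights[s_best] = weights.get(s_best, 0.0) + lam
        return K, weights

Usage would look like K, w = greedy_kernel_learning(X, y, np.linspace(0.01, 10.0, 100)). The stopping test loosely mirrors the optimality condition mentioned in the abstract: on the grid, the current combination is optimal when no basic kernel K_sigma yields a descent direction, that is, when mixing in any K_sigma fails to lower the regularized error.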




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Argyriou, A., Micchelli, C.A., Pontil, M. (2005). Learning Convex Combinations of Continuously Parameterized Basic Kernels. In: Auer, P., Meir, R. (eds) Learning Theory. COLT 2005. Lecture Notes in Computer Science (LNAI), vol 3559. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11503415_23


  • DOI: https://doi.org/10.1007/11503415_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26556-6

  • Online ISBN: 978-3-540-31892-7

  • eBook Packages: Computer Science (R0)
