Online Multiple Kernel Learning: Algorithms and Mistake Bounds

  • Rong Jin
  • Steven C. H. Hoi
  • Tianbao Yang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6331)


Online learning and kernel learning are two active research topics in machine learning. Although each has been studied extensively, little work has addressed their intersection. In this paper, we introduce a new research problem, termed Online Multiple Kernel Learning (OMKL), which aims to learn a kernel-based prediction function from a pool of predefined kernels in an online fashion. OMKL is generally more challenging than typical online learning because both the kernel classifiers and their linear combination weights must be learned simultaneously. We consider two setups for OMKL, namely combining binary predictions or combining real-valued outputs from multiple kernel classifiers, and we propose both deterministic and stochastic approaches for each setup. The deterministic approach updates all kernel classifiers for every misclassified example, while the stochastic approach randomly chooses one or more classifiers to update according to a sampling strategy. Mistake bounds are derived for all the proposed OMKL algorithms.
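The two update schemes described in the abstract can be illustrated with a minimal sketch for the binary-prediction setup. The abstract does not give the algorithms themselves, so everything specific here is an assumption made for illustration: each kernel classifier is taken to be a kernel Perceptron, the combination weights are discounted multiplicatively by a hypothetical factor `beta` whenever a classifier errs, and the stochastic variant samples a single classifier with probability proportional to its weight. The names `KernelPerceptron`, `omkl_sketch`, `rbf_kernel`, and `linear_kernel` are likewise hypothetical and not taken from the paper.

```python
import math
import random


def linear_kernel(a, b):
    # Plain dot product between two feature tuples.
    return sum(ai * bi for ai, bi in zip(a, b))


def rbf_kernel(gamma):
    # Returns a Gaussian kernel function with the given bandwidth parameter.
    def k(a, b):
        return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return k


def sign(v):
    # Ties (v == 0) are mapped to -1 so that a prediction is always made.
    return 1 if v > 0 else -1


class KernelPerceptron:
    """Assumed base learner: stores the examples it was updated on."""

    def __init__(self, kernel):
        self.kernel = kernel
        self.support = []  # list of (label, point) pairs

    def score(self, x):
        return sum(y * self.kernel(s, x) for y, s in self.support)

    def update(self, x, y):
        self.support.append((y, x))


def omkl_sketch(stream, kernels, beta=0.5, stochastic=False, seed=0):
    """Run one pass over `stream`, an iterable of (x, y) with y in {-1, +1}."""
    rng = random.Random(seed)
    learners = [KernelPerceptron(k) for k in kernels]
    weights = [1.0 / len(kernels)] * len(kernels)
    mistakes = 0
    for x, y in stream:
        # Combine the binary votes of all kernel classifiers.
        votes = [sign(learner.score(x)) for learner in learners]
        combined = sum(w * v for w, v in zip(weights, votes))
        if sign(combined) != y:
            mistakes += 1
        if stochastic:
            # Assumed sampling strategy: pick one classifier by its weight.
            chosen = {rng.choices(range(len(learners)), weights=weights)[0]}
        else:
            # Deterministic variant: every classifier is eligible for update.
            chosen = set(range(len(learners)))
        for i, learner in enumerate(learners):
            if votes[i] != y:
                weights[i] *= beta  # discount the weight of an erring classifier
                if i in chosen:
                    learner.update(x, y)  # Perceptron-style additive update
        total = sum(weights)
        weights = [w / total for w in weights]
    return mistakes, weights, learners
```

In this sketch the deterministic variant adds the example to every classifier that misclassified it, whereas the stochastic variant performs at most one classifier update per round, which keeps the per-round cost independent of the number of kernels at the price of slower learning for the unsampled classifiers.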


Keywords: On-line learning and relative loss bounds; Kernels





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Rong Jin (1)
  • Steven C. H. Hoi (2)
  • Tianbao Yang (1)
  1. Department of Computer Science and Engineering, Michigan State University, USA
  2. School of Computer Engineering, Nanyang Technological University, Singapore
