Online Kernel Selection with Multiple Bandit Feedbacks in Random Feature Space

  • Junfan Li
  • Shizhong Liao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11062)


Online kernel selection is critical to online kernel learning and must address the exploration-exploitation dilemma: we explore new kernels to find the best one, and we exploit the kernel that has performed best so far. In this paper, we propose a novel multi-armed bandit solution to the exploration-exploitation dilemma in online kernel selection. We first map each candidate kernel to an arm of a multi-armed bandit problem. Unlike typical multi-armed bandit models, where only one arm (here, one kernel) is selected at each round, we sample multiple kernels with replacement according to a probability distribution. We then make predictions with the hypotheses learned in the random feature spaces specified by the selected kernels, and incur multiple losses, which we refer to as multiple bandit feedbacks. Finally, we use all the feedbacks to update the probability distribution. We prove that the proposed approach enjoys a sub-linear expected regret bound. Experimental results on benchmark datasets show that the proposed approach performs comparably to existing online kernel selection methods.
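
The loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes Gaussian candidate kernels approximated by random Fourier features, an Exp3-style exponential-weights update with uniform exploration, hinge loss for binary labels in {-1, +1}, and an importance-weighted loss estimate. All function names and parameters (`select_kernels_online`, `eta`, `gamma`, `lr`, `n_samples`) are hypothetical choices made for illustration.

```python
import math
import random


def make_random_features(dim, n_features, sigma, rng):
    """Return a map x -> z(x) whose inner products approximate a Gaussian
    kernel of width sigma (random Fourier features; an assumed choice)."""
    freqs = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
             for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def feature_map(x):
        return [scale * math.cos(sum(w * v for w, v in zip(f, x)) + b)
                for f, b in zip(freqs, phases)]

    return feature_map


def kernel_probs(log_weights, gamma):
    """Softmax of the log-weights, mixed with uniform exploration."""
    top = max(log_weights)
    exp_w = [math.exp(v - top) for v in log_weights]
    total = sum(exp_w)
    k = len(log_weights)
    return [(1.0 - gamma) * e / total + gamma / k for e in exp_w]


def select_kernels_online(stream, sigmas, n_features=50, n_samples=3,
                          eta=0.1, gamma=0.1, lr=0.5, seed=0):
    """Hypothetical sketch: one arm per candidate kernel width in `sigmas`."""
    rng = random.Random(seed)
    dim = len(stream[0][0])
    maps = [make_random_features(dim, n_features, s, rng) for s in sigmas]
    hyps = [[0.0] * n_features for _ in sigmas]  # one linear hypothesis per kernel
    log_w = [0.0] * len(sigmas)                  # log of the bandit weights
    mistakes = 0
    for x, y in stream:                          # y is -1 or +1
        probs = kernel_probs(log_w, gamma)
        # Sample several kernels with replacement from the distribution.
        chosen = rng.choices(range(len(sigmas)), weights=probs, k=n_samples)
        feats = {k: maps[k](x) for k in set(chosen)}
        margins = {k: sum(w * z for w, z in zip(hyps[k], feats[k]))
                   for k in feats}
        # Predict with the average margin of the sampled hypotheses.
        avg = sum(margins[k] for k in chosen) / n_samples
        if (1 if avg >= 0 else -1) != y:
            mistakes += 1
        for k in feats:
            # One bandit feedback per distinct sampled kernel.
            loss = max(0.0, 1.0 - y * margins[k])
            # Probability that kernel k appears at least once in the sample.
            p_seen = 1.0 - (1.0 - probs[k]) ** n_samples
            log_w[k] -= eta * min(loss, 1.0) / p_seen
            if loss > 0.0:
                # Online gradient step on the hinge loss.
                hyps[k] = [w + lr * y * z for w, z in zip(hyps[k], feats[k])]
    return kernel_probs(log_w, gamma), mistakes
```

Calling `select_kernels_online(stream, [0.1, 1.0, 10.0])` on a list of `(x, y)` pairs returns the final distribution over the three candidate widths and the number of online mistakes. Dividing each clipped loss by `p_seen` keeps the loss estimate unbiased for kernels that are rarely sampled, which is one common way to use several feedbacks per round; the paper's own update rule may differ.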


Online kernel selection, Exploration-exploitation dilemma, Multiple bandit feedbacks, Random feature space



The work was supported in part by the National Natural Science Foundation of China under grant No. 61673293.


Authors and Affiliations

  1. School of Computer Science and Technology, Tianjin University, Tianjin, China

