Abstract
Online learning does not distinguish between a training phase and an evaluation phase; it treats learning as an ongoing process, so learning algorithms must perform and make predictions while they learn. After reviewing the online learning model and some algorithms, I will consider variants of the model in which only partial information is revealed to the learner, in particular the bandit problem and reinforcement learning. The learner's uncertainty, caused by receiving only partial information, leads to an exploration-exploitation dilemma: is further information needed, or can the available information already be exploited? I will discuss how optimism in the face of uncertainty can address this dilemma in many cases.
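As an illustration of optimism in the face of uncertainty in the bandit setting, the sketch below uses the well-known UCB1 index rule. The abstract does not say which algorithms the paper covers, so this is only an example under stated assumptions: the function name `ucb1`, the Bernoulli reward model, and the arm means are all hypothetical choices made for the demonstration.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on simulated Bernoulli arms; return the pull count per arm.

    arm_means are hypothetical success probabilities, used only to
    simulate rewards. The learner never reads them directly.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of times each arm was pulled
    sums = [0.0] * n_arms   # total reward collected from each arm

    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1     # pull every arm once to initialise the estimates
        else:
            # Optimistic index: empirical mean plus a confidence bonus
            # that shrinks as an arm is pulled more often.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        # Partial information: only the chosen arm's reward is observed.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


counts = ucb1([0.2, 0.8], horizon=2000)
print(counts)
```

Rarely pulled arms keep a large bonus and so continue to be explored, while arms whose optimistic index falls below the best empirical mean are gradually abandoned. In this way a single rule handles both exploration and exploitation.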
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Auer, P. (2011). Exploration and Exploitation in Online Learning. In: Bouchachia, A. (eds) Adaptive and Intelligent Systems. ICAIS 2011. Lecture Notes in Computer Science, vol 6943. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23857-4_2
DOI: https://doi.org/10.1007/978-3-642-23857-4_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23856-7
Online ISBN: 978-3-642-23857-4
eBook Packages: Computer Science, Computer Science (R0)