Abstract
We propose simple randomized strategies for sequential prediction under imperfect monitoring, that is, when the forecaster does not have access to the past outcomes but rather to a feedback signal. The proposed strategies are consistent in the sense that they achieve, asymptotically, the best possible average reward. It was Rustichini [11] who first proved the existence of such consistent predictors. The forecasters presented here offer the first constructive proof of consistency. Moreover, the proposed algorithms are computationally efficient. We also establish upper bounds for the rates of convergence. In the case of deterministic feedback, these rates are optimal up to logarithmic terms.
S.M. was partially supported by the Canada Research Chairs Program and by the Natural Sciences and Engineering Research Council of Canada. G.L. acknowledges the support of the Spanish Ministry of Science and Technology grant MTM2006-05650. G.S. was partially supported by the French “Agence Nationale pour la Recherche” under grant JCJC06-137444 “From applications to theory in learning and adaptive statistics.” G.L. and G.S. acknowledge the PASCAL Network of Excellence under EC grant no. 506778.
References
[1] Azuma, K.: Weighted sums of certain dependent random variables. Tohoku Mathematical Journal 19, 357–367 (1967)
[2] Blackwell, D.: Controlled random walks. In: Proceedings of the International Congress of Mathematicians, 1954, volume III, pp. 336–338. North-Holland (1956)
[3] Cesa-Bianchi, N., Lugosi, G.: Prediction, Learning, and Games. Cambridge University Press, New York (2006)
[4] Cesa-Bianchi, N., Lugosi, G., Stoltz, G.: Regret minimization under partial monitoring. Mathematics of Operations Research 31, 562–580 (2006)
[5] Chen, X., White, H.: Laws of large numbers for Hilbert space-valued mixingales with applications. Econometric Theory 12(2), 284–304 (1996)
[6] Freedman, D.A.: On tail probabilities for martingales. Annals of Probability 3, 100–118 (1975)
[7] Hannan, J.: Approximation to Bayes risk in repeated play. Contributions to the Theory of Games 3, 97–139 (1957)
[8] Hoeffding, W.: Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association 58, 13–30 (1963)
[9] Mannor, S., Shimkin, N.: On-line learning with imperfect monitoring. In: Proceedings of the 16th Annual Conference on Learning Theory, pp. 552–567. Springer, Heidelberg (2003)
[10] Piccolboni, A., Schindelhauer, C.: Discrete prediction games with arbitrary feedback and loss. In: Proceedings of the 14th Annual Conference on Computational Learning Theory, pp. 208–223 (2001)
[11] Rustichini, A.: Minimizing regret: The general case. Games and Economic Behavior 29, 224–243 (1999)
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Lugosi, G., Mannor, S., Stoltz, G. (2007). Strategies for Prediction Under Imperfect Monitoring. In: Bshouty, N.H., Gentile, C. (eds) Learning Theory. COLT 2007. Lecture Notes in Computer Science, vol 4539. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72927-3_19
DOI: https://doi.org/10.1007/978-3-540-72927-3_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72925-9
Online ISBN: 978-3-540-72927-3
eBook Packages: Computer Science, Computer Science (R0)