
Online Learning with Prior Knowledge

  • Conference paper
Learning Theory (COLT 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4539)


Abstract

Standard “experts” algorithms are methods for combining the advice of a given set of experts to make good choices in a sequential decision-making problem. In the standard setting, the decision maker chooses repeatedly in the same “state,” based on information about how the different experts would have performed had they been followed. In this paper we extend this framework by introducing state information: the experts algorithm may rely on partial information about the cost function, revealed to the decision maker before the latter chooses an action. This extension is very natural in prediction problems. For example, an experts algorithm that predicts whether the next day will be rainy can be extended to make the same prediction given the current temperature.

We introduce new algorithms that attain optimal performance in this new framework and that apply to more general settings than the variants of regression considered in the statistics literature.
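The setting described in the abstract can be illustrated with a minimal sketch (not the paper's actual algorithm): run one copy of the standard multiplicative-weights (Hedge) experts algorithm per discretized state, so that the side information observed before each round selects which copy predicts and is updated. The class names, the uniform binning of states into `n_bins` cells, and the choice of learning rate `eta` are all illustrative assumptions, not taken from the paper.

```python
import math
import random
from collections import defaultdict

class Hedge:
    """Multiplicative-weights (Hedge) over a fixed set of experts."""
    def __init__(self, n_experts, eta=0.5):
        self.weights = [1.0] * n_experts
        self.eta = eta

    def predict(self):
        # Sample an expert in proportion to its current weight.
        total = sum(self.weights)
        r, acc = random.random(), 0.0
        for i, w in enumerate(self.weights):
            acc += w / total
            if r < acc:
                return i
        return len(self.weights) - 1

    def update(self, losses):
        # Exponentially down-weight each expert by its observed loss in [0, 1].
        self.weights = [w * math.exp(-self.eta * l)
                        for w, l in zip(self.weights, losses)]

class ContextualHedge:
    """One Hedge instance per discretized state (the side information)."""
    def __init__(self, n_experts, n_bins, eta=0.5):
        self.n_bins = n_bins
        self.per_state = defaultdict(lambda: Hedge(n_experts, eta))

    def _bin(self, state):
        # Discretize a state in [0, 1) into one of n_bins cells.
        return min(int(state * self.n_bins), self.n_bins - 1)

    def predict(self, state):
        return self.per_state[self._bin(state)].predict()

    def update(self, state, losses):
        self.per_state[self._bin(state)].update(losses)
```

In the rain-prediction illustration from the abstract, `state` would be a normalized temperature reading and the experts would be competing forecasters; the learner can then favor different forecasters in cold states than in warm ones. This crude binning is only a sketch of the idea of conditioning on state; it does not capture the adaptive, optimal constructions the paper develops.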





Editor information

Nader H. Bshouty, Claudio Gentile


Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Hazan, E., Megiddo, N. (2007). Online Learning with Prior Knowledge. In: Bshouty, N.H., Gentile, C. (eds) Learning Theory. COLT 2007. Lecture Notes in Computer Science, vol 4539. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72927-3_36


  • DOI: https://doi.org/10.1007/978-3-540-72927-3_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72925-9

  • Online ISBN: 978-3-540-72927-3

  • eBook Packages: Computer Science (R0)
