The Safe Bayesian

Learning the Learning Rate via the Mixability Gap

Conference paper
Algorithmic Learning Theory (ALT 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7568)

Abstract

Standard Bayesian inference can behave suboptimally if the model is wrong. We present a modification of Bayesian inference which continues to achieve good rates with wrong models. Our method adapts the Bayesian learning rate to the data, picking the rate minimizing the cumulative loss of sequential prediction by posterior randomization. Our results can also be used to adapt the learning rate in a PAC-Bayesian context. The results are based on an extension of an inequality due to T. Zhang and others to dependent random variables.
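
For a finite set of candidate densities, the procedure described above can be phrased concretely: for each candidate learning rate η, run η-generalized Bayesian updating (posterior weights proportional to the prior times the likelihood raised to the power η), score η by the cumulative log-loss of sequential prediction by posterior randomization (predicting with a candidate drawn from the current generalized posterior), and keep the η with the smallest score. The Python sketch below is a minimal illustration under these assumptions; the function name, array layout, and geometric grid of candidate rates are invented for the example, not taken from the paper.

```python
import numpy as np

def safe_bayes_eta(neg_log_liks, prior, etas):
    """Pick the learning rate minimizing cumulative sequential log-loss
    under posterior randomization (illustrative sketch, not the paper's code).

    neg_log_liks : (n, k) array; entry (i, j) is -log p_j(x_i), the
                   log-loss of candidate density j on outcome x_i.
    prior        : (k,) array of prior weights over the k candidates.
    etas         : candidate learning rates, e.g. a geometric grid in (0, 1].
    """
    best_eta, best_loss = None, np.inf
    for eta in etas:
        log_w = np.log(prior)        # log-weights of the eta-posterior
        cum_loss = 0.0
        for losses in neg_log_liks:  # one sequential prediction round
            post = np.exp(log_w - log_w.max())
            post /= post.sum()
            # Posterior randomization: expected log-loss of a candidate
            # drawn from the current generalized posterior.
            cum_loss += post @ losses
            # eta-generalized Bayesian update: weight *= likelihood**eta.
            log_w = log_w - eta * losses
        if cum_loss < best_loss:
            best_eta, best_loss = eta, cum_loss
    return best_eta

# Illustrative usage with a geometric grid of learning rates:
# eta_hat = safe_bayes_eta(neg_log_liks, prior, [2.0**(-j) for j in range(8)])
```

Predicting with the expected loss of a single posterior draw, rather than with the Bayes mixture, is what the abstract calls posterior randomization; setting η = 1 recovers standard Bayesian updating.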

References

  • Audibert, J.Y.: PAC-Bayesian statistical learning theory. PhD thesis, Université Paris VI (2004)
  • Barron, A.R., Cover, T.M.: Minimum complexity density estimation. IEEE Transactions on Information Theory 37(4), 1034–1054 (1991)
  • Catoni, O.: PAC-Bayesian Supervised Classification. Lecture Notes IMS (2007)
  • Chaudhuri, K., Freund, Y., Hsu, D.: A parameter-free hedging algorithm. In: NIPS 2009, pp. 297–305 (2009)
  • Dawid, A.P.: Present position and potential developments: Some personal views, statistical theory, the prequential approach. J. R. Stat. Soc. Ser. A-G 147(2), 278–292 (1984)
  • Doob, J.L.: Application of the theory of martingales. In: Le Calcul de Probabilités et ses Applications. Colloques Internationaux du CNRS, pp. 23–27 (1949)
  • Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55, 119–139 (1997)
  • Grünwald, P.: The Minimum Description Length Principle. MIT Press, Cambridge (2007)
  • Grünwald, P.: Safe learning: bridging the gap between Bayes, MDL and statistical learning theory via empirical convexity. In: Proc. COLT 2011, pp. 551–573 (2011)
  • Grünwald, P., Langford, J.: Suboptimal behavior of Bayes and MDL in classification under misspecification. Machine Learning 66(2-3), 119–149 (2007)
  • Kleijn, B., van der Vaart, A.: Misspecification in infinite-dimensional Bayesian statistics. Ann. Stat. 34(2) (2006)
  • Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22, 76–86 (1951)
  • Li, J.Q.: Estimation of Mixture Models. PhD thesis, Yale, New Haven, CT (1999)
  • McAllester, D.: PAC-Bayesian stochastic model selection. Mach. Learn. 51(1), 5–21 (2003)
  • Bartlett, P.L., Jordan, M.I., McAuliffe, J.D.: Convexity, classification and risk bounds. J. Am. Stat. Assoc. 101(473), 138–156 (2006)
  • Seeger, M.: PAC-Bayesian generalization error bounds for Gaussian process classification. J. Mach. Learn. Res. 3, 233–269 (2002)
  • Shalizi, C.: Dynamics of Bayesian updating with dependent data and misspecified models. Electronic Journal of Statistics 3, 1039–1074 (2009)
  • Takeuchi, J., Barron, A.R.: Robustly minimax codes for universal data compression. In: Proc. ISITA 1998, Japan (1998)
  • van der Vaart, A.: Asymptotic Statistics. Cambridge University Press (1998)
  • Vovk, V.: Aggregating strategies. In: Proc. COLT 1990, pp. 371–383 (1990)
  • Vovk, V.: Competitive on-line statistics. Intern. Stat. Rev. 69, 213–248 (2001)
  • Zhang, T.: From ε-entropy to KL entropy: analysis of minimum information complexity density estimation. Ann. Stat. 34(5), 2180–2210 (2006a)
  • Zhang, T.: Information theoretical upper and lower bounds for statistical estimation. IEEE T. Inform. Theory 52(4), 1307–1321 (2006b)


Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Grünwald, P. (2012). The Safe Bayesian. In: Bshouty, N.H., Stoltz, G., Vayatis, N., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2012. Lecture Notes in Computer Science, vol 7568. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34106-9_16

  • DOI: https://doi.org/10.1007/978-3-642-34106-9_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34105-2

  • Online ISBN: 978-3-642-34106-9

  • eBook Packages: Computer Science, Computer Science (R0)
