Optimal Oracle Inequality for Aggregation of Classifiers Under Low Noise Condition

  • Conference paper
Learning Theory (COLT 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4005)

Abstract

We consider the problem of optimality, in a minimax sense, and of adaptivity to the margin and to regularity in binary classification. We prove an oracle inequality, under the margin assumption (low noise condition), satisfied by an aggregation procedure that uses exponential weights. This oracle inequality has an optimal residual term, ((log M)/n)^{κ/(2κ−1)}, where κ is the margin parameter, M the number of classifiers to aggregate, and n the number of observations. We use this inequality first to construct minimax classifiers under margin and regularity assumptions, and second to aggregate them into a classifier that is adaptive both to the margin and to the regularity. Moreover, by aggregating plug-in classifiers (only log n of them), we provide an easily implementable classifier that is adaptive both to the margin and to the regularity.
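The aggregation procedure described in the abstract weights the M candidate classifiers by exponentiating their empirical performance on the sample. Below is a minimal sketch of that idea, assuming labels in {−1, +1}; the temperature parameter beta, the function names, and the data layout are illustrative assumptions, not the paper's exact construction or tuning.

    import numpy as np

    def exponential_weights(predictions, labels, beta=1.0):
        """Exponential weights over M classifiers from their empirical risks.

        predictions : (M, n) array of {-1, +1} predictions on the sample
        labels      : (n,) array of {-1, +1} true labels
        beta        : temperature (a free parameter in this sketch; the
                      paper derives its value from the margin assumption)
        """
        n = labels.shape[0]
        # Empirical 0-1 risk of each of the M classifiers.
        risks = np.mean(predictions != labels, axis=1)
        # Exponential weighting of the empirical risks; subtracting the
        # minimum risk before exponentiating avoids numerical underflow.
        w = np.exp(-beta * n * (risks - risks.min()))
        return w / w.sum()

    def aggregate_predict(weights, new_predictions):
        """Sign of the weighted vote: a convex combination of the classifiers."""
        scores = weights @ new_predictions  # (M,) @ (M, n_new) -> (n_new,)
        return np.where(scores >= 0.0, 1, -1)

For example, with P of shape (M, n) holding each candidate's predictions on a held-out split and y the corresponding labels, w = exponential_weights(P, y) followed by aggregate_predict(w, P_new) returns the aggregated labels on new data.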




Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lecué, G. (2006). Optimal Oracle Inequality for Aggregation of Classifiers Under Low Noise Condition. In: Lugosi, G., Simon, H.U. (eds.) Learning Theory. COLT 2006. Lecture Notes in Computer Science (LNAI), vol. 4005. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11776420_28

  • DOI: https://doi.org/10.1007/11776420_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-35294-5

  • Online ISBN: 978-3-540-35296-9

  • eBook Packages: Computer Science, Computer Science (R0)
