Model Selection by Bootstrap Penalization for Classification

  • Conference paper
Learning Theory (COLT 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3120)

Abstract

We consider the binary classification problem. Given an i.i.d. sample drawn from the distribution of an \(\mathcal{X}\times\{0,1\}\)-valued random pair, we propose to estimate the so-called Bayes classifier by minimizing the sum of the empirical classification error and a penalty term based on Efron's or i.i.d. weighted bootstrap samples of the data. We obtain exponential inequalities for such bootstrap-type penalties, which allow us to derive non-asymptotic properties of the corresponding estimators. In particular, we prove that these estimators achieve the global minimax risk over sets of functions built from Vapnik-Chervonenkis classes. These results generalize those of Koltchinskii [12] and of Bartlett, Boucheron, and Lugosi [2] for Rademacher penalties, which can thus be seen as special examples of bootstrap-type penalties.
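To fix ideas, here is a minimal Python sketch of the penalized criterion the abstract describes: for each model in a family, take the empirical classification error of its empirical risk minimizer plus a Monte Carlo, Efron-bootstrap estimate of a complexity penalty, and select the model minimizing the sum. This is not code from the paper: the function names, the threshold-classifier models standing in for a VC class, the particular sign convention in the penalty, and the omission of the paper's multiplicative constants are all illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy data: X in R, Y in {0, 1}; the Bayes classifier is a threshold rule
    # corrupted by 10% label noise.
    n = 500
    X = rng.uniform(-1.0, 1.0, size=n)
    noise = rng.random(n) < 0.1
    Y = (X > 0.2).astype(int) ^ noise.astype(int)

    def model_predictions(X, n_thresholds):
        # A "model" here is a finite grid of threshold classifiers x -> 1{x > t};
        # richer models use finer grids (a stand-in for a VC class).
        ts = np.linspace(-1.0, 1.0, n_thresholds)
        return (X[None, :] > ts[:, None]).astype(int)   # shape (n_thresholds, n)

    def empirical_errors(preds, y):
        return (preds != y[None, :]).mean(axis=1)       # one error per classifier

    def efron_bootstrap_penalty(preds, y, n_boot=200):
        # Monte Carlo estimate of the penalty: conditionally on the data, average
        # over Efron resamples the supremum over the model of the gap between the
        # original and bootstrap empirical errors. (An i.i.d. weighted bootstrap
        # would instead reweight observations with independent mean-one weights.)
        n = y.shape[0]
        emp = empirical_errors(preds, y)
        sups = np.empty(n_boot)
        for b in range(n_boot):
            idx = rng.integers(0, n, size=n)            # resample with replacement
            boot = (preds[:, idx] != y[idx][None, :]).mean(axis=1)
            sups[b] = np.max(emp - boot)
        return sups.mean()

    # Penalized model selection over a nested family of models.
    best = None
    for n_thr in (2, 4, 8, 16, 32, 64):
        preds = model_predictions(X, n_thr)
        errs = empirical_errors(preds, Y)
        crit = errs.min() + efron_bootstrap_penalty(preds, Y)
        if best is None or crit < best[0]:
            best = (crit, n_thr, int(errs.argmin()))

    _, n_thr, j = best
    t_hat = np.linspace(-1.0, 1.0, n_thr)[j]
    print(f"selected model: {n_thr} thresholds; estimated threshold t = {t_hat:.3f}")

As with Rademacher penalties, the penalty is computed conditionally on the observed sample, so it adapts to the data; swapping the multinomial resampling step for i.i.d. mean-one weights gives the i.i.d. weighted bootstrap variant mentioned in the abstract.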

References

  1. Barron, A.R.: Logically smooth density estimation. Technical Report 56, Dept. of Statistics, Stanford Univ. (1985)

  2. Bartlett, P., Boucheron, S., Lugosi, G.: Model selection and error estimation. Mach. Learn. 48, 85–113 (2002)

  3. Bartlett, P., Bousquet, O., Mendelson, S.: Localized Rademacher complexities. In: Proc. of the 15th annual conf. on Computational Learning Theory, pp. 44–58 (2002)

  4. Birgé, L., Massart, P.: Minimum contrast estimators on sieves: exponential bounds and rates of convergence. Bernoulli 4, 329–375 (1998)

  5. Boucheron, S., Lugosi, G., Massart, P.: A sharp concentration inequality with applications. Random Struct. Algorithms 16, 277–292 (2000)

6. Buescher, K.L., Kumar, P.R.: Learning by canonical smooth estimation. I: Simultaneous estimation, II: Learning and choice of model complexity. IEEE Trans. Autom. Control 41, 545–556, 557–569 (1996)

  7. Devroye, L., Lugosi, G.: Lower bounds in pattern recognition and learning. Pattern Recognition 28, 1011–1018 (1995)

  8. Efron, B.: The jackknife, the bootstrap and other resampling plans. CBMS-NSF Reg. Conf. Ser. Appl. Math. 38 (1982)

9. Fromont, M.: Quelques problèmes de sélection de modèles : construction de tests adaptatifs, ajustement de pénalités par des méthodes de bootstrap (Some model selection problems: construction of adaptive tests, bootstrap penalization). Ph.D. thesis, Université Paris XI (2003)

  10. Giné, E., Zinn, J.: Bootstrapping general empirical measures. Ann. Probab. 18, 851–869 (1990)

  11. Haussler, D.: Sphere packing numbers for subsets of the Boolean n-cube with bounded Vapnik-Chervonenkis dimension. J. Comb. Theory A 69, 217–232 (1995)

  12. Koltchinskii, V.: Rademacher penalties and structural risk minimization. IEEE Trans. Inf. Theory 47, 1902–1914 (2001)

13. Koltchinskii, V., Panchenko, D.: Rademacher processes and bounding the risk of function learning. In: High dimensional probability II. 2nd international conference, Univ. of Washington, Seattle (1999)

  14. Lozano, F.: Model selection using Rademacher penalization. In: Proceedings of the 2nd ICSC Symp. on Neural Computation, Berlin, Germany (2000)

  15. Lugosi, G., Nobel, A.B.: Adaptive model selection using empirical complexities. Ann. Statist. 27, 1830–1864 (1999)

  16. Lugosi, G., Wegkamp, M.: Complexity regularization via localized random penalties. (2003) (preprint)

  17. Lugosi, G., Zeger, K.: Concept learning using complexity regularization. IEEE Trans. Inf. Theory 42, 48–54 (1996)

  18. Mammen, E., Tsybakov, A.: Smooth discrimination analysis. Ann. Statist. 27, 1808–1829 (1999)

  19. Massart, P.: Some applications of concentration inequalities to statistics. Ann. Fac. Sci. Toulouse 9, 245–303 (2000)

  20. Massart, P.: Concentration inequalities and model selection. Lectures given at the St-Flour summer school of Probability Theory, in Lect. Notes Math. (2003) (to appear)

21. Massart, P., Nédélec, E.: Risk bounds for statistical learning. (2003) (preprint)

  22. McDiarmid, C.: On the method of bounded differences. Surveys in combinatorics (Lond. Math. Soc. Lect. Notes) 141, 148–188 (1989)

  23. Præstgaard, J., Wellner, J.A.: Exchangeably weighted bootstraps of the general empirical process. Ann. Probab. 21, 2053–2086 (1993)

  24. Tsybakov, A.: Optimal aggregation of classifiers in statistical learning. (2001) (preprint)

  25. Vapnik, V.N., Chervonenkis, A.Y.: On the uniform convergence of relative frequencies of events to their probabilities. Theor. Probab. Appl. 16, 264–280 (1971)

26. Vapnik, V.N., Chervonenkis, A.Y.: Teoriya raspoznavaniya obrazov. Statisticheskie problemy obucheniya (Theory of Pattern Recognition: Statistical Problems of Learning). Nauka, Moscow (1974)

  27. Vapnik, V.N.: Estimation of dependences based on empirical data. Springer, New York (1982)

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fromont, M. (2004). Model Selection by Bootstrap Penalization for Classification. In: Shawe-Taylor, J., Singer, Y. (eds) Learning Theory. COLT 2004. Lecture Notes in Computer Science, vol 3120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27819-1_20

  • DOI: https://doi.org/10.1007/978-3-540-27819-1_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22282-8

  • Online ISBN: 978-3-540-27819-1

  • eBook Packages: Springer Book Archive
