Abstract
We consider the binary classification problem. Given an i.i.d. sample drawn from the distribution of an \(\mathcal{X}\times\{0,1\}\)-valued random pair, we propose to estimate the so-called Bayes classifier by minimizing the sum of the empirical classification error and a penalty term based on Efron's or i.i.d. weighted bootstrap samples of the data. We obtain exponential inequalities for such bootstrap-type penalties, which allow us to derive non-asymptotic properties of the corresponding estimators. In particular, we prove that these estimators achieve the global minimax risk over sets of functions built from Vapnik–Chervonenkis classes. These results generalize those of Koltchinskii [12] and of Bartlett, Boucheron, and Lugosi [2] for Rademacher penalties, which can thus be seen as special examples of bootstrap-type penalties.
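To make the general idea concrete, the following is a minimal illustrative sketch (not the paper's actual procedure or constants) of penalized model selection with an Efron-bootstrap penalty. Each candidate "model" is represented, for simplicity, as a finite set of classifiers given by their predictions on the sample; the penalty for a model is a Monte Carlo estimate of the expected supremum, over the model, of the gap between the empirical error and its Efron (multinomial-weighted) bootstrap counterpart. All function names and the number of bootstrap replicates are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

def empirical_error(preds, y, weights=None):
    """(Weighted) fraction of misclassified points."""
    errs = (preds != y).astype(float)
    if weights is None:
        return errs.mean()
    return np.average(errs, weights=weights)  # weights are normalized internally

def efron_bootstrap_penalty(preds, y, n_boot=200):
    """Monte Carlo estimate of E*[ sup_f ( P_n(err_f) - P_n*(err_f) ) ],
    where P_n* reweights the sample with Efron (multinomial) bootstrap weights."""
    n = len(y)
    emp = np.array([empirical_error(p, y) for p in preds])
    sups = []
    for _ in range(n_boot):
        w = rng.multinomial(n, np.full(n, 1.0 / n))  # Efron bootstrap weights
        boot = np.array([empirical_error(p, y, weights=w) for p in preds])
        sups.append(np.max(emp - boot))
    return float(np.mean(sups))

def penalized_model_selection(models, y, n_boot=200):
    """For each model (a 2-D array: one row of predictions per classifier),
    take the empirical risk minimizer, then pick the model minimizing
    (ERM error + bootstrap penalty). Returns (model index, criterion)."""
    best = None
    for m, preds in enumerate(models):
        errs = np.array([empirical_error(p, y) for p in preds])
        crit = errs.min() + efron_bootstrap_penalty(preds, y, n_boot)
        if best is None or crit < best[1]:
            best = (m, crit)
    return best
```

Richer models tend to incur larger penalties because the supremum is taken over more classifiers, which is the mechanism that trades fit against complexity here; the paper's contribution is to show that penalties of this bootstrap type yield estimators with non-asymptotic, minimax-optimal risk bounds.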
References
Barron, A.R.: Logically smooth density estimation. Technical Report 56, Dept. of Statistics, Stanford Univ. (1985)
Bartlett, P., Boucheron, S., Lugosi, G.: Model selection and error estimation. Mach. Learn. 48, 85–113 (2002)
Bartlett, P., Bousquet, O., Mendelson, S.: Localized Rademacher complexities. In: Proc. of the 15th annual conf. on Computational Learning Theory, pp. 44–58 (2002)
Birgé, L., Massart, P.: Minimum contrast estimators on sieves: exponential bounds and rates of convergence. Bernoulli 4, 329–375 (1998)
Boucheron, S., Lugosi, G., Massart, P.: A sharp concentration inequality with applications. Random Struct. Algorithms 16, 277–292 (2000)
Buescher, K.L., Kumar, P.R.: Learning by canonical smooth estimation. I: Simultaneous estimation. II: Learning and choice of model complexity. IEEE Trans. Autom. Control 41, 545–556, 557–569 (1996)
Devroye, L., Lugosi, G.: Lower bounds in pattern recognition and learning. Pattern Recognition 28, 1011–1018 (1995)
Efron, B.: The jackknife, the bootstrap and other resampling plans. CBMS-NSF Reg. Conf. Ser. Appl. Math. 38 (1982)
Fromont, M.: Quelques problèmes de sélection de modèles : construction de tests adaptatifs, ajustement de pénalités par des méthodes de bootstrap (Some model selection problems: construction of adaptive tests, bootstrap penalization). Ph. D. thesis, Université Paris XI (2003)
Giné, E., Zinn, J.: Bootstrapping general empirical measures. Ann. Probab. 18, 851–869 (1990)
Haussler, D.: Sphere packing numbers for subsets of the Boolean n-cube with bounded Vapnik-Chervonenkis dimension. J. Comb. Theory A 69, 217–232 (1995)
Koltchinskii, V.: Rademacher penalties and structural risk minimization. IEEE Trans. Inf. Theory 47, 1902–1914 (2001)
Koltchinskii, V., Panchenko, D.: Rademacher processes and bounding the risk of function learning. In: High dimensional probability II. 2nd international conference, Univ. of Washington (1999)
Lozano, F.: Model selection using Rademacher penalization. In: Proceedings of the 2nd ICSC Symp. on Neural Computation, Berlin, Germany (2000)
Lugosi, G., Nobel, A.B.: Adaptive model selection using empirical complexities. Ann. Statist. 27, 1830–1864 (1999)
Lugosi, G., Wegkamp, M.: Complexity regularization via localized random penalties. (2003) (preprint)
Lugosi, G., Zeger, K.: Concept learning using complexity regularization. IEEE Trans. Inf. Theory 42, 48–54 (1996)
Mammen, E., Tsybakov, A.: Smooth discrimination analysis. Ann. Statist. 27, 1808–1829 (1999)
Massart, P.: Some applications of concentration inequalities to statistics. Ann. Fac. Sci. Toulouse 9, 245–303 (2000)
Massart, P.: Concentration inequalities and model selection. Lectures given at the St-Flour summer school of Probability Theory, in Lect. Notes Math. (2003) (to appear)
Massart, P., Nédélec, É.: Risk bounds for statistical learning. (2003) (preprint)
McDiarmid, C.: On the method of bounded differences. Surveys in combinatorics (Lond. Math. Soc. Lect. Notes) 141, 148–188 (1989)
Præstgaard, J., Wellner, J.A.: Exchangeably weighted bootstraps of the general empirical process. Ann. Probab. 21, 2053–2086 (1993)
Tsybakov, A.: Optimal aggregation of classifiers in statistical learning. (2001) (preprint)
Vapnik, V.N., Chervonenkis, A.Y.: On the uniform convergence of relative frequencies of events to their probabilities. Theor. Probab. Appl. 16, 264–280 (1971)
Vapnik, V.N., Chervonenkis, A.Y.: Teoriya raspoznavaniya obrazov. Statisticheskie problemy obucheniya (Theory of pattern recognition: statistical problems of learning). Nauka, Moscow (1974)
Vapnik, V.N.: Estimation of dependences based on empirical data. Springer, New York (1982)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Fromont, M. (2004). Model Selection by Bootstrap Penalization for Classification. In: Shawe-Taylor, J., Singer, Y. (eds) Learning Theory. COLT 2004. Lecture Notes in Computer Science, vol. 3120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27819-1_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22282-8
Online ISBN: 978-3-540-27819-1