Abstract
This chapter shows how returning to the combinatorial nature of the Vapnik–Chervonenkis bounds provides simple ways to increase their accuracy, to take into account properties of the data and of the learning algorithm, and to obtain empirically accurate estimates of the deviation between training error and test error.
Acknowledgments
This work originates in long discussions held in the 1990s with my AT&T Labs colleagues Olivier Bousquet, Corinna Cortes, John Denker, Isabelle Guyon, Yann LeCun, Sara Solla, and Vladimir Vapnik. My interest was revived in Paphos by Konstantin Vorontsov and Vladimir Vovk. I would like to thank Vladimir Vovk for convincing me to write it up and Matus Telgarsky for suggesting the use of the birthday problem to lower-bound \(\mathrm{Card}\,\varDelta_{\mathcal{A}}\) using empirical evidence.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Bottou, L. (2015). Making Vapnik–Chervonenkis Bounds Accurate. In: Vovk, V., Papadopoulos, H., Gammerman, A. (eds) Measures of Complexity. Springer, Cham. https://doi.org/10.1007/978-3-319-21852-6_9
DOI: https://doi.org/10.1007/978-3-319-21852-6_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21851-9
Online ISBN: 978-3-319-21852-6
eBook Packages: Computer Science, Computer Science (R0)