The Consistency of Greedy Algorithms for Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2375)

Abstract

We consider a class of algorithms for classification based on the sequential greedy minimization of a convex upper bound on the 0-1 loss function. A large class of recently popular algorithms falls within the scope of this approach, including many variants of Boosting. The basic question addressed in this paper is the statistical consistency of such approaches. We provide precise conditions that guarantee that sequential greedy procedures are consistent, and we establish rates of convergence under the assumption that the Bayes decision boundary belongs to a certain class of smooth functions. The results rely on a form of regularization that constrains the search space at each iteration of the algorithm. A particularly interesting conclusion of our work is that Boosting based on the logistic loss achieves faster rates of convergence than Boosting based on the exponential loss used in AdaBoost.
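To make the greedy procedure concrete, here is a minimal sketch, not the paper's procedure: the base class (threshold stumps), the bounded step-size grid standing in for the search-space constraint, and all function names below are illustrative assumptions. At each round it picks the base hypothesis and step size that most reduce the empirical surrogate risk of the current combination; swapping the surrogate between the exponential loss exp(-v) (AdaBoost-style) and the logistic loss log(1 + exp(-v)) selects between the two variants the abstract compares.

```python
import numpy as np

# Convex surrogates phi(margin) that upper-bound the 0-1 loss 1[margin <= 0].
def exp_risk(margins):
    return np.exp(-margins).mean()                    # AdaBoost-style surrogate

def logistic_risk(margins):
    return np.logaddexp(0.0, -margins).mean()         # log(1 + e^{-m}), numerically stable

def stump_outputs(X):
    """All threshold stumps h(x) = s * sign(x_j - t), s in {-1, +1}."""
    outputs = []
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            h = np.where(X[:, j] >= t, 1.0, -1.0)
            outputs.extend([h, -h])
    return np.array(outputs)                          # shape: (num_stumps, n_samples)

def greedy_minimize(X, y, risk=logistic_risk, T=25,
                    steps=np.linspace(0.05, 1.0, 20)):
    """Sequentially build f = sum_t alpha_t * h_t by greedy surrogate-risk descent.

    Restricting alpha_t to a bounded grid stands in for the kind of
    search-space constraint (regularization) applied at each iteration;
    the particular grid and stopping time T are assumptions of this sketch.
    """
    H = stump_outputs(X)
    f = np.zeros(len(y))                              # current ensemble scores f(x_i)
    for _ in range(T):
        # Greedy step: jointly pick the stump and step size that minimize
        # the empirical surrogate risk of the updated combination.
        _, a, h = min(((risk(y * (f + a * h)), a, h)
                       for h in H for a in steps),
                      key=lambda c: c[0])
        f += a * h
    return f
```

Enumerating every stump and step size is deliberately brute-force, only to make the greedy selection explicit; practical implementations pick the base hypothesis via a weak learner and the step size via line search.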

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mannor, S., Meir, R., Zhang, T. (2002). The Consistency of Greedy Algorithms for Classification. In: Kivinen, J., Sloan, R.H. (eds) Computational Learning Theory. COLT 2002. Lecture Notes in Computer Science (LNAI), vol 2375. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45435-7_22

  • DOI: https://doi.org/10.1007/3-540-45435-7_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43836-6

  • Online ISBN: 978-3-540-45435-9

  • eBook Packages: Springer Book Archive
