Abstract
We consider a class of algorithms for classification based on sequential greedy minimization of a convex upper bound on the 0-1 loss function. A large class of recently popular algorithms falls within the scope of this approach, including many variants of Boosting. The basic question addressed in this paper is the statistical consistency of such approaches. We provide precise conditions under which sequential greedy procedures are consistent, and we establish rates of convergence under the assumption that the Bayes decision boundary belongs to a certain class of smooth functions. The results rely on a form of regularization that constrains the search space at each iteration of the algorithm. A particularly interesting conclusion of our work is that Boosting based on the logistic loss yields faster rates of convergence than Boosting based on the exponential loss used in AdaBoost.
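To fix ideas, here is a minimal sketch of the setup (the notation is ours; the abstract itself does not fix symbols). Writing the margin as $u = y f(x)$, the 0-1 loss $\mathbf{1}[u \le 0]$ is bounded above by convex surrogates such as
\[
\phi_{\exp}(u) = e^{-u} \quad \text{(AdaBoost)}, \qquad
\phi_{\log}(u) = \ln\bigl(1 + e^{-u}\bigr) \quad \text{(logistic Boosting)}.
\]
Sequential greedy minimization then augments the current combination $f_{t-1}$ by a single weighted base hypothesis at each step,
\[
(\alpha_t, h_t) \in \arg\min_{\alpha,\, h \in \mathcal{H}}
\frac{1}{n} \sum_{i=1}^{n} \phi\bigl(y_i \,[\, f_{t-1}(x_i) + \alpha h(x_i) \,]\bigr),
\qquad f_t = f_{t-1} + \alpha_t h_t,
\]
with the search over $(\alpha, h)$ constrained at each iteration; that constraint is the form of regularization referred to above, whose exact specification appears in the body of the paper rather than the abstract.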
Cite this paper
Mannor, S., Meir, R., Zhang, T. (2002). The Consistency of Greedy Algorithms for Classification. In: Kivinen, J., Sloan, R.H. (eds) Computational Learning Theory. COLT 2002. Lecture Notes in Computer Science, vol. 2375. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45435-7_22