Abstract
Algorithms for solving multi-category classification problems using output coding have become very popular in recent years. Following initial attempts with discrete coding matrices, recent work has attempted to alleviate some of their shortcomings by considering real-valued ‘coding’ matrices. We consider an approach to multi-category classification, based on minimizing a convex upper bound on the 0-1 loss. We show that this approach is closely related to output coding, and derive data-dependent bounds on the performance. These bounds can be optimized in order to obtain effective coding matrices, which guarantee small generalization error. Moreover, our results apply directly to kernel based approaches.
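As a rough illustration of the output-coding setup the abstract describes, the sketch below decodes a multi-category prediction from binary-classifier outputs using a coding matrix and the hinge function, a standard convex upper bound on the 0-1 loss. The matrix `M`, the `decode` helper, and all values are hypothetical examples, not the paper's construction or bounds:

```python
import numpy as np

# Hypothetical coding matrix M for 4 classes and 3 binary problems.
# Row r is the code word of class r; entries are in {-1, +1}.
M = np.array([
    [+1, +1, +1],
    [+1, -1, -1],
    [-1, +1, -1],
    [-1, -1, +1],
])

def decode(f):
    """Loss-based decoding: pick the class whose code word incurs the
    smallest total surrogate loss on the real-valued binary outputs f.
    The hinge max(0, 1 - m) on each margin m = M[r, j] * f[j] is a
    convex upper bound on the 0-1 loss."""
    losses = np.maximum(0.0, 1.0 - M * f).sum(axis=1)
    return int(np.argmin(losses))

# Outputs agreeing in sign with the code word of class 2, (-1, +1, -1):
print(decode(np.array([-0.9, 0.8, -0.7])))  # -> 2
```

Real-valued decoding of this kind is what motivates replacing discrete coding matrices with continuous ones: the surrogate loss varies smoothly with the matrix entries, so the matrix itself can be optimized.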
© 2003 Springer-Verlag Berlin Heidelberg
Cite this paper
Desyatnikov, I., Meir, R. (2003). Data-Dependent Bounds for Multi-category Classification Based on Convex Losses. In: Schölkopf, B., Warmuth, M.K. (eds) Learning Theory and Kernel Machines. Lecture Notes in Computer Science(), vol 2777. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45167-9_13
Print ISBN: 978-3-540-40720-1
Online ISBN: 978-3-540-45167-9