Efficiency of classification methods based on empirical risk minimization

Cybernetics and Systems Analysis

A binary classification problem is reduced to the minimization of convex regularized empirical risk functionals in a reproducing kernel Hilbert space. The solution is sought in the form of a finite linear combination of kernel support functions (Vapnik’s support vector machines). Estimates of the misclassification risk are obtained as a function of the training sample size and other model parameters.
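
The setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the Gaussian kernel, the hinge loss, the plain subgradient descent, and all names and parameter values (`gaussian_kernel`, `fit_kernel_classifier`, `predict`, `lam`, `gamma`, `steps`, `lr`) are assumptions chosen for illustration. The classifier has the form f(x) = sum_j alpha_j k(x_j, x), and the coefficients alpha are fitted by minimizing the regularized empirical risk (1/n) sum_i max(0, 1 - y_i f(x_i)) + lam * ||f||^2, where the squared RKHS norm equals alpha^T K alpha.

```python
import math


def gaussian_kernel(x, z, gamma=0.5):
    """Gaussian kernel k(x, z) = exp(-gamma * ||x - z||^2) (an assumed choice)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def fit_kernel_classifier(X, y, lam=0.01, gamma=0.5, steps=200, lr=0.5):
    """Subgradient descent on the regularized empirical hinge risk.

    The decision function is f(x) = sum_j alpha[j] * k(X[j], x),
    with labels y[i] in {-1, +1}. Returns the coefficient list alpha.
    """
    n = len(X)
    K = [[gaussian_kernel(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    for t in range(1, steps + 1):
        # Values of f at the training points: f = K @ alpha.
        f = [sum(K[i][j] * alpha[j] for j in range(n)) for i in range(n)]
        # Gradient of the regularizer lam * alpha^T K alpha is 2 * lam * K @ alpha.
        grad = [2.0 * lam * f[j] for j in range(n)]
        # Subgradient of the averaged hinge loss over points with margin below 1.
        for i in range(n):
            if y[i] * f[i] < 1.0:
                for j in range(n):
                    grad[j] -= y[i] * K[i][j] / n
        step = lr / math.sqrt(t)  # diminishing step size
        alpha = [a - step * g for a, g in zip(alpha, grad)]
    return alpha


def predict(alpha, X_train, x, gamma=0.5):
    """Classify x by the sign of the kernel expansion."""
    score = sum(a * gaussian_kernel(xi, x, gamma) for a, xi in zip(alpha, X_train))
    return 1 if score >= 0 else -1


# Toy training sample: two well-separated clusters in the plane.
X_train = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (3.0, 3.0), (3.0, 4.0), (4.0, 3.0)]
y_train = [-1, -1, -1, 1, 1, 1]
alpha = fit_kernel_classifier(X_train, y_train)
labels = [predict(alpha, X_train, x) for x in X_train]
```

In this sketch the regularization weight `lam` trades the fit on the training sample against the norm of f, which is the kind of model parameter the risk estimates in the abstract depend on.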

Author information

Correspondence to V. I. Norkin.

Additional information

Translated from Kibernetika i Sistemnyi Analiz, No. 5, pp. 93–105, September–October 2009.

About this article

Cite this article

Norkin, V.I., Keyzer, M.A. Efficiency of classification methods based on empirical risk minimization. Cybern Syst Anal 45, 750–761 (2009). https://doi.org/10.1007/s10559-009-9153-x

  • DOI: https://doi.org/10.1007/s10559-009-9153-x
