Probabilistic Discriminative Kernel Classifiers for Multi-Class Problems

  • Conference paper
Pattern Recognition (DAGM 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2191)

Abstract

Logistic regression is presumably the most popular representative of probabilistic discriminative classifiers. In this paper, a kernel variant of logistic regression is introduced as an iteratively re-weighted least-squares algorithm in kernel-induced feature spaces. This formulation allows us to apply highly efficient approximation methods that are capable of dealing with large-scale problems. For multi-class problems, a pairwise coupling procedure is proposed. Pairwise coupling for “kernelized” logistic regression effectively overcomes conceptual and numerical problems of standard multi-class kernel classifiers.
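The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a Gaussian RBF kernel, a ridge-type penalty with a hypothetical parameter `lam`, no bias term, and binary labels in {0, 1}. The coupling step follows the iterative pairwise coupling scheme of Hastie and Tibshirani, assuming equal weights for all class pairs. All function names are illustrative, and the sketch solves the full linear system in each step rather than using the approximation methods the abstract refers to.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian RBF kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def fit_klr_irls(K, y, lam=1.0, n_iter=25):
    """Kernel logistic regression fitted by IRLS (Newton steps).

    Assumed objective: -log-likelihood + (lam / 2) * a^T K a, with f = K a.
    """
    n = K.shape[0]
    a = np.zeros(n)
    for _ in range(n_iter):
        f = K @ a
        p = 1.0 / (1.0 + np.exp(-f))   # current class-1 probabilities
        w = p * (1.0 - p)              # IRLS weights
        # Newton step written as a re-weighted least-squares system:
        # (W K + lam I) a_new = W K a + (y - p)
        A = w[:, None] * K + lam * np.eye(n)
        b = w * f + (y - p)
        a = np.linalg.solve(A, b)
    return a

def pairwise_coupling(R, n_sweeps=200):
    """Couple pairwise estimates R[i, j] ~ P(class i | class i or j).

    Returns one probability per class (equal pair weights assumed).
    """
    k = R.shape[0]
    p = np.full(k, 1.0 / k)
    off = ~np.eye(k, dtype=bool)       # mask that excludes the diagonal
    for _ in range(n_sweeps):
        for i in range(k):
            mu_i = p[i] / (p[i] + p)   # model values mu_ij for all j
            p[i] *= R[i, off[i]].sum() / mu_i[off[i]].sum()
            p /= p.sum()               # renormalise after each update
    return p

# Usage on toy data: two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.repeat([0.0, 1.0], 20)
K = rbf_kernel(X, X)
alpha = fit_klr_irls(K, y)
probs = 1.0 / (1.0 + np.exp(-(K @ alpha)))
```

For a multi-class problem, one binary model of this kind would be trained per class pair, and the resulting pairwise probabilities for a test point would be collected in `R` and passed to `pairwise_coupling`.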

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Roth, V. (2001). Probabilistic Discriminative Kernel Classifiers for Multi-Class Problems. In: Radig, B., Florczyk, S. (eds) Pattern Recognition. DAGM 2001. Lecture Notes in Computer Science, vol 2191. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45404-7_33

  • DOI: https://doi.org/10.1007/3-540-45404-7_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42596-0

  • Online ISBN: 978-3-540-45404-5

  • eBook Packages: Springer Book Archive
