Neural Processing Letters, Volume 36, Issue 3, pp 257–273

Discriminant Kernel Learning Using Hybrid Regularization

  • Jun Liang
  • Long Chen
  • Xiaobo Chen


Kernel discriminant analysis (KDA) is one of the state-of-the-art kernel-based methods for pattern classification and dimensionality reduction. It performs linear discriminant analysis in the feature space induced by a kernel function. However, the performance of KDA depends heavily on selecting a suitable kernel for the learning task of interest. In this paper, we propose a novel algorithm, termed elastic multiple kernel discriminant analysis (EMKDA), which uses hybrid regularization to automatically learn the kernel as a linear combination of pre-specified kernel functions. EMKDA uses a mixed-norm regularization function to balance sparsity and non-sparsity of the kernel weights. A semi-infinite program based algorithm is then proposed to solve EMKDA. Extensive experiments on synthetic datasets, UCI benchmark datasets, and digit and terrain databases are conducted to show the effectiveness of the proposed method.
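The two ingredients named in the abstract, a linear combination of pre-specified kernels and a penalty that mixes sparse and non-sparse norms of the kernel weights, can be illustrated with a minimal sketch. The abstract does not state the exact regularizer or constraints used by EMKDA, so the penalty below is the generic elastic-net form (an L1 term plus a squared L2 term) and is an assumption. The function names, the Gaussian base kernels, and the `mix` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix over the rows of X for a given gamma."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def combine_kernels(kernels, weights):
    """Weighted sum of pre-specified kernel matrices (non-negative weights assumed)."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be non-negative")
    return np.tensordot(weights, np.stack(kernels), axes=1)

def elastic_net_penalty(weights, mix):
    """Assumed generic form: mix * ||w||_1 + (1 - mix) * ||w||_2^2."""
    weights = np.asarray(weights, dtype=float)
    return mix * np.sum(np.abs(weights)) + (1.0 - mix) * np.sum(weights ** 2)

# Toy data and three candidate kernels with different bandwidths (illustrative values).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
kernels = [rbf_kernel(X, g) for g in (0.1, 1.0, 10.0)]
weights = [0.5, 0.5, 0.0]  # a sparse weight vector: the third kernel is switched off
K = combine_kernels(kernels, weights)
penalty = elastic_net_penalty(weights, mix=0.5)
```

In this sketch, `mix=1.0` reduces the penalty to a pure L1 norm, which favours sparse kernel weights, while `mix=0.0` gives a pure squared L2 norm, which favours spreading weight across all kernels. The combined matrix `K` could then be passed to any standard KDA solver in place of a single hand-picked kernel.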


Keywords: Kernel discriminant analysis; Multiple kernel learning; Elastic-net regularization; Semi-infinite program





Copyright information

© Springer Science+Business Media, LLC. 2012

Authors and Affiliations

  1. Automotive Engineering Research Institute, Jiangsu University, Zhenjiang, People’s Republic of China
  2. School of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang, People’s Republic of China
