Kernel Methods

  • Ke-Lin Du
  • M. N. S. Swamy
Chapter

Abstract

This chapter introduces the basics of the kernel method and describes kernel extensions of several traditional methods. The support vector machine (SVM) is covered in the next chapter.

Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, Concordia University, Montreal, Canada
  2. Xonlink Inc., Hangzhou, China
