Improving SNR and Reducing Training Time of Classifiers in Large Datasets via Kernel Averaging
Kernel methods are of growing importance in neuroscience research. As an elegant extension of linear methods, they can model complex non-linear relationships. However, because the kernel matrix grows quadratically with the number of instances, training classifiers on large datasets is computationally demanding. Here, a technique developed for linear classifiers is extended to kernel methods: in linearly separable data, replacing sets of instances by their averages improves the signal-to-noise ratio (SNR) and reduces data size. In kernel methods, the data may not be linearly separable in input space, but are linearly separable in the high-dimensional feature space in which kernel methods implicitly operate. It is shown that a classifier can be trained efficiently on instances averaged in feature space by averaging entries of the kernel matrix. Using artificial and publicly available data, it is shown that kernel averaging substantially improves classification performance and reduces training time, even when the data are not linearly separable in input space.
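The central identity, that the inner product of two feature-space averages equals the mean of the corresponding block of kernel entries, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the helper names `make_groups` and `average_kernel`, the group size, the RBF kernel and its `gamma` value are all assumptions chosen for illustration, and the grouping ignores class labels for brevity.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    sq_dist = (X ** 2).sum(axis=1)[:, None] + (Y ** 2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def make_groups(n, size):
    """Hypothetical helper: split indices 0..n-1 into consecutive groups."""
    return [np.arange(start, min(start + size, n)) for start in range(0, n, size)]

def average_kernel(K, row_groups, col_groups):
    """Entry (a, b) is the mean of K over rows in group a and columns in group b.

    This equals the inner product of the two group averages in feature space,
    so the averages themselves never have to be computed explicitly.
    """
    K_avg = np.empty((len(row_groups), len(col_groups)))
    for a, rows in enumerate(row_groups):
        for b, cols in enumerate(col_groups):
            K_avg[a, b] = K[np.ix_(rows, cols)].mean()
    return K_avg

rng = np.random.default_rng(0)
X = rng.normal(size=(12, 3))          # 12 toy instances, 3 features
groups = make_groups(len(X), size=4)  # 3 groups of 4 instances each

# Sanity check with a linear kernel, where the feature map is the identity:
# averaging kernel entries must match the kernel of the explicit group means.
group_means = np.stack([X[g].mean(axis=0) for g in groups])
K_lin_avg = average_kernel(X @ X.T, groups, groups)
linear_check = np.allclose(K_lin_avg, group_means @ group_means.T)

# Same operation with an RBF kernel, whose feature space is not accessible
# directly: the 12 x 12 kernel matrix shrinks to 3 x 3.
K_rbf_avg = average_kernel(rbf_kernel(X, X), groups, groups)
```

In a classification setting the groups would presumably be formed within each class, and the reduced matrix could then be passed to any classifier that accepts a precomputed kernel. For test data, the same function applies with test groups as rows and training groups as columns.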
Keywords: Kernel, Machine learning, Big data, SVM, FDA