Machine Learning, Volume 79, Issue 1–2, pp 29–46

Decomposing the tensor kernel support vector machine for neuroscience data with structured labels

  • David R. Hardoon
  • John Shawe-Taylor


The tensor kernel has been used across the machine learning literature for a number of purposes and applications, owing to its ability to incorporate samples from multiple sources into a joint kernel-defined feature space. Despite these uses, no attempt has been made to investigate the resulting tensor weight with respect to the contribution of the individual tensor sources. Motivated by the increasing availability of neuroscience data, specifically for two-source analyses, we propose a novel approach for decomposing the resulting tensor weight into its two components without accessing the feature space. We demonstrate our method and give experimental results on paired fMRI image-stimuli data.
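As background to the abstract, the following is a minimal sketch of what a two-source tensor kernel computes. It is not the authors' decomposition method. It assumes linear kernels on each source and uses random placeholder data; the function names (`linear_gram`, `tensor_gram`, `explicit_tensor_features`) are hypothetical and chosen only for illustration. The sketch checks the standard identity that the elementwise product of the two Gram matrices equals the Gram matrix of the explicit tensor-product features, which is why the joint feature space never has to be formed.

```python
import numpy as np


def linear_gram(A):
    """Gram matrix of a linear kernel: K[i, j] = <a_i, a_j>."""
    return A @ A.T


def tensor_gram(Kx, Ky):
    """Tensor kernel Gram matrix: elementwise product of the two source kernels."""
    return Kx * Ky


def explicit_tensor_features(X, Y):
    """Row i is the flattened outer product of x_i and y_i (illustration only)."""
    return np.einsum("ip,iq->ipq", X, Y).reshape(X.shape[0], -1)


# Placeholder paired samples (assumption): 5 pairs, 3 and 4 features per source.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))
Y = rng.standard_normal((5, 4))

# Kernel route: never builds the 3 * 4 = 12 dimensional joint feature space.
K_tensor = tensor_gram(linear_gram(X), linear_gram(Y))

# Explicit route: builds the joint features, feasible only for small dimensions.
Phi = explicit_tensor_features(X, Y)
K_explicit = Phi @ Phi.T

# True when both routes produce the same Gram matrix up to rounding.
agree = np.allclose(K_tensor, K_explicit)
```

The explicit route grows with the product of the two source dimensions, which is the practical reason for working with the kernel form when the sources are high dimensional, as fMRI volumes typically are.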

Keywords: Tensor kernel, Support vector machine, Decomposition, fMRI



Copyright information

© The Author(s) 2009

Authors and Affiliations

  1. Centre for Computational Statistics and Machine Learning, Department of Computer Science, University College London, London, UK
