Abstract
In this paper, we propose the canonical correlation kernel (CCK), which integrates the advantages of a lower-dimensional representation of videos with a discriminative classifier such as an SVM. In the process of defining the kernel, we learn a low-dimensional (linear as well as nonlinear) representation of the video data, which is originally represented as a tensor. We densely compute features at the single-frame (or two-frame) level and avoid any explicit tracking. The tensor representation provides a holistic view of the video data and is the starting point for computing the CCK. Our kernel is defined in terms of the principal angles between the lower-dimensional representations of the tensors and captures the similarity of two videos efficiently. We test our approach on four public data sets and demonstrate consistently superior results over state-of-the-art methods, including those that use canonical correlations.
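The abstract does not give the exact form of the CCK, so the following is only an illustrative sketch of a kernel built from principal angles, under several assumptions: each video is flattened to a pixels-by-frames matrix rather than kept as a tensor, the low-dimensional representation is a linear subspace obtained by truncated SVD, and the kernel value is the sum of squared canonical correlations (cosines of the principal angles). The function names and the `dim` parameter are hypothetical and not taken from the paper.

```python
import numpy as np


def orthonormal_basis(X, dim):
    """Orthonormal basis (as columns) of the top-`dim` left singular subspace of X."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :dim]


def canonical_correlations(A, B):
    """Cosines of the principal angles between span(A) and span(B).

    A and B must have orthonormal columns; the singular values of A^T B
    are then the cosines of the principal angles.
    """
    s = np.linalg.svd(A.T @ B, compute_uv=False)
    # Guard against values slightly above 1 caused by round-off.
    return np.clip(s, 0.0, 1.0)


def subspace_kernel(X1, X2, dim=3):
    """Assumed similarity: sum of squared canonical correlations.

    This is a stand-in for illustration, not the paper's CCK definition.
    """
    A = orthonormal_basis(X1, dim)
    B = orthonormal_basis(X2, dim)
    cos = canonical_correlations(A, B)
    return float(np.sum(cos ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video_a = rng.standard_normal((50, 10))  # 50 "pixels", 10 frames
    video_b = rng.standard_normal((50, 10))
    print(subspace_kernel(video_a, video_a))
    print(subspace_kernel(video_a, video_b))
```

A Gram matrix of such values over a training set could then be passed to an SVM that accepts precomputed kernels. Under these assumptions the kernel of a video with itself equals `dim`, since all principal angles are zero.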
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nagendar, G., Ganesh Bandiatmakuri, S., Goud Tandarpally, M., Jawahar, C.V. (2013). Action Recognition Using Canonical Correlation Kernels. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds) Computer Vision – ACCV 2012. ACCV 2012. Lecture Notes in Computer Science, vol 7726. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37431-9_37
DOI: https://doi.org/10.1007/978-3-642-37431-9_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37430-2
Online ISBN: 978-3-642-37431-9
eBook Packages: Computer Science (R0)