A SURF-Based Spatio-Temporal Feature for Feature-Fusion-Based Action Recognition

  • Akitsugu Noguchi
  • Keiji Yanai
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6553)

Abstract

In this paper, we propose a novel spatio-temporal feature that is useful for feature-fusion-based action recognition with Multiple Kernel Learning (MKL). The proposed feature is based on moving SURF interest points grouped by Delaunay triangulation and on their motion over time. Because this local spatio-temporal feature has different characteristics from holistic appearance features and motion features, combining it with visual and motion features through MKL can boost action recognition performance both for controlled videos such as the KTH dataset and for uncontrolled videos such as the YouTube dataset. We evaluate our method on the KTH dataset and the YouTube dataset. We obtain a classification rate of 94.5% on the KTH dataset, which is almost equivalent to the state of the art, and 80.4% on the YouTube dataset, which greatly outperforms the state of the art.
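
In MKL-based fusion, the kernels computed from each feature type are commonly combined as a non-negative weighted sum, with the weights learned together with the classifier. The following is a minimal sketch of the combination step only, assuming precomputed kernel matrices and hand-picked weights. The function name `combine_kernels`, the toy matrices, and the weight values are illustrative assumptions, not taken from the paper, and no weight learning is performed.

```python
def combine_kernels(kernels, weights):
    """Return the weighted sum of same-sized square kernel matrices.

    Weights must be non-negative; they are normalised to sum to 1.
    """
    if len(kernels) != len(weights):
        raise ValueError("one weight per kernel is required")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    n = len(kernels[0])
    combined = [[0.0] * n for _ in range(n)]
    for kernel, w in zip(kernels, weights):
        beta = w / total  # normalised weight for this feature's kernel
        for i in range(n):
            for j in range(n):
                combined[i][j] += beta * kernel[i][j]
    return combined


# Toy 2x2 kernel matrices standing in for three feature types.
# All values are made up for illustration.
k_spatio_temporal = [[1.0, 0.2], [0.2, 1.0]]
k_appearance = [[1.0, 0.6], [0.6, 1.0]]
k_motion = [[1.0, 0.4], [0.4, 1.0]]

# Hypothetical fixed weights; an MKL solver would learn these instead.
fused = combine_kernels(
    [k_spatio_temporal, k_appearance, k_motion],
    [0.5, 0.25, 0.25],
)
```

The fused matrix could then be passed to any SVM implementation that accepts a precomputed kernel. A feature type that is more discriminative for a given action would receive a larger weight once the weights are learned rather than fixed.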

Keywords

  • Support Vector Machine
  • Motion Vector
  • Action Recognition
  • Motion Feature
  • Interest Point

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Akitsugu Noguchi (1)
  • Keiji Yanai (1)
  1. Department of Computer Science, The University of Electro-Communications, Chofu-shi, Japan
