Benchmarking Datasets for Human Activity Recognition

  • Haowei Liu
  • Rogerio Feris
  • Ming-Ting Sun

Abstract

Recognizing human activities has become an important topic in the past few years. A variety of techniques for representing and modeling different human activities have been proposed, achieving reasonable performance in many scenarios. Meanwhile, a number of benchmark datasets have been collected and published. Unlike other chapters, which focus on algorithmic aspects, this chapter gives an overview of different benchmarking datasets, summarizes the performance of state-of-the-art algorithms, and analyzes these datasets.

Keywords

Video Sequence; False Alarm Rate; Activity Recognition; Human Activity Recognition; Golf Swing
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  1. University of Washington, Seattle, USA
  2. IBM T.J. Watson Research Center, Hawthorne, USA
