Scale Invariant Action Recognition Using Compound Features Mined from Dense Spatio-temporal Corners

Gilbert, Andrew; Illingworth, John; Bowden, Richard

doi:10.1007/978-3-540-88682-2_18

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5302)

Abstract

The use of sparse invariant features to recognise classes of actions or objects has become common in the literature. However, features are often "engineered" to be both sparse and invariant to transformation, and it is assumed that they provide the greatest discriminative information. To tackle activity recognition, we propose learning compound features that are assembled from simple 2D corners in both space and time. Each corner is encoded in relation to its neighbours, and compound features are extracted by data mining from an overcomplete set (in excess of 1 million possible features). The final classifier, consisting of sets of compound features, can then be applied to recognise and localise an activity in real time while providing superior performance to other state-of-the-art approaches (including those based upon sparse feature detectors). Furthermore, the approach requires only weak supervision in the form of class labels for each training sequence. No ground truth position or temporal alignment is required during training.
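
The mining step described above can be illustrated with a small sketch. The abstract says only that compound features are extracted "using data mining", so the following assumes an Apriori-style frequent itemset search; it is not the authors' implementation. The function name `mine_compound_features`, the thresholds, the string item ids standing in for encoded corners, and the toy class labels are all hypothetical. Each training sequence contributes transactions labelled only with its class, which mirrors the weak supervision the abstract mentions.

```python
from collections import Counter
from itertools import combinations


def mine_compound_features(transactions, target, min_support=0.6,
                           min_confidence=0.8, max_size=3):
    """Apriori-style sketch (an assumption, not the paper's exact method).

    transactions: list of (frozenset of item ids, class label) pairs, where
    each item id stands in for one encoded corner (hypothetical encoding).
    Returns {itemset: (support within target class, confidence)}.
    """
    n_target = sum(1 for _, label in transactions if label == target)
    if n_target == 0:
        return {}

    # Level 1 candidates: every single item seen in the data.
    items = {item for itemset, _ in transactions for item in itemset}
    candidates = [frozenset([item]) for item in items]
    mined = {}

    for size in range(1, max_size + 1):
        in_target, overall = Counter(), Counter()
        for itemset, label in transactions:
            for cand in candidates:
                if cand <= itemset:  # candidate is contained in transaction
                    overall[cand] += 1
                    if label == target:
                        in_target[cand] += 1

        # Keep candidates that are frequent within the target class.
        frequent = [c for c in candidates
                    if in_target[c] > 0
                    and in_target[c] / n_target >= min_support]

        # Of those, keep the ones that mostly occur in the target class.
        for c in frequent:
            confidence = in_target[c] / overall[c]
            if confidence >= min_confidence:
                mined[c] = (in_target[c] / n_target, confidence)

        # Next level: unions one item larger whose subsets are all frequent.
        frequent_set = set(frequent)
        candidates = list({
            a | b
            for a in frequent for b in frequent
            if len(a | b) == size + 1
            and all(frozenset(s) in frequent_set
                    for s in combinations(a | b, size))
        })

    return mined


# Toy data: item ids and class labels are made up for illustration.
toy = [
    (frozenset({"a", "b", "c"}), "wave"),
    (frozenset({"a", "b"}), "wave"),
    (frozenset({"a", "b", "d"}), "wave"),
    (frozenset({"a", "d"}), "walk"),
    (frozenset({"c", "d"}), "walk"),
]
compound = mine_compound_features(toy, target="wave")
```

In this toy run, item "a" alone is frequent in the target class but also appears in the other class, so it fails the confidence threshold, while the compound {"a", "b"} passes. That is the intuition behind preferring compound over individual features: a combination can be distinctive even when one of its parts is not.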

Author information

Authors and Affiliations

CVSSP, University of Surrey, Guildford, GU2 7XH, England
Andrew Gilbert, John Illingworth & Richard Bowden

Editor information

Editors and Affiliations

Computer Science Department, University of Illinois at Urbana Champaign, 3310 Siebel Hall, Urbana, IL 61801, USA
David Forsyth
Department of Computing, Oxford Brookes University, OX33 1HX, Wheatley, Oxford, UK
Philip Torr
Department of Engineering Science, University of Oxford, Parks Road, OX1 3PJ, Oxford, UK
Andrew Zisserman

About this paper

Cite this paper

Gilbert, A., Illingworth, J., Bowden, R. (2008). Scale Invariant Action Recognition Using Compound Features Mined from Dense Spatio-temporal Corners. In: Forsyth, D., Torr, P., Zisserman, A. (eds) Computer Vision – ECCV 2008. ECCV 2008. Lecture Notes in Computer Science, vol 5302. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88682-2_18

DOI: https://doi.org/10.1007/978-3-540-88682-2_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88681-5
Online ISBN: 978-3-540-88682-2
eBook Packages: Computer Science, Computer Science (R0)