Human Action Recognition in Video by Fusion of Structural and Spatio-temporal Features

Zare Borzeshi, Ehsan; Perez Concha, Oscar; Piccardi, Massimo

doi:10.1007/978-3-642-34166-3_52

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7626)

Included in the following conference series:

Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR)

Abstract

Human action recognition has received increasing attention in recent years because of its importance in many applications. Local representations, in particular space-time interest point (STIP) descriptors, have become increasingly popular for this task. Their main limitation, however, is that they do not capture the spatial relationships within the subject performing the action. This paper proposes a novel method that fuses the global spatial relationships provided by graph embedding with the local spatio-temporal information of STIP descriptors. Experiments reported in the paper on an action recognition dataset show that recognition accuracy can be significantly improved by combining the structural information with the spatio-temporal features.
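
As a rough illustration of what fusing a structural feature vector with a spatio-temporal one can look like, the sketch below normalises each vector and concatenates them (feature-level fusion). The abstract does not state the fusion rule, so the concatenation, the L2 normalisation, the function names `l2_normalize` and `fuse`, and the toy input values are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit L2 norm; eps guards against division by zero."""
    v = np.asarray(v, dtype=float)
    return v / max(np.linalg.norm(v), eps)


def fuse(structural, spatio_temporal):
    """Hypothetical feature-level fusion: normalise each block, then concatenate.

    structural      -- e.g. a graph-embedding vector (assumed input)
    spatio_temporal -- e.g. a histogram built from STIP descriptors (assumed input)
    """
    return np.concatenate([l2_normalize(structural), l2_normalize(spatio_temporal)])


# Toy values only; they do not come from the paper.
structural = np.array([3.0, 4.0])
stip_histogram = np.array([1.0, 0.0, 0.0])
fused = fuse(structural, stip_histogram)
print(fused.shape)
```

Normalising each block before concatenation keeps one feature type from dominating the other purely because of its scale; the fused vector could then be passed to any standard classifier.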

Author information

Authors and Affiliations

School of Computing and Communications, Faculty of Engineering and IT, University of Technology, Sydney (UTS), Sydney, Australia
Ehsan Zare Borzeshi & Massimo Piccardi
Centre for Health Informatics, Australian Institute of Health Innovation, University of New South Wales, Sydney (UNSW), Australia
Oscar Perez Concha

Editor information

Editors and Affiliations

Department of Computer Science, University of Auckland, Private Bag 92019, 1142, Auckland, New Zealand
Georgy Gimel’farb
Department of Computer Science, University of York, Deramore Lane, YO10 5GH, York, UK
Edwin Hancock
Institute of Media and Information Technology, Chiba University, Yayoi-cho 1-33, 263-8522, Inage-ku, Chiba, Japan
Atsushi Imiya
Technische Universität/Fraunhofer IGD, Fraunhoferstraße 5, 64283, Darmstadt, Germany
Arjan Kuijper
Graduate School of Information Science and Technology, Hokkaido University, 060-0814, Sapporo, Japan
Mineichi Kudo
Graduate School of Engineering, Tohoku University, 6-6-05 Aoba, Aramaki, Aoba-ku, 980-8579, Sendai, Miyagi, Japan
Shinichiro Omachi
Centre for Vision, Speech and Signal Processing, University of Surrey, GU2 7XH, Guildford, Surrey, UK
Terry Windeatt
C&C Innovation Research Laboratories, NEC Corporation, 8916-47 Takayama-cho, Ikoma-Shi, Nara, Japan
Keiji Yamada

Cite this paper

Zare Borzeshi, E., Perez Concha, O., Piccardi, M. (2012). Human Action Recognition in Video by Fusion of Structural and Spatio-temporal Features. In: Gimel’farb, G., et al. Structural, Syntactic, and Statistical Pattern Recognition. SSPR/SPR 2012. Lecture Notes in Computer Science, vol 7626. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34166-3_52

DOI: https://doi.org/10.1007/978-3-642-34166-3_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34165-6
Online ISBN: 978-3-642-34166-3
eBook Packages: Computer Science (R0)
