Comprehensive Representation and Efficient Extraction of Spatial Information for Human Activity Recognition from Video Data

Kalita, Shobhanjana; Karmakar, Arindam; Hazarika, Shyamanta M.

doi:10.1007/978-981-10-2107-7_8

Conference paper
First Online: 25 December 2016

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 460)

Abstract

Of late, human activity recognition (HAR) in video has generated much interest. A fundamental step is to develop a computational representation of interactions. The human body is often abstracted using minimum bounding rectangles (MBRs) and approximated as a set of MBRs corresponding to different body parts. Such approximations treat each MBR as an independent entity, which defeats the idea that these are parts of a whole body. A representation schema for interactions between entities, each of which is considered as a set of related rectangles (referred to as an extended object), holds promise. We propose an efficient representation schema for extended objects together with a simple recursive algorithm to extract spatial information. We evaluate our approach and demonstrate that, for HAR, the spatial information thus extracted leads to better models compared to CORE9 [1], a compact and comprehensive representation schema for video understanding.
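
The idea of an extended object, an entity modelled as a set of related MBRs rather than as independent rectangles, can be illustrated with a minimal sketch. This is not the authors' schema or recursive algorithm, which the abstract does not detail; the names `MBR`, `ExtendedObject`, `whole_mbr`, and `interacting_parts`, the part labels, and the coordinates are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MBR:
    """Axis-aligned minimum bounding rectangle (hypothetical helper type)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def intersects(self, other: "MBR") -> bool:
        # Closed rectangles: boxes that merely touch count as intersecting.
        return (self.x_min <= other.x_max and other.x_min <= self.x_max
                and self.y_min <= other.y_max and other.y_min <= self.y_max)


@dataclass(frozen=True)
class ExtendedObject:
    """An entity kept as a set of labelled part MBRs (assumed structure)."""
    name: str
    parts: Tuple[Tuple[str, MBR], ...]

    def whole_mbr(self) -> MBR:
        # The MBR of the whole entity is the bounding box of its part MBRs.
        boxes = [box for _, box in self.parts]
        return MBR(min(b.x_min for b in boxes), min(b.y_min for b in boxes),
                   max(b.x_max for b in boxes), max(b.y_max for b in boxes))


def interacting_parts(a: ExtendedObject, b: ExtendedObject) -> List[Tuple[str, str]]:
    """Return the (part of a, part of b) label pairs whose MBRs intersect."""
    # Cheap rejection: if the whole-entity boxes are disjoint, no parts can meet.
    if not a.whole_mbr().intersects(b.whole_mbr()):
        return []
    return [(la, lb) for la, ba in a.parts for lb, bb in b.parts
            if ba.intersects(bb)]


# Illustrative data only: a person with three parts and a single-part ball.
person = ExtendedObject("person", (
    ("head", MBR(4, 8, 6, 10)),
    ("torso", MBR(3, 3, 7, 8)),
    ("hand", MBR(7, 5, 9, 6)),
))
ball = ExtendedObject("ball", (("body", MBR(8.5, 5, 10, 6.5)),))
```

Keeping the parts grouped under one entity lets a query report which part takes part in an interaction (here, which body part meets the ball) while still allowing a quick whole-entity check first.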

Notes

1. http://www.visint.org.
2. We use I-frames obtained using the tool ffmpeg as keyframes, http://www.ffmpeg.org.

References

Cohn, A.G., Renz, J., Sridhar, M.: Thinking inside the box: A comprehensive spatial representation for video analysis. In: Proc. 13th Int. Conf. on Principles of Knowledge Representation and Reasoning (KR2012). pp. 588–592. AAAI Press (2012)
Aggarwal, J., Ryoo, M.: Human activity analysis: A review. ACM Computing Surveys 43(3), 16:1–16:43 (Apr 2011)
Dubba, K.S.R., Bhatt, M., Dylla, F., Hogg, D.C., Cohn, A.G.: Interleaved inductive-abductive reasoning for learning complex event models. In: ILP. Lecture Notes in Computer Science, vol. 7207, pp. 113–129. Springer (2012)
Kusumam, K.: Relational Learning using body parts for Human Activity Recognition in Videos. Master’s thesis, University of Leeds (2012)
Schneider, M., Behr, T.: Topological relationships between complex spatial objects. ACM Trans. Database Syst. 31(1), 39–81 (2006)
Skiadopoulos, S., Koubarakis, M.: On the consistency of cardinal directions constraints. Artificial Intelligence 163, 91–135 (2005)
Chen, L., Nugent, C., Mulvenna, M., Finlay, D., Hong, X.: Semantic smart homes: Towards knowledge rich assisted living environments. In: Intelligent Patient Management, vol. 189, pp. 279–296. Springer Berlin Heidelberg (2009)
Cohn, A.G., Hazarika, S.M.: Qualitative spatial representation and reasoning: An overview. Fundam. Inform. 46(1-2), 1–29 (2001)
Randell, D.A., Cui, Z., Cohn, A.G.: A spatial logic based on regions and connection. In: Proc. of 3rd Int. Conf. on Principles of Knowledge Representation and Reasoning (KR’92). pp. 165–176. Morgan Kaufmann (1992)
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
al Harbi, N., Gotoh, Y.: Describing spatio-temporal relations between object volumes in video streams. In: Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)
Sokeh, H.S., Gould, S., Renz, J.: Efficient extraction and representation of spatial information from video data. In: Proc. of the 23rd Int. Joint Conf. on Artificial Intelligence (IJCAI’13). pp. 1076–1082. AAAI Press/IJCAI (2013)
Fei-Fei, L., Perona, P.: A bayesian hierarchical model for learning natural scene categories. In: IEEE Comp. Soc. Conf. on Computer Vision and Pattern Recognition (CVPR). vol. 2, pp. 524–531 (2005)
Phan, X.H., Nguyen, C.T.: GibbsLDA++: A C/C++ implementation of latent Dirichlet allocation (LDA) (2007)

Author information

Authors and Affiliations

Biomimetic and Cognitive Robotics Lab, Computer Science and Engineering, Tezpur University, Tezpur, 784028, India
Shobhanjana Kalita, Arindam Karmakar & Shyamanta M. Hazarika

Corresponding author

Correspondence to Shobhanjana Kalita.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India
Balasubramanian Raman
Department of Mathematics, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India
Sanjeev Kumar
Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India
Partha Pratim Roy
Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India
Debashis Sen

About this paper

Cite this paper

Kalita, S., Karmakar, A., Hazarika, S.M. (2017). Comprehensive Representation and Efficient Extraction of Spatial Information for Human Activity Recognition from Video Data. In: Raman, B., Kumar, S., Roy, P., Sen, D. (eds) Proceedings of International Conference on Computer Vision and Image Processing. Advances in Intelligent Systems and Computing, vol 460. Springer, Singapore. https://doi.org/10.1007/978-981-10-2107-7_8

DOI: https://doi.org/10.1007/978-981-10-2107-7_8
Published: 25 December 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-2106-0
Online ISBN: 978-981-10-2107-7
eBook Packages: Engineering, Engineering (R0)
