Abstract
This paper presents an approach to collective activity recognition. Collective activities are activities performed by multiple persons, such as queueing in a line or talking together. To recognize them, the action context (AC) descriptor [1] encodes the “apparent” relation (e.g. a group crossing and facing “right”); however, this representation is sensitive to viewpoint change. We instead propose a novel feature representation called the relative action context (RAC) descriptor, which encodes the “relative” relation (e.g. a group crossing and facing the “same” direction). This representation is viewpoint invariant and complementary to AC; hence we combine the two with a simplified combinational classifier. This paper also introduces two methods to improve performance. First, to make the contexts robust to various situations, we apply post-processing. Second, to reduce local classification failures, we regularize the classification using fully connected CRFs. Experimental results show that our method is applicable to various scenes and outperforms state-of-the-art methods.
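The difference between an “apparent” and a “relative” encoding can be illustrated with a toy sketch. This is not the descriptor defined in the paper. It assumes, purely for illustration, that each person's facing direction is quantized into 8 bins and that a viewpoint change acts as a uniform shift of those bins. The names `NUM_BINS`, `apparent_context`, `relative_context`, and `rotate_view` are hypothetical.

```python
from collections import Counter

NUM_BINS = 8  # assumption: facing direction quantized into 8 bins


def apparent_context(neighbors):
    """Histogram of the neighbors' absolute facing bins (viewpoint dependent)."""
    return Counter(neighbors)


def relative_context(focal, neighbors):
    """Histogram of the neighbors' facing bins measured relative to the focal person."""
    return Counter((n - focal) % NUM_BINS for n in neighbors)


def rotate_view(direction, shift):
    """Simplified viewpoint change: shift every facing bin by the same amount."""
    return (direction + shift) % NUM_BINS


# Toy scene: the focal person and two neighbors face the same way, one faces the opposite way.
focal, neighbors = 2, [2, 2, 6]

# The same scene observed from a viewpoint rotated by three bins.
shift = 3
focal_rot = rotate_view(focal, shift)
neighbors_rot = [rotate_view(n, shift) for n in neighbors]

print(apparent_context(neighbors) == apparent_context(neighbors_rot))
print(relative_context(focal, neighbors) == relative_context(focal_rot, neighbors_rot))
```

Under these assumptions the absolute histogram changes with the viewpoint, while the relative histogram does not, because the shift cancels in the difference `n - focal`.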
References
Lan, T., Wang, Y., Mori, G.: Retrieving actions in group contexts. In: International Workshop on Sign Gesture Activity (2010)
Lan, T., Wang, Y., Yang, W., Mori, G.: Beyond actions: Discriminative models for contextual group activities. In: Adv. in NIPS 23 (2010)
Lan, T., Sigal, L., Mori, G.: Social roles in hierarchical models for human activity recognition. In: CVPR (2012)
Choi, W., Shahid, K., Savarese, S.: What are they doing?: Collective activity classification using spatio-temporal relationship among people. In: International Workshop on Visual Surveillance (2009)
Choi, W., Shahid, K., Savarese, S.: Learning context for collective activity recognition. In: CVPR (2011)
Amer, M.R., Todorovic, S.: A chains model for localizing participants of group activities in videos. In: ICCV (2011)
Kaneko, T., Shimosaka, M., Odashima, S., Fukui, R., Sato, T.: Consistent collective activity recognition with fully connected CRFs. In: ICPR (to appear, 2012)
Ryoo, M.S., Aggarwal, J.K.: Spatio-temporal relationship match: Video structure comparison for recognition of complex human activities. In: ICCV (2009)
Yang, J., Yu, K., Gong, Y., Huang, T.: Linear spatial pyramid matching using sparse coding for image classification. In: CVPR (2009)
Csurka, G., Dance, C.R., Fan, L., Willamowski, J., Bray, C.: Visual categorization with bags of keypoints. In: ECCV Workshop on Statistical Learning in Computer Vision (2004)
Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: CVPR (2006)
Krähenbühl, P., Koltun, V.: Efficient inference in fully connected CRFs with Gaussian edge potentials. In: Adv. in NIPS 24 (2011)
Zhang, Y., Chen, T.: Efficient inference for fully-connected CRFs with stationarity. In: CVPR (2012)
Dalal, N., Triggs, B.: Histogram of oriented gradients for human detection. In: CVPR (2005)
Kittler, J., Hatef, M., Duin, R.P., Matas, J.: On combining classifiers. PAMI 20, 226–239 (1998)
Boykov, Y.Y., Jolly, M.P.: Interactive graph cuts for optimal boundary and region segmentation of objects in n-d images. In: ICCV (2001)
Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R., Lin, C.J.: LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research 9, 1871–1874 (2008)
Felzenszwalb, P., McAllester, D., Ramanan, D.: A discriminatively trained, multiscale, deformable part model. In: CVPR (2008)
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kaneko, T., Shimosaka, M., Odashima, S., Fukui, R., Sato, T. (2012). Viewpoint Invariant Collective Activity Recognition with Relative Action Context. In: Fusiello, A., Murino, V., Cucchiara, R. (eds) Computer Vision – ECCV 2012. Workshops and Demonstrations. ECCV 2012. Lecture Notes in Computer Science, vol 7585. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33885-4_26
DOI: https://doi.org/10.1007/978-3-642-33885-4_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33884-7
Online ISBN: 978-3-642-33885-4
eBook Packages: Computer Science, Computer Science (R0)