Abstract
We develop methods for action retrieval from surveillance video using contextual feature representations. Our approach makes two contributions. First, we introduce a new feature representation called the action context (AC) descriptor. The AC descriptor encodes not only the action of an individual person in the video, but also the behaviour of other people nearby. This representation is motivated by the observation that what other people are doing provides useful cues for recognizing the action of each individual. Second, we formulate the problem as a retrieval/ranking task, in contrast to previous work that treats it as action classification. We develop an action retrieval technique based on rank-SVM, a state-of-the-art approach for solving ranking problems. We apply our approach to two real-world datasets. The first consists of videos of multiple people performing several group activities. The second consists of surveillance videos from a nursing home environment. Our experimental results show the advantage of using contextual information for disambiguating different actions, and the benefit of using rank-SVMs instead of regular SVMs for video retrieval problems.
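To make the ranking formulation concrete, the following is a minimal sketch of a pairwise rank-SVM, not the authors' implementation. It assumes each video clip is already summarised by a fixed-length feature vector (the abstract does not specify how the AC descriptor is computed, so the toy vectors here are invented), and it trains a linear scoring function by plain subgradient descent on the pairwise hinge loss. The function names `pairwise_differences`, `train_rank_svm` and `rank`, and the values of `C`, `lr` and `epochs`, are all hypothetical choices for illustration.

```python
import numpy as np

def pairwise_differences(X, relevance):
    """Return x_i - x_j for every pair where item i is more relevant than item j."""
    diffs = []
    for i in range(len(X)):
        for j in range(len(X)):
            if relevance[i] > relevance[j]:
                diffs.append(X[i] - X[j])
    return np.array(diffs)

def train_rank_svm(X, relevance, C=1.0, lr=0.01, epochs=500):
    """Minimise 0.5*||w||^2 + C * sum(max(0, 1 - w.d)) over difference vectors d."""
    D = pairwise_differences(X, relevance)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        margins = D @ w
        violated = D[margins < 1.0]          # pairs ranked with margin below 1
        grad = w - C * violated.sum(axis=0)  # subgradient of the objective
        w -= lr * grad
    return w

def rank(X, w):
    """Indices of the rows of X sorted from highest to lowest score."""
    return np.argsort(-(X @ w))

# Toy data (assumed): rows 0 and 1 are relevant to the query action, rows 2 and 3 are not.
X = np.array([[1.0, 0.2], [0.8, 0.1], [0.2, 0.9], [0.1, 0.7]])
relevance = np.array([1, 1, 0, 0])

w = train_rank_svm(X, relevance)
order = rank(X, w)
```

Unlike a standard SVM, which is trained to separate relevant from irrelevant items, this objective only penalises pairs that are ordered incorrectly or with too small a margin, which matches the retrieval setting where the quality of the ranked list is what matters.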
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lan, T., Wang, Y., Mori, G., Robinovitch, S.N. (2012). Retrieving Actions in Group Contexts. In: Kutulakos, K.N. (eds) Trends and Topics in Computer Vision. ECCV 2010. Lecture Notes in Computer Science, vol 6553. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35749-7_14
DOI: https://doi.org/10.1007/978-3-642-35749-7_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35748-0
Online ISBN: 978-3-642-35749-7
eBook Packages: Computer Science, Computer Science (R0)