Abstract
Interaction recognition is an important part of action recognition and has various applications such as surveillance systems, human-computer interfaces, and machine intelligence. In this paper, we propose a novel group-sparsity-optimization-based feature selection model for complex interaction recognition. First, multiple local and global features are concatenated into a feature pool; then, based on group sparsity optimization, different feature types are automatically selected to fit each specific interaction category. We test our method on a benchmark, the UT-Interaction dataset. Experimental results substantiate the effectiveness of the proposed method on complex interaction recognition tasks compared with current state-of-the-art methods.
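The abstract does not state the optimization problem, so the following is only a minimal sketch of the general idea of group-sparse selection over a concatenated feature pool. It assumes a plain least-squares loss with a group-lasso penalty solved by proximal gradient descent. This is a generic stand-in, not the authors' model. The function names, the synthetic data, the group layout, and the value of `lam` are all hypothetical.

```python
import numpy as np

def group_soft_threshold(w, groups, tau):
    """Shrink each group's weight block by its L2 norm; zero it if the norm is <= tau."""
    out = np.zeros_like(w)
    for idx in groups:
        norm = np.linalg.norm(w[idx])
        if norm > tau:
            out[idx] = (1.0 - tau / norm) * w[idx]
    return out

def group_lasso(X, y, groups, lam, n_iter=500):
    """Proximal gradient on (1/2n) * ||Xw - y||^2 + lam * sum_g ||w_g||_2."""
    n, d = X.shape
    # Step size = 1 / Lipschitz constant of the smooth part's gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = group_soft_threshold(w - step * grad, groups, step * lam)
    return w

rng = np.random.default_rng(0)
# Hypothetical feature pool: three feature types, 4 dimensions each, concatenated.
groups = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]
X = rng.standard_normal((200, 12))
w_true = np.zeros(12)
w_true[4:8] = [1.5, -2.0, 1.0, 0.5]  # only the second feature type carries signal
y = X @ w_true + 0.01 * rng.standard_normal(200)

w = group_lasso(X, y, groups, lam=0.2)
# A feature type counts as "selected" when its whole weight block is non-zero.
selected = [g for g, idx in enumerate(groups) if np.linalg.norm(w[idx]) > 1e-8]
print("selected feature groups:", selected)
```

Because the penalty acts on whole blocks of weights, an entire feature type is either kept or discarded together, which is what distinguishes group sparsity from element-wise sparsity.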
Acknowledgement
This work is supported by the National Natural Science Foundation of China (No. 61102131, 61373114, 61275099), the Natural Science Foundation of Chongqing Science and Technology Commission (No. cstc2014jcyjA40048), and the Project of Key Laboratory of Signal and Information Processing of Chongqing (No. CSTC2009CA2003).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Yang, L., Gao, C., Meng, D., Jiang, L. (2015). A Novel Group-Sparsity-Optimization-Based Feature Selection Model for Complex Interaction Recognition. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision -- ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9007. Springer, Cham. https://doi.org/10.1007/978-3-319-16814-2_33
DOI: https://doi.org/10.1007/978-3-319-16814-2_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16813-5
Online ISBN: 978-3-319-16814-2
eBook Packages: Computer Science, Computer Science (R0)