Abstract
Recognizing human activities is a fundamental problem in the computer vision community and is a key step toward the automatic understanding of scenes.
Notes
- 1. In this chapter, we use interactive phrases and phrases interchangeably.
- 2. Please refer to the supplemental material to see details about the connectivity patterns of interactive phrases and attributes.
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Kong, Y., Fu, Y. (2016). Action Recognition and Human Interaction. In: Fu, Y. (eds) Human Activity Recognition and Prediction. Springer, Cham. https://doi.org/10.1007/978-3-319-27004-3_2
DOI: https://doi.org/10.1007/978-3-319-27004-3_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27002-9
Online ISBN: 978-3-319-27004-3
eBook Packages: Engineering, Engineering (R0)