Abstract
Extremely crowded scenes present unique challenges to motion-based video analysis due to the large number of pedestrians within the scene and the frequent occlusions they produce. The movement of pedestrians, however, collectively forms a spatially and temporally structured pattern in the motion of the crowd. In this work, we present a novel statistical framework for modeling this structured pattern, or steady-state, of the motion in extremely crowded scenes. Our key insight is to model the motion of the crowd by the spatial and temporal variations of local spatio-temporal motion patterns exhibited by pedestrians within the scene. We divide the video into local spatio-temporal sub-volumes and represent the movement through each sub-volume with a local spatio-temporal motion pattern. We then derive a novel, distribution-based hidden Markov model to encode the temporal variations of local spatio-temporal motion patterns. We demonstrate that, by capturing the steady-state of the motion within the scene, we can naturally detect unusual activities as statistical deviations, even in videos with complex activities that are hard for human observers to analyze.
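To make the sub-volume step concrete, the following is a minimal sketch in Python with NumPy, not the chapter's implementation. It splits a grayscale video array into non-overlapping spatio-temporal sub-volumes and summarizes each one by the mean and covariance of its intensity gradients. The abstract does not say how a local motion pattern is represented, so the gradient-based Gaussian summary, the cuboid size, and the function names `split_into_cuboids` and `gradient_gaussian` are all assumptions made for illustration.

```python
import numpy as np


def split_into_cuboids(video, size=(4, 8, 8)):
    """Split a (T, H, W) grayscale video into non-overlapping cuboids.

    Returns an array of shape (nt, nh, nw, dt, dh, dw). Trailing frames or
    pixels that do not fill a whole cuboid are dropped. The default size is
    an arbitrary illustrative choice.
    """
    dt, dh, dw = size
    T, H, W = video.shape
    nt, nh, nw = T // dt, H // dh, W // dw
    v = video[:nt * dt, :nh * dh, :nw * dw]
    # Break each axis into (block index, offset inside block) ...
    v = v.reshape(nt, dt, nh, dh, nw, dw)
    # ... then move the three block indices to the front.
    return v.transpose(0, 2, 4, 1, 3, 5)


def gradient_gaussian(cuboid):
    """Summarize one cuboid by the mean and covariance of its gradients.

    Assumption: a 3D Gaussian over (x, y, t) intensity gradients stands in
    for the "local spatio-temporal motion pattern" named in the abstract.
    """
    gt, gy, gx = np.gradient(cuboid.astype(np.float64))
    g = np.stack([gx.ravel(), gy.ravel(), gt.ravel()], axis=1)
    return g.mean(axis=0), np.cov(g, rowvar=False)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.random((10, 20, 17))        # synthetic stand-in for real footage
    cuboids = split_into_cuboids(video)
    mean, cov = gradient_gaussian(cuboids[0, 0, 0])
    print(cuboids.shape, mean.shape, cov.shape)
```

In a sketch like this, the sequence of summaries at one spatial location over time would be the observations that a temporal model, such as the hidden Markov model described above, is fit to; that modeling step is not shown here.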
Notes
1. The original videos are courtesy of Nippon Telegraph and Telephone Corporation.
Acknowledgements
This work was supported in part by Nippon Telegraph and Telephone Corporation and the National Science Foundation grants IIS-0746717 and IIS-0803670.
Copyright information
© 2011 Springer-Verlag London Limited
About this chapter
Cite this chapter
Kratz, L., Nishino, K. (2011). Spatio-Temporal Motion Pattern Models of Extremely Crowded Scenes. In: Wang, L., Zhao, G., Cheng, L., Pietikäinen, M. (eds) Machine Learning for Vision-Based Motion Analysis. Advances in Pattern Recognition. Springer, London. https://doi.org/10.1007/978-0-85729-057-1_10
DOI: https://doi.org/10.1007/978-0-85729-057-1_10
Publisher Name: Springer, London
Print ISBN: 978-0-85729-056-4
Online ISBN: 978-0-85729-057-1
eBook Packages: Computer Science, Computer Science (R0)