Abstract
Robust detection of people in video is critical in visual surveillance. In this work we present a framework for robust people detection in highly cluttered scenes with low-resolution image sequences. Our model utilises both human appearance and long-term human motion information, fused in a Bayesian framework. In particular, we introduce a spatial pyramid Gaussian Mixture approach to model variations in long-term human motion information, which is computed via improved background modeling using spatial motion constraints. Simultaneously, human appearance is modeled by histograms of oriented gradients. Experiments demonstrate that our method significantly reduces the false positive rate compared with that of a state-of-the-art human detector under very challenging lighting conditions, occlusion and background clutter.
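The Bayesian fusion of appearance and motion mentioned above can be illustrated with a minimal sketch. The abstract does not give the actual formulation, so everything here is an assumption made for illustration: the two cues are treated as conditionally independent given the class, each cue is summarised by a log-likelihood ratio (person versus background), and the function name `fuse_posterior` and its parameters are hypothetical.

```python
import math


def fuse_posterior(appearance_llr, motion_llr, prior_person=0.5):
    """Return P(person | appearance, motion) for one candidate window.

    Assumption (not from the chapter): the cues are conditionally
    independent given the class, so their log-likelihood ratios
    log p(x | person) / p(x | background) simply add to the prior log-odds.
    """
    if not 0.0 < prior_person < 1.0:
        raise ValueError("prior_person must lie strictly between 0 and 1")

    prior_log_odds = math.log(prior_person / (1.0 - prior_person))
    log_odds = prior_log_odds + appearance_llr + motion_llr

    # Numerically stable logistic function to map log-odds to a probability.
    if log_odds >= 0.0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


if __name__ == "__main__":
    # A window that looks person-like but whose motion history resembles
    # background is pushed below 0.5 once the cues are combined.
    print(fuse_posterior(appearance_llr=2.0, motion_llr=-3.0))
    # Agreement between the two cues raises the posterior.
    print(fuse_posterior(appearance_llr=2.0, motion_llr=1.0))
```

Under these assumptions, a static appearance detector firing on clutter can be overruled by a motion model that assigns the same window a low person likelihood, which is the intuition behind the reported reduction in false positives.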
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Zhang, J., Gong, S. (2011). Fusion of Motion and Appearance for Robust People Detection in Cluttered Scenes. In: Zhang, J., Shao, L., Zhang, L., Jones, G.A. (eds) Intelligent Video Event Analysis and Understanding. Studies in Computational Intelligence, vol 332. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17554-1_6
DOI: https://doi.org/10.1007/978-3-642-17554-1_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17553-4
Online ISBN: 978-3-642-17554-1
eBook Packages: Engineering, Engineering (R0)