Detecting Carried Objects in Short Video Sequences

  • Dima Damen
  • David Hogg
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5304)


We propose a new method for detecting objects such as bags carried by pedestrians depicted in short video sequences. In common with earlier work [1,2] on the same problem, the method starts by averaging aligned foreground regions of a walking pedestrian to produce a representation of motion and shape (known as a temporal template) that has some immunity to noise in foreground segmentations and phase of the walking cycle. Our key novelty is for carried objects to be revealed by comparing the temporal templates against view-specific exemplars generated offline for unencumbered pedestrians. A likelihood map obtained from this match is combined in a Markov random field with a map of prior probabilities for carried objects and a spatial continuity assumption, from which we obtain a segmentation of carried objects using the MAP solution. We have re-implemented the earlier state-of-the-art method [1] and demonstrate a substantial improvement in performance for the new method on the challenging PETS2006 dataset [3]. Although developed for a specific problem, the method could be applied to the detection of irregularities in appearance for other categories of object that move in a periodic fashion.
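The first two steps described in the abstract (averaging aligned foreground masks into a temporal template, then comparing that template against an exemplar of an unencumbered pedestrian) can be illustrated with a toy sketch. This is not the authors' implementation: the function names `temporal_template` and `protrusion_map`, the 0.5 threshold, and the tiny 3x4 masks are all assumptions made for illustration, and the Markov random field stage (priors, spatial continuity, MAP segmentation) is left out entirely.

```python
def temporal_template(masks):
    """Average aligned binary foreground masks (lists of rows of 0/1).

    Each output pixel is the fraction of frames in which it was foreground.
    """
    if not masks:
        raise ValueError("need at least one mask")
    n = len(masks)
    height, width = len(masks[0]), len(masks[0][0])
    return [[sum(m[y][x] for m in masks) / n for x in range(width)]
            for y in range(height)]


def protrusion_map(template, exemplar, threshold=0.5):
    """Mark pixels where the template exceeds the exemplar by more than threshold.

    The threshold is a hypothetical stand-in for the likelihood-based
    matching described in the abstract.
    """
    return [[1 if t - e > threshold else 0 for t, e in zip(t_row, e_row)]
            for t_row, e_row in zip(template, exemplar)]


# Two already-aligned toy masks; the right-hand column of the middle row
# plays the role of a carried object that is present in every frame.
masks = [
    [[0, 1, 1, 0], [0, 1, 1, 1], [0, 1, 1, 0]],
    [[0, 1, 1, 0], [0, 1, 1, 1], [0, 1, 0, 0]],
]

# Hypothetical exemplar for an unencumbered pedestrian in the same view.
exemplar = [
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 0.5, 0.0],
]

template = temporal_template(masks)
candidates = protrusion_map(template, exemplar)
```

In this sketch, a pixel that is foreground in only some frames (such as a swinging limb) receives an intermediate template value that the exemplar can account for, while a pixel that is consistently foreground but absent from the exemplar is flagged as a candidate carried-object pixel.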

Supplementary material

978-3-540-88690-7_12_MOESM1_ESM.avi (11.8 MB)


References

  1. Haritaoglu, I., Cutler, R., Harwood, D., Davis, L.S.: Backpack: detection of people carrying objects using silhouettes. In: Proc. Int. Conf. on Computer Vision (ICCV), vol. 1, pp. 102–107 (1999)
  2. Haritaoglu, I., Harwood, D., Davis, L.: W4: real-time surveillance of people and their activities. IEEE Trans. on Pattern Analysis and Machine Intelligence 22(8), 809–830 (2000)
  3. Ferryman, J. (ed.): IEEE Int. Workshop on Performance Evaluation of Tracking and Surveillance (PETS). IEEE, New York (2006)
  4. Haritaoglu, I., Harwood, D., Davis, L.: Hydra: Multiple people detection and tracking using silhouettes. In: Proc. IEEE Workshop on Visual Surveillance (1999)
  5. Hu, W., Hu, M., Zhou, X., Tan, T., Lou, J., Maybank, S.: Principal axis-based correspondence between multiple cameras for people tracking. IEEE Trans. on Pattern Analysis and Machine Intelligence 28(4), 663–671 (2006)
  6. Benabdelkader, C., Davis, L.: Detection of people carrying objects: a motion-based recognition approach. In: Proc. Int. Conf. on Automatic Face and Gesture Recognition (FGR), pp. 378–384 (2002)
  7. Branca, A., Leo, M., Attolico, G., Distante, A.: Detection of objects carried by people. In: Proc. Int. Conf. on Image Processing (ICIP), vol. 3, pp. 317–320 (2002)
  8. Nanda, H., Benabdelkader, C., Davis, L.: Modelling pedestrian shapes for outlier detection: a neural net based approach. In: Proc. Intelligent Vehicles Symposium, pp. 428–433 (2003)
  9. Tao, D., Li, X., Maybank, S.J., Wu, X.: Human carrying status in visual surveillance. In: Proc. Computer Vision and Pattern Recognition (CVPR) (2006)
  10. Ghanem, N.M., Davis, L.S.: Human appearance change detection. In: Image Analysis and Processing (ICIAP), pp. 536–541 (2007)
  11. Dimitrijevic, M., Lepetit, V., Fua, P.: Human body pose detection using Bayesian spatio-temporal templates. Computer Vision and Image Understanding 104(2), 127–139 (2006)
  12. Rousseeuw, P.J.: Least median of squares regression. Journal of the American Statistical Association 79(388), 871–880 (1984)
  13. Fossati, A., Dimitrijevic, M., Lepetit, V., Fua, P.: Bridging the gap between detection and tracking for 3D monocular video-based motion capture. In: Proc. Computer Vision and Pattern Recognition (CVPR) (2007)
  14. Magee, D.: Tracking multiple vehicles using foreground, background and motion models. In: Proc. ECCV Workshop on Statistical Methods in Video Processing, pp. 7–12 (2002)
  15. Everingham, M., Winn, J.: The PASCAL visual object classes challenge (VOC 2007) development kit. Technical report (2007)
  16. Boykov, Y., Veksler, O., Zabih, R.: Fast approximate energy minimization via graph cuts. IEEE Trans. on Pattern Analysis and Machine Intelligence 23(11), 1222–1239 (2001)

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Dima Damen (1)
  • David Hogg (1)
  1. School of Computing, University of Leeds, UK
