Spatial-Temporal Granularity-Tunable Gradients Partition (STGGP) Descriptors for Human Detection

  • Yazhou Liu
  • Shiguang Shan
  • Xilin Chen
  • Janne Heikkila
  • Wen Gao
  • Matti Pietikainen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6311)


This paper presents a novel descriptor for human detection in video sequences. It is referred to as the spatial-temporal granularity-tunable gradients partition (STGGP) descriptor, an extension of the granularity-tunable gradients partition (GGP) from the still-image domain to the spatial-temporal domain. Specifically, the moving human body is treated as a three-dimensional entity in the spatial-temporal domain, and a generalized plane, defined in 3D Hough space, serves as the primitive for parsing the structure of this entity. The advantage of the generalized plane is that it tolerates imperfect planes with a certain level of uncertainty in rotation and translation. This robustness is controlled quantitatively by granularity parameters defined explicitly in the generalized plane, which lets the STGGP descriptor represent both the deterministic structures and the statistical summarizations of the object. Moreover, the descriptor encodes heterogeneous information, including the strength, position, and distribution of the gradients as well as their temporal motion, to enrich its representational ability. We evaluate STGGP on human detection in video sequences from public datasets and obtain promising results.
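The idea of a plane that tolerates bounded rotation and translation can be illustrated with a small NumPy sketch. This is not the formulation used in the paper: the function names `generalized_plane_mask` and `summarize`, the tolerance parameters `g_angle` and `g_rho`, and the choice of gradient strength and centroid as summaries are all assumptions made for illustration. The sketch selects voxels of a video volume whose spatio-temporal gradient direction lies within `g_angle` of a plane normal and whose position lies within `g_rho` of the plane, then summarizes the selected gradients.

```python
import numpy as np

def generalized_plane_mask(volume, normal, rho, g_angle, g_rho):
    """Select voxels that roughly fit the plane normal . p = rho.

    volume  : 3D array indexed as (t, y, x)
    normal  : plane normal in (t, y, x) order (normalized internally)
    rho     : offset of the plane along the normal
    g_angle : hypothetical rotation tolerance, in radians
    g_rho   : hypothetical translation tolerance, in voxels
    """
    normal = np.asarray(normal, dtype=float)
    normal = normal / np.linalg.norm(normal)
    # Spatio-temporal gradients along t, y and x.
    gt, gy, gx = np.gradient(volume.astype(float))
    grad = np.stack([gt, gy, gx], axis=-1)
    mag = np.linalg.norm(grad, axis=-1)
    # Sign-insensitive cosine between gradient direction and plane normal.
    cos = np.abs(grad @ normal) / np.maximum(mag, 1e-12)
    angle_ok = (mag > 0) & (cos >= np.cos(g_angle))
    # Distance of every voxel position from the plane.
    pos = np.stack(np.indices(volume.shape), axis=-1).astype(float)
    dist_ok = np.abs(pos @ normal - rho) <= g_rho
    return angle_ok & dist_ok, mag

def summarize(mask, mag):
    """Toy summary: total gradient strength and strength-weighted centroid."""
    weights = mag[mask]
    strength = float(weights.sum())
    if strength == 0.0:
        return strength, None
    coords = np.argwhere(mask).astype(float)  # same C order as mag[mask]
    centroid = (coords * weights[:, None]).sum(axis=0) / strength
    return strength, centroid

# Synthetic example: a vertical edge moving one pixel per frame to the right.
T, H, W = 6, 5, 12
t, y, x = np.indices((T, H, W))
volume = (x >= 4 + t).astype(float)

# The moving edge sweeps the plane x - t = const, with normal (-1, 0, 1).
mask, mag = generalized_plane_mask(
    volume, normal=(-1, 0, 1), rho=3.5 / np.sqrt(2),
    g_angle=np.radians(5), g_rho=1.0)
strength, centroid = summarize(mask, mag)
print(mask.sum(), strength, centroid)
```

In this sketch, small values of `g_angle` and `g_rho` pick out one specific moving edge, while large values pool gradients over a broad range of orientations and positions, which is one way to read the contrast between deterministic structure and statistical summarization described above.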


Keywords: Object Detection, Generalize Plane, Motion Information, Space Partition, Human Detection



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Yazhou Liu (1, 2)
  • Shiguang Shan (1, 2)
  • Xilin Chen (1, 2)
  • Janne Heikkila (1, 2)
  • Wen Gao (1, 2)
  • Matti Pietikainen (1, 2)

  1. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences (CAS), China
  2. Machine Vision Group, Department of Electrical and Information Engineering, University of Oulu, Finland
