Body Detection in Spectator Crowd Images Using Partial Heads

  • Conference paper
  • In: Image and Video Technology (PSIVT 2019)

Abstract

In spectator crowd images, the large number of people and the small size and occlusion of body parts make body detection challenging. Because facial features vary less across people than body features do, head appearance has lower variance than body appearance. Heads are also more visible in a crowd than bodies. Consequently, detecting only the head is more reliable than detecting the full body. We show that head size and location are related to body size and location in the image, so head detections can be leveraged to detect full bodies. This paper argues that, due to limited visibility, higher variance in body features, and a lack of training data for occluded bodies, full bodies should not be detected directly in occluded scenes. Instead, the proposed strategy detects full bodies using information extracted from head detections. Moreover, the body detection technique should be robust to the level of occlusion, so we propose using only color matching for body detection; unlike CNN-based body detectors, it requires no explicit training data. To evaluate this strategy, experiments are performed on the S-HOCK spectator crowd dataset. Using partial ground-truth head information as input, full bodies in a dense crowd are detected. Experimental results show that our technique, using only head detection and color matching, can successfully detect occluded full bodies in a spectator crowd.
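The two ideas in the abstract can be sketched in a few lines: a detected head box is extrapolated to a candidate full-body box using a head-to-body size relation, and a color-similarity test (here, histogram intersection) decides whether nearby regions belong to the same body. The ratios and the similarity measure below are illustrative assumptions, not the paper's actual parameters.

```python
def head_to_body_box(head_box, width_ratio=3.0, height_ratio=6.5):
    """Extrapolate a full-body box from a head box (x, y, w, h).

    Assumes the body is roughly centered under the head and that body
    size scales linearly with head size (hypothetical ratios).
    """
    x, y, w, h = head_box
    body_w = w * width_ratio
    body_h = h * height_ratio
    body_x = x + w / 2 - body_w / 2   # center the body under the head
    body_y = y                        # body region starts at the head top
    return (body_x, body_y, body_w, body_h)


def histogram_intersection(hist_a, hist_b):
    """Similarity of two normalized color histograms (1.0 = identical)."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


# A head box of (x=100, y=50, w=20, h=25) yields a candidate body box;
# regions inside it whose color histograms score high against the torso
# histogram would be kept as part of the same body.
body = head_to_body_box((100, 50, 20, 25))
```

In this sketch, `head_to_body_box((100, 50, 20, 25))` yields `(80.0, 50, 60.0, 162.5)`: a box three head-widths wide and six and a half head-heights tall, centered under the head. Since the color-matching step is training-free, it is unaffected by how much of the body is occluded, which is the motivation stated in the abstract.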



Corresponding author

Correspondence to Yasir Jan.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Jan, Y., Sohel, F., Shiratuddin, M.F., Wong, K.W. (2019). Body Detection in Spectator Crowd Images Using Partial Heads. In: Lee, C., Su, Z., Sugimoto, A. (eds) Image and Video Technology. PSIVT 2019. Lecture Notes in Computer Science(), vol 11854. Springer, Cham. https://doi.org/10.1007/978-3-030-34879-3_6

  • DOI: https://doi.org/10.1007/978-3-030-34879-3_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34878-6

  • Online ISBN: 978-3-030-34879-3

  • eBook Packages: Computer Science; Computer Science (R0)
