Body Detection in Spectator Crowd Images Using Partial Heads

  • Conference paper
  • In: Image and Video Technology (PSIVT 2019)

Abstract

In spectator crowd images, the large number of people and the small size and occlusion of body parts make body detection challenging. Because facial features vary less across people than body features do, head appearance has lower variance than body appearance. Heads are also more visible in a crowd than bodies. Consequently, detecting only the head is more reliable than detecting the full body. We show that head size and location are related to body size and location in the image, so head detections can be leveraged to detect full bodies. This paper argues that, due to limited visibility, higher variance in body features, and a lack of training data for occluded bodies, full bodies should not be detected directly in occluded scenes. Instead, the proposed strategy detects full bodies using information extracted from head detections. Moreover, the body detection technique should be robust to the level of occlusion, so we propose using only color matching for body detection; unlike CNN-based body detectors, it requires no explicit training data. To evaluate this strategy, experiments are performed on the S-HOCK spectator crowd dataset. Using partial ground-truth head information as input, full bodies in a dense crowd are detected. Experimental results show that our technique, using only head detection and color matching, can successfully detect occluded full bodies in a spectator crowd.
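The two ideas in the abstract can be sketched in a few lines: a detected head box is extrapolated to a candidate full-body box using a head-to-body size relation, and a color-similarity test (here, histogram intersection) decides whether nearby regions belong to the same body. The ratios and the similarity measure below are illustrative assumptions, not the paper's actual parameters.

```python
def head_to_body_box(head_box, width_ratio=3.0, height_ratio=6.5):
    """Extrapolate a full-body box from a head box (x, y, w, h).

    Assumes the body is roughly centered under the head and that body
    size scales linearly with head size (hypothetical ratios).
    """
    x, y, w, h = head_box
    body_w = w * width_ratio
    body_h = h * height_ratio
    body_x = x + w / 2 - body_w / 2   # center the body under the head
    body_y = y                        # body region starts at the head top
    return (body_x, body_y, body_w, body_h)


def histogram_intersection(hist_a, hist_b):
    """Similarity of two normalized color histograms (1.0 = identical)."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


# A head box of (x=100, y=50, w=20, h=25) yields a candidate body box;
# regions inside it whose color histograms score high against the torso
# histogram would be kept as part of the same body.
body = head_to_body_box((100, 50, 20, 25))
```

In this sketch, `head_to_body_box((100, 50, 20, 25))` yields `(80.0, 50, 60.0, 162.5)`: a box three head-widths wide and six and a half head-heights tall, centered under the head. Since the color-matching step is training-free, it is unaffected by how much of the body is occluded, which is the motivation stated in the abstract.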



Corresponding author

Correspondence to Yasir Jan.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Jan, Y., Sohel, F., Shiratuddin, M.F., Wong, K.W. (2019). Body Detection in Spectator Crowd Images Using Partial Heads. In: Lee, C., Su, Z., Sugimoto, A. (eds) Image and Video Technology. PSIVT 2019. Lecture Notes in Computer Science(), vol 11854. Springer, Cham. https://doi.org/10.1007/978-3-030-34879-3_6

  • DOI: https://doi.org/10.1007/978-3-030-34879-3_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34878-6

  • Online ISBN: 978-3-030-34879-3

  • eBook Packages: Computer Science; Computer Science (R0)
