Crowd Counting from a Still Image Using Multi-scale Fully Convolutional Network with Adaptive Human-Shaped Kernel

Image and Video Technology (PSIVT 2017)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10799)

Abstract

Estimating the crowd count from a single still image with arbitrary perspective and density level is one of the main challenges in crowd analysis. Earlier techniques perform poorly in highly congested scenes containing several thousand people. To address this problem, we propose a Multi-scale Fully Convolutional Network for robust crowd counting, which obtains the count by estimating a density map. Our approach makes the following contributions: (1) An adaptive human-shaped kernel is proposed to generate the ground truth of the density map. (2) A deep, multi-scale, fully convolutional network is proposed to predict crowd counts, and a per-scale loss is used to ensure that the multi-scale strategy is effective. (3) Several strategies, e.g. deconvolution and minimizing the per-scale loss, are explored to improve the counting performance of the proposed approach. Our approach adapts to both sparse and dense scenes. In addition, it achieves state-of-the-art counting performance on benchmark datasets, including the World Expo’10, the UCF_CC_50, and the UCSD datasets.
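
The abstract describes counting through density map estimation without giving the kernel details on this page. As a rough illustration of the general idea only, the following sketch builds a ground-truth density map from point annotations using a plain fixed-size Gaussian kernel. This is an assumption made for illustration: it is not the adaptive human-shaped kernel proposed in the paper, and the names `gaussian_kernel`, `density_map`, `size`, and `sigma` are hypothetical.

```python
import numpy as np


def gaussian_kernel(size, sigma):
    """Return a (size x size) Gaussian kernel normalized to sum to 1 (size must be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def density_map(shape, points, size=15, sigma=4.0):
    """Build a density map whose integral equals the number of annotated people.

    shape  : (height, width) of the image
    points : iterable of integer (row, col) annotations inside the image
    NOTE: a fixed Gaussian is an illustrative stand-in, not the paper's kernel.
    """
    h, w = shape
    r = size // 2
    k = gaussian_kernel(size, sigma)
    dmap = np.zeros((h, w), dtype=np.float64)
    for row, col in points:
        # Clip the kernel window to the image borders.
        r0, r1 = max(row - r, 0), min(row + r + 1, h)
        c0, c1 = max(col - r, 0), min(col + r + 1, w)
        patch = k[r0 - (row - r): r1 - (row - r), c0 - (col - r): c1 - (col - r)]
        # Renormalize the visible part so each person contributes exactly 1.
        dmap[r0:r1, c0:c1] += patch / patch.sum()
    return dmap


annotations = [(0, 0), (10, 20), (31, 47)]
dmap = density_map((32, 48), annotations)
estimated_count = dmap.sum()  # integrating the map recovers the count
```

Because each annotation contributes unit mass, summing a predicted density map over the image gives the estimated crowd count, which is what makes density regression usable for counting.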

Acknowledgement

This work has been supported by the National Natural Science Foundation of China under Grant No. 61501060, the Natural Science Foundation of Jiangsu Province under Grant No. BK20150271, and the Key Laboratory for New Technology Application of Road Conveyance of Jiangsu Province under Grant BM20082061708.

Corresponding author

Correspondence to Biao Yang.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Cao, J., Yang, B., Zhang, Y., Zou, L. (2018). Crowd Counting from a Still Image Using Multi-scale Fully Convolutional Network with Adaptive Human-Shaped Kernel. In: Satoh, S. (eds) Image and Video Technology. PSIVT 2017. Lecture Notes in Computer Science, vol 10799. Springer, Cham. https://doi.org/10.1007/978-3-319-92753-4_19

  • DOI: https://doi.org/10.1007/978-3-319-92753-4_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-92752-7

  • Online ISBN: 978-3-319-92753-4

  • eBook Packages: Computer Science, Computer Science (R0)
