Abstract
Estimating the crowd count from a still image with arbitrary perspective and density level is one of the central challenges in crowd analysis. Earlier techniques perform poorly in highly congested scenes containing several thousand people. To address this problem, we propose a Multi-scale Fully Convolutional Network for robust crowd counting, which works by estimating a density map. Our contributions are as follows: (1) An adaptive human-shaped kernel is proposed to generate the ground-truth density map. (2) A deep, multi-scale, fully convolutional network is proposed to predict crowd counts, with a per-scale loss used to ensure the effectiveness of the multi-scale strategy. (3) Several refinements, e.g. deconvolution and minimizing the per-scale loss, are explored to improve the counting performance of the proposed approach. Our approach adapts to both sparse and dense scenes. In addition, it achieves state-of-the-art counting performance on benchmark datasets, including the World Expo’10, the UCF_CC_50, and the UCSD datasets.
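As an illustration of counting by density-map estimation, the sketch below builds a ground-truth density map from head annotations and recovers the count by summing the map. It is a minimal sketch under stated assumptions, not the method of this paper: a fixed isotropic Gaussian stands in for the adaptive human-shaped kernel, whose form is not given in the abstract, and the names `gaussian_kernel` and `density_map`, the kernel size, and `sigma` are all hypothetical choices made for the example.

```python
import numpy as np


def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Return a size x size Gaussian kernel normalised to sum to 1 (size must be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()


def density_map(shape, points, size=15, sigma=4.0):
    """Place one unit-mass kernel at each annotated (row, col) position.

    ASSUMPTION: a fixed Gaussian replaces the paper's adaptive human-shaped
    kernel. Every point is assumed to lie inside the image.
    """
    h, w = shape
    dmap = np.zeros((h, w), dtype=np.float64)
    kernel = gaussian_kernel(size, sigma)
    r = size // 2
    for y, x in points:
        # Clip the kernel window to the image bounds.
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        x0, x1 = max(0, x - r), min(w, x + r + 1)
        patch = kernel[y0 - (y - r): size - (y + r + 1 - y1),
                       x0 - (x - r): size - (x + r + 1 - x1)]
        # Renormalise the clipped patch so each person still contributes exactly 1.
        dmap[y0:y1, x0:x1] += patch / patch.sum()
    return dmap


# Three hypothetical head annotations, one of them at the image corner.
heads = [(10, 10), (0, 0), (50, 60)]
dmap = density_map((64, 64), heads)
estimated_count = dmap.sum()  # integrating the density map gives the count
```

Because every kernel is renormalised to unit mass, the sum over the map equals the number of annotated people, which is the property that lets a network trained to regress the map also produce a count.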
Acknowledgement
This work has been supported by the National Natural Science Foundation of China under Grant No. 61501060, the Natural Science Foundation of Jiangsu Province under Grant No. BK20150271, and the Key Laboratory for New Technology Application of Road Conveyance of Jiangsu Province under Grant No. BM20082061708.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Cao, J., Yang, B., Zhang, Y., Zou, L. (2018). Crowd Counting from a Still Image Using Multi-scale Fully Convolutional Network with Adaptive Human-Shaped Kernel. In: Satoh, S. (eds) Image and Video Technology. PSIVT 2017. Lecture Notes in Computer Science, vol 10799. Springer, Cham. https://doi.org/10.1007/978-3-319-92753-4_19
DOI: https://doi.org/10.1007/978-3-319-92753-4_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-92752-7
Online ISBN: 978-3-319-92753-4
eBook Packages: Computer Science (R0)