Real-Time Depth Map Based People Counting

  • František Galčík
  • Radoslav Gargalík
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8192)


People counting is an important task in video surveillance applications. It can provide statistical information for shopping centers and other public buildings, or knowledge of the current number of people in a building in case of an emergency. This paper describes a real-time people counting system based on a vertical Kinect depth sensor. The processing pipeline of the system includes depth map improvement, a novel approach to head segmentation, and continuous tracking of head segments. The head segmentation is based on an adaptation of the region-growing segmentation approach with thresholding. The tracking of segments combines minimum-weight bipartite graph matching with prediction of object movement to compensate for segmentation inaccuracy. Results of an evaluation on datasets from a shopping center (more than 23 hours of recordings) show that the system handles almost all real-world situations with high accuracy.
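The tracking idea mentioned in the abstract (minimum-weight bipartite matching combined with movement prediction) can be illustrated with a minimal sketch. This is not the authors' implementation: the constant-velocity prediction, the Euclidean distance cost, the `max_dist` gate, and the names `predict` and `match_tracks` are all assumptions made for illustration. The matching is solved by brute force over permutations, which is only practical for a handful of heads; a real system would more likely use the Hungarian method.

```python
from itertools import permutations
from math import hypot


def predict(track):
    """Assumed motion model: last position plus last per-frame displacement."""
    (x, y), (vx, vy) = track["pos"], track["vel"]
    return (x + vx, y + vy)


def match_tracks(tracks, detections, max_dist=50.0):
    """Return {track_index: detection_index} minimising the total distance.

    tracks:     list of dicts with "pos" and "vel" (hypothetical structure)
    detections: list of (x, y) head centroids found in the current frame
    max_dist:   assumed gate; pairs at or beyond it count as unmatched
    """
    predicted = [predict(t) for t in tracks]
    n, m = len(predicted), len(detections)
    size = max(n, m)

    def dist(i, j):
        (px, py), (dx, dy) = predicted[i], detections[j]
        return hypot(px - dx, py - dy)

    def cost(i, j):
        # Padding rows/columns stand for "no partner" and cost max_dist.
        if i >= n or j >= m:
            return max_dist
        return min(dist(i, j), max_dist)

    # Brute-force minimum-weight perfect matching on the padded square matrix.
    best = min(
        permutations(range(size)),
        key=lambda p: sum(cost(i, p[i]) for i in range(size)),
    )
    # Keep only real track/detection pairs that pass the distance gate.
    return {
        i: best[i]
        for i in range(n)
        if best[i] < m and dist(i, best[i]) < max_dist
    }


tracks = [
    {"pos": (10, 10), "vel": (5, 0)},
    {"pos": (100, 100), "vel": (0, -5)},
]
detections = [(101, 94), (16, 10), (300, 300)]
assignment = match_tracks(tracks, detections)
unmatched = [j for j in range(len(detections)) if j not in assignment.values()]
print(assignment, unmatched)
```

In this sketch, detections left unmatched (here the one at (300, 300)) would be candidates for new tracks, while tracks left unmatched could be carried forward on their predicted positions for a few frames, which is one way prediction can bridge a segmentation miss.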


Keywords: people counting, Kinect depth sensor, continuous tracking





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • František Galčík (1)
  • Radoslav Gargalík (1)
  1. Institute of Computer Science, Faculty of Science, P.J. Šafárik University, Košice, Slovak Republic
