Pedestrian Detection and Tracking in Challenging Surveillance Videos

  • Kristof Van Beeck
  • Toon Goedemé
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 598)


In this chapter we propose a novel approach for real-time, robust pedestrian tracking in surveillance images. Typical surveillance images are challenging to analyse since the overall image quality is low (e.g. low resolution and high compression). Furthermore, wide-angle lenses at bird's-eye viewpoints are often used to achieve maximum coverage with a minimal number of cameras. These specific viewpoints make it unfeasible to directly apply existing pedestrian detection techniques. Moreover, real-time processing speeds are required. To overcome these problems we introduce a pedestrian detection and tracking framework which exploits and integrates these scene constraints to achieve high-accuracy results. We performed extensive experiments on publicly available, challenging real-life video sequences, evaluating both speed and accuracy. Our approach achieves excellent accuracy while still meeting the stringent real-time demands of these surveillance applications, using only a single-core CPU implementation.
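The chapter's own pipeline is not reproduced on this page; however, the bibliography cites Kalman's filter [22], the standard motion model in tracking-by-detection. As an illustrative sketch only (the class name, state layout, and noise parameters below are our own assumptions, not the authors' implementation), a minimal constant-velocity Kalman track for a single pedestrian detection stream could look like this:

```python
import numpy as np

class KalmanTrack:
    """Constant-velocity Kalman filter for one pedestrian track (illustrative sketch).

    State: [x, y, vx, vy]; measurement: [x, y] (detection centre in pixels).
    """

    def __init__(self, x, y, dt=1.0, q=1e-2, r=1.0):
        self.x = np.array([x, y, 0.0, 0.0])           # state estimate
        self.P = np.eye(4) * 10.0                     # state covariance (uncertain start)
        self.F = np.array([[1, 0, dt, 0],             # constant-velocity motion model
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],              # only position is observed
                           [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * q                        # process noise
        self.R = np.eye(2) * r                        # measurement noise

    def predict(self):
        """Propagate the state one frame ahead; returns the predicted position."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        """Correct the prediction with a detection z = (x, y); returns the new position."""
        z = np.asarray(z, dtype=float)
        innovation = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)      # Kalman gain
        self.x = self.x + K @ innovation
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]
```

In a full tracking-by-detection loop, one such track per pedestrian is predicted each frame and updated with the nearest associated detection; the predicted position also bridges short detection gaps.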


Keywords: Pedestrian detection · Tracking · Surveillance · Computer vision · Real-time



The authors would like to acknowledge that the dataset used here is from the EC Funded CAVIAR project/IST 2001 37540 [8].


References

  1. Benenson, R., Mathias, M., Timofte, R., Van Gool, L.: Fast stixel computation for fast pedestrian detection. In: Fusiello, A., Murino, V., Cucchiara, R. (eds.) ECCV 2012 Ws/Demos, Part III. LNCS, vol. 7585, pp. 11–20. Springer, Heidelberg (2012)
  2. Benenson, R., Mathias, M., Timofte, R., Van Gool, L.: Pedestrian detection at 100 frames per second. In: Proceedings of CVPR, pp. 2903–2910 (2012)
  3. Benenson, R., Mathias, M., Tuytelaars, T., Van Gool, L.: Seeking the strongest rigid detector. In: Proceedings of CVPR, Portland, Oregon, pp. 3666–3673 (2013)
  4. Benenson, R., Omran, M., Hosang, J., Schiele, B.: Ten years of pedestrian detection, what have we learned? In: Agapito, L., Bronstein, M.M., Rother, C. (eds.) ECCV 2014 Workshops. LNCS, vol. 8926, pp. 613–627. Springer, Heidelberg (2015)
  5. Benezeth, Y., Jodoin, P.-M., Emile, B., Laurent, H., Rosenberger, C.: Review and evaluation of commonly-implemented background subtraction algorithms. In: 19th International Conference on Pattern Recognition, ICPR 2008, pp. 1–4. IEEE (2008)
  6. Benfold, B., Reid, I.: Stable multi-target tracking in real-time surveillance video. In: Proceedings of CVPR, pp. 3457–3464 (2011)
  7. Breitenstein, M.D., Reichlin, F., Leibe, B., Koller-Meier, E., Van Gool, L.: Online multiperson tracking-by-detection from a single, uncalibrated camera. IEEE PAMI 33(9), 1820–1833 (2011)
  8. CAVIAR project: Context aware vision using image-based active recognition
  9. Cho, H., Rybski, P., Bar-Hillel, A., Zhang, W.: Real-time pedestrian detection with deformable part models. In: IEEE Intelligent Vehicles Symposium, pp. 1035–1042 (2012)
  10. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Proceedings of CVPR, vol. 2, pp. 886–893 (2005)
  11. Dollár, P., Appel, R., Belongie, S., Perona, P.: Fast feature pyramids for object detection. IEEE PAMI (2014)
  12. Dollár, P., Belongie, S., Perona, P.: The fastest pedestrian detector in the west. In: Proceedings of BMVC, pp. 68.1–68.11 (2010)
  13. Dollár, P., Tu, Z., Perona, P., Belongie, S.: Integral channel features. In: Proceedings of BMVC, pp. 91.1–91.11 (2009)
  14. Dollár, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: a benchmark. In: Proceedings of CVPR, pp. 304–311 (2009)
  15. Dollár, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: an evaluation of the state of the art. IEEE PAMI 34, 743–761 (2012)
  16. Felzenszwalb, P., Girshick, R., McAllester, D.: Cascade object detection with deformable part models. In: Proceedings of CVPR, pp. 2241–2248 (2010)
  17. Felzenszwalb, P., McAllester, D., Ramanan, D.: A discriminatively trained, multiscale, deformable part model. In: Proceedings of CVPR (2008)
  18. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of CVPR (2014)
  19. Girshick, R., Felzenszwalb, P., McAllester, D.: Object detection with grammar models. In: Proceedings of NIPS, pp. 442–450 (2011)
  20. Girshick, R.B., Iandola, F.N., Darrell, T., Malik, J.: Deformable part models are convolutional neural networks. CoRR, abs/1409.5403 (2014)
  21. Girshick, R.B., Malik, J.: Training deformable part models with decorrelated features. In: IEEE International Conference on Computer Vision, ICCV 2013, Sydney, Australia, 1–8 December 2013
  22. Kalman, R.: A new approach to linear filtering and prediction problems. Trans. ASME J. Basic Eng. 82, 35–45 (1960)
  23. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, vol. 25, pp. 1097–1105. Curran Associates Inc. (2012)
  24. Leykin, A., Hammoud, R.: Pedestrian tracking by fusion of thermal-visible surveillance videos. Mach. Vis. Appl. 21(4), 587–595 (2010)
  25. Orts-Escolano, S., Garcia-Rodriguez, J., Morell, V., Cazorla, M., Azorin, J., Garcia-Chamizo, J.M.: Parallel computational intelligence-based multi-camera surveillance system. J. Sens. Actuator Netw. 3(2), 95–112 (2014)
  26. Parks, D.H., Fels, S.S.: Evaluation of background subtraction algorithms with post-processing. In: IEEE Fifth International Conference on Advanced Video and Signal Based Surveillance, AVSS 2008, pp. 192–199. IEEE (2008)
  27. Pedersoli, M., Gonzalez, J., Hu, X., Roca, X.: Toward real-time pedestrian detection based on a deformable template model. IEEE Trans. Intell. Transp. Syst. 15(1), 355–364 (2013)
  28. Rogez, G., Orrite, C., Guerrero, J.J., Torr, P.H.S.: Exploiting projective geometry for view-invariant monocular human motion analysis in man-made environments. Comput. Vis. Image Underst. 120, 126–140 (2014)
  29. Rogez, G., Rihan, J., Guerrero, J.J., Orrite, C.: Monocular 3D gait tracking in surveillance scenes. IEEE Trans. Syst. Man Cybern. Part B (Cybernetics) 44(6), 894–909 (2014)
  30. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet large scale visual recognition challenge (2014)
  31. Singh, V.K., Wu, B., Nevatia, R.: Pedestrian tracking by associating tracklets using detection residuals. In: IEEE Workshop on Motion and Video Computing, WMVC 2008, pp. 1–8. IEEE (2008)
  32. Van Beeck, K., Tuytelaars, T., Goedemé, T.: A warping window approach to real-time vision-based pedestrian detection in a truck’s blind spot zone. In: Proceedings of ICINCO (2012)
  33. Zivkovic, Z., van der Heijden, F.: Efficient adaptive density estimation per image pixel for the task of background subtraction. Pattern Recogn. Lett. 27(7), 773–780 (2006)

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. EAVISE - Campus De Nayer, KU Leuven, Sint-Katelijne-Waver, Belgium
