A light tracker for online multiple pedestrian tracking


We propose a novel real-time multiple pedestrian tracker for videos acquired from both static and moving cameras in unconstrained real-world environments. In such scenes, trackers often suffer from noisy detections and frequent occlusions. Existing methods usually rely on complex learning approaches and large numbers of training samples to obtain discriminative appearance features. However, this incurs a high computational cost and still performs poorly under occlusion (missing detections) and when appearances are indistinguishable. To address this, we design a light two-stage tracker. First, a shallow network with two fully convolutional layers is proposed to encode appearance. Compared with other deep architectures and sophisticated learning approaches, our shallow net is efficient and remains robust without any online updating. Second, we design a motion model to handle noisy detections and objects missed because of motion blur or occlusion. By mining motion patterns, our tracker can reliably predict object locations in challenging scenes. Furthermore, we propose a sped-up version to verify the robustness of the tracker and its suitability for online applications. Extensive experiments are conducted on the multiple object tracking benchmarks MOT15 and MOT17. The performance is competitive with a number of state-of-the-art trackers and demonstrates that our tracker is very promising for real-time applications.
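As an illustration of how a motion model can bridge missing detections, the following is a minimal sketch under stated assumptions, not the motion model of the paper. It assumes a constant-velocity extrapolation of the box centre; the class name `ConstantVelocityTrack`, the `(cx, cy, w, h)` box layout, and the `window` parameter are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical box layout: centre x, centre y, width, height.
Box = Tuple[float, float, float, float]


@dataclass
class ConstantVelocityTrack:
    """Illustrative track that extrapolates its box when a detection is missing.

    Assumption: constant velocity over a short window. This is a stand-in,
    not the motion-pattern model described in the abstract.
    """
    history: List[Box] = field(default_factory=list)
    window: int = 5  # number of recent boxes used to estimate velocity (assumed)

    def velocity(self) -> Tuple[float, float]:
        """Average per-frame displacement of the centre over the recent window."""
        recent = self.history[-self.window:]
        if len(recent) < 2:
            return (0.0, 0.0)
        steps = len(recent) - 1
        vx = (recent[-1][0] - recent[0][0]) / steps
        vy = (recent[-1][1] - recent[0][1]) / steps
        return (vx, vy)

    def predict(self) -> Box:
        """Shift the last known box by the estimated velocity; size is kept."""
        if not self.history:
            raise ValueError("cannot predict before the first detection")
        cx, cy, w, h = self.history[-1]
        vx, vy = self.velocity()
        return (cx + vx, cy + vy, w, h)

    def step(self, detection: Optional[Box]) -> Box:
        """Use the detection if present, otherwise fall back to the prediction."""
        box = detection if detection is not None else self.predict()
        self.history.append(box)
        return box
```

Passing `None` to `step` stands for a frame in which the detector missed the pedestrian, for example because of occlusion or motion blur; the track then continues along its recent direction of motion.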




  1. In the tracking literature, VOT (visual object tracking) refers to single-object tracking.

  2. In the MOT field, an offline method uses global information from both past and future frames. In some real-time applications, however, future information is unavailable, which limits the use of offline methods. In contrast, an online method tracks frame by frame, i.e., it uses only information from the past and current frames.
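The online setting described in the note above can be sketched as a generator that emits identities for each frame before the next frame is read. This is only an illustration: the greedy IoU association is a deliberately simple stand-in and not the association step of the paper, and `track_online`, `iou`, the `(x1, y1, x2, y2)` box layout, and the `min_iou` threshold are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical box layout: top-left and bottom-right corners.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def track_online(frames: Iterable[List[Box]],
                 min_iou: float = 0.3) -> Iterator[Dict[int, Box]]:
    """Assign identities frame by frame, using only past and current frames.

    Assumption: greedy IoU matching against the previous frame only; a track
    that is not matched in a frame is simply dropped.
    """
    tracks: Dict[int, Box] = {}
    next_id = 0
    for detections in frames:  # future frames are never inspected
        current: Dict[int, Box] = {}
        unmatched = dict(tracks)
        for det in detections:
            best_id = max(unmatched,
                          key=lambda t: iou(unmatched[t], det),
                          default=None)
            if best_id is not None and iou(unmatched[best_id], det) >= min_iou:
                del unmatched[best_id]  # each track is matched at most once
            else:
                best_id = next_id  # start a new identity
                next_id += 1
            current[best_id] = det
        tracks = current
        yield current  # result for this frame is available immediately
```

Because the result for a frame is yielded before later frames are consumed, the function can be fed from a live video stream. An offline method would instead need the whole sequence before producing any output.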




This work was supported by the Fundamental Research Funds for the Central Universities under Grant 2019JBM019.

Author information



Corresponding author

Correspondence to Qi Zou.


Cite this article

Wang, N., Zou, Q., Ma, Q. et al. A light tracker for online multiple pedestrian tracking. J Real-Time Image Proc 18, 175–191 (2021). https://doi.org/10.1007/s11554-020-00962-3



  • Shallow network
  • Motion pattern
  • Multiple pedestrian tracker