Multi-object tracking by mutual supervision of CNN and particle filter

  • Yu Xia
  • Shiru Qu
  • Sotirios Goudos
  • Yu Bai
  • Shaohua Wan (Email author)
Original Article


We propose a deep-learning-based long-term multi-object tracking algorithm for traffic scenes that handles several challenging problems: complex variation in background illumination, pixel changes caused by partial occlusion, cumulative error, and short-term disappearance of the target. First, we train a CNN to detect targets in a traffic scene and determine their bounding boxes. Second, we use a particle filter (PF) as the tracker to perform preliminary multi-object tracking. Finally, the multi-object trajectories are generated through mutual supervision of the PF tracker and the CNN detector. To evaluate the results, we use the forward-backward (FB) error of our tracker at a given moment. The experimental results show that the method can track single and multiple objects over the long term in real time. When a target disappears and reappears, the proposed algorithm can also recover and resume long-term tracking.
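As a rough illustration of the forward-backward (FB) error mentioned above, the sketch below uses the definition common in the tracking literature: track a point forward over k frames, track the result backward over the same frames, and measure how far the back-tracked point lands from the starting point. This is an assumption about the metric, not the authors' evaluation code. The `StepFn` interface, the two toy trackers, and all parameter names are hypothetical and exist only for illustration.

```python
import math
from typing import Callable, Tuple

Point = Tuple[float, float]
# Hypothetical tracker interface (an assumption, not from the paper):
# given a position in frame `src`, return the estimated position in frame `dst`.
StepFn = Callable[[Point, int, int], Point]


def forward_backward_error(step: StepFn, start: Point, t: int, k: int) -> float:
    """Track `start` from frame t to t+k and back to t.

    Returns the Euclidean distance between `start` and the back-tracked point.
    """
    p = start
    for i in range(t, t + k):          # forward pass: t -> t+k
        p = step(p, i, i + 1)
    for i in range(t + k, t, -1):      # backward pass: t+k -> t
        p = step(p, i, i - 1)
    return math.hypot(p[0] - start[0], p[1] - start[1])


def consistent_step(p: Point, src: int, dst: int) -> Point:
    """Toy tracker: moves 1 px in x per frame, in whichever direction time runs."""
    dx = 1.0 if dst > src else -1.0
    return (p[0] + dx, p[1])


def drifting_step(p: Point, src: int, dst: int) -> Point:
    """Toy tracker that under-estimates backward motion, so it drifts."""
    dx = 1.0 if dst > src else -0.5
    return (p[0] + dx, p[1])


if __name__ == "__main__":
    print(forward_backward_error(consistent_step, (0.0, 0.0), t=0, k=5))
    print(forward_backward_error(drifting_step, (0.0, 0.0), t=0, k=5))
```

A tracker that is consistent in both time directions returns to its starting point and yields an FB error of zero, while a tracker that drifts yields a positive error, so a larger value suggests a less reliable trajectory at that moment.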


Intelligent transportation systems; Multi-object tracking; Particle filters; Object detection; Neural networks


Compliance with Ethical Standards

Conflict of interests

The authors declare that they have no conflict of interest.



Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  • Yu Xia (1)
  • Shiru Qu (1)
  • Sotirios Goudos (2)
  • Yu Bai (3)
  • Shaohua Wan (4, Email author)
  1. School of Automation, Northwestern Polytechnical University, Xi’an, China
  2. Department of Physics, Aristotle University of Thessaloniki, Thessaloniki, Greece
  3. College of Engineering and Computer Science, California State University, Fullerton, USA
  4. School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan, China
