Multimedia Tools and Applications, Volume 78, Issue 1, pp 257–270

Online convolution network tracking via spatio-temporal context

  • Hongxiang Wang
  • Peizhong Liu
  • Yongzhao Du (corresponding author)
  • Xiaofang Liu


To address the loss of spatio-temporal information in the abstraction performed by convolutional neural networks, an online visual tracking algorithm is proposed that incorporates a spatio-temporal context model into the filter bank of a convolutional neural network. First, the initial target is preprocessed and a spatial model of the target is extracted; the spatio-temporal context model is then obtained from the spatio-temporal information. The first layer convolves the input with the spatio-temporal context model to produce simple-layer features. From the second layer onward, the spatio-temporal context model is skipped and a set of convolution filters is derived instead; these are convolved with the simple-layer features of the first layer to extract abstract features of the target, and a deep representation of the target is obtained by stacking the convolution responses with the simple-layer features. Finally, tracking is realized with a sparse update scheme within a particle filter framework. Experiments show that the deep abstract features extracted by the online convolutional network combined with the spatio-temporal context model preserve spatio-temporal information and improve robustness to background clutter, illumination variation, low resolution, occlusion, and scale variation, as well as tracking efficiency against complex backgrounds.
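The two-layer feature extraction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the patch size, filter sizes, and the random placeholder filters are all hypothetical (the paper derives the first-layer filter from the spatio-temporal context model and the second-layer filter bank from the target region), and the `conv2d` helper is a plain numpy stand-in for a convolution layer.

```python
import numpy as np

def conv2d(x, k):
    """Valid-mode 2-D cross-correlation (hypothetical numpy helper)."""
    h, w = k.shape
    out = np.zeros((x.shape[0] - h + 1, x.shape[1] - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + h, j:j + w] * k)
    return out

rng = np.random.default_rng(0)

# Layer 1: convolve the input patch with the spatio-temporal context model
# to obtain the simple-layer feature map.
patch = rng.random((32, 32))       # normalized target patch (size is illustrative)
stc_model = rng.random((6, 6))     # placeholder for the spatio-temporal context filter
simple = conv2d(patch, stc_model)  # simple-layer feature, shape (27, 27)

# Layer 2: a bank of convolution filters (random placeholders here; the paper
# derives them without the spatio-temporal context model) is convolved with the
# simple-layer feature, and the responses are stacked into a deep representation.
filters = [rng.random((5, 5)) for _ in range(4)]
deep = np.stack([conv2d(simple, f) for f in filters])  # shape (4, 23, 23)
```

The stacked response `deep` plays the role of the target's deep representation that the particle filter framework would then score candidate patches against.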


Keywords: Visual tracking · Spatio-temporal context · Convolutional neural network (CNN) · Particle filter



This work is supported by the National Natural Science Foundation of China (Grant No. 61605048) and the Fujian Provincial Natural Science Foundation (Grant No. 2016J01300). The authors would like to thank the reviewers for their valuable suggestions and comments.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Hongxiang Wang
  • Peizhong Liu
  • Yongzhao Du (corresponding author)
  • Xiaofang Liu

  College of Engineering, Huaqiao University, Quanzhou, China
