Learning spatial-temporally regularized complementary kernelized correlation filters for visual tracking

Abstract

Despite the excellent performance of spatially regularized discriminative correlation filters (SRDCF) in visual tracking, two issues still limit them. First, SRDCF formulates its model over multiple training images, which prevents it from exploiting the circulant structure of the training samples during learning and leads to a high computational burden. Second, SRDCF cannot efficiently exploit highly discriminative nonlinear kernels, which further limits its performance. In this paper, we present a novel tracking approach based on spatial-temporally regularized complementary kernelized correlation filters (STRCKCF). First, by introducing spatial-temporal regularization into filter learning, STRCKCF formulates its model with only one training image, which both facilitates exploiting the circulant structure in learning and reasonably approximates SRDCF with multiple training images. Second, by incorporating two types of kernels whose matrices are circulant, STRCKCF takes full advantage of the complementary traits of color and HOG features to learn a robust target representation efficiently. In addition, STRCKCF can be optimized efficiently via the alternating direction method of multipliers (ADMM). Extensive evaluations on the OTB100 and VOT2016 visual tracking benchmarks demonstrate that the proposed method performs favorably against state-of-the-art trackers while running at 40 fps on a single CPU. Compared with SRDCF, STRCKCF provides an 8× speedup and gains of 5.5% in AUC score on OTB100 and 8.4% in EAO score on VOT2016.
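To illustrate why circulant structure matters for efficiency, the following minimal sketch compares two ways of solving a ridge regression whose data matrix contains all cyclic shifts of a single 1-D sample. It is not the STRCKCF model: it is a plain single-channel, linear, unregularized-in-space filter, and the function names `circulant`, `ridge_dense`, and `ridge_fft` are hypothetical names chosen for this example. It only shows that, when the training matrix is circulant, the dense linear solve can be replaced by element-wise operations in the Fourier domain.

```python
import numpy as np


def circulant(x):
    """Build the matrix whose row i is x cyclically shifted by i positions."""
    return np.stack([np.roll(x, i) for i in range(len(x))])


def ridge_dense(x, y, lam):
    """Ridge regression over all cyclic shifts of x, via a dense O(n^3) solve."""
    X = circulant(x)
    n = len(x)
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)


def ridge_fft(x, y, lam):
    """Same solution in O(n log n): the DFT diagonalizes circulant matrices.

    With NumPy's DFT sign convention and the row layout used in `circulant`,
    X.T @ y is the circular convolution of x and y, and X.T @ X has
    eigenvalues |fft(x)|^2, so the normal equations decouple per frequency.
    """
    xf = np.fft.fft(x)
    yf = np.fft.fft(y)
    wf = xf * yf / (np.abs(xf) ** 2 + lam)
    return np.real(np.fft.ifft(wf))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(32)   # stand-in for one training sample (assumed 1-D)
    y = rng.standard_normal(32)   # stand-in for the desired response
    w_dense = ridge_dense(x, y, lam=0.1)
    w_fft = ridge_fft(x, y, lam=0.1)
    print(np.allclose(w_dense, w_fft))
```

A model built from multiple training images generally does not have a single circulant data matrix, which is why the abstract treats the one-image formulation as the step that makes this kind of Fourier-domain shortcut available.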

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (41201404) and the Fundamental Research Funds for the Central Universities of China (2042018gf0008).

Author information

Corresponding author

Correspondence to Jing Li.

About this article

Cite this article

Su, Z., Li, J., Chang, J. et al. Learning spatial-temporally regularized complementary kernelized correlation filters for visual tracking. Multimed Tools Appl (2020). https://doi.org/10.1007/s11042-020-09028-9

Keywords

  • Visual tracking
  • Spatial-temporal regularization
  • Correlation filter
  • Multi-kernel learning