Robust and Real-Time Visual Tracking Based on Single-Layer Convolutional Features and Accurate Scale Estimation

  • Runling Wang
  • Jiancheng Zou
  • Manqiang Che
  • Changzhen Xiong
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 875)


Visual tracking is a fundamental problem in computer vision. Recently, several methods have used features learned by deep convolutional neural networks for visual tracking and achieved record-breaking performance. However, deep trackers suffer from low efficiency. In this paper, we propose an object tracking method that combines single-layer convolutional features with a correlation filter to locate the target while speeding up tracking. In addition, accurate scale prediction and a high-confidence model update strategy are adopted to handle scale variation and interference from similar objects. Extensive experiments on large-scale benchmarks demonstrate the effectiveness of the proposed algorithm against state-of-the-art trackers.
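
To make the correlation-filter idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single feature channel (a random array stands in for a convolutional feature map), a Gaussian target response, and a closed-form ridge-regression filter in the Fourier domain. The function names, the regularization weight `lam`, and the peak-threshold update rule in `gated_update` are illustrative assumptions; the abstract does not specify the confidence criterion the paper uses.

```python
import numpy as np

def gaussian_label(shape, sigma=2.0):
    """Desired response: a Gaussian peak at index (0, 0), wrapped circularly."""
    h, w = shape
    ys = np.arange(h)
    xs = np.arange(w)
    ys = np.minimum(ys, h - ys)  # circular distance to row 0
    xs = np.minimum(xs, w - xs)  # circular distance to column 0
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.exp(-(yy ** 2 + xx ** 2) / (2.0 * sigma ** 2))

def train_filter(x, y, lam=1e-2):
    """Closed-form ridge regression in the Fourier domain (single channel)."""
    X = np.fft.fft2(x)
    Y = np.fft.fft2(y)
    return Y * np.conj(X) / (X * np.conj(X) + lam)

def detect(H, z):
    """Apply the filter to a new patch z; return the response map and peak shift."""
    response = np.real(np.fft.ifft2(H * np.fft.fft2(z)))
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    h, w = response.shape
    # Map wrapped indices to signed displacements.
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return response, (int(dy), int(dx))

def gated_update(H_old, H_new, peak, threshold=0.5, lr=0.02):
    """Blend in the new filter only when the response peak is high.

    The threshold rule is an illustrative assumption, not the paper's criterion.
    """
    if peak < threshold:
        return H_old
    return (1.0 - lr) * H_old + lr * H_new

# Usage: train on one patch, then locate a circularly shifted copy of it.
rng = np.random.default_rng(0)
x = rng.standard_normal((64, 64))           # stand-in for a feature map
H = train_filter(x, gaussian_label(x.shape))
z = np.roll(x, shift=(3, -5), axis=(0, 1))  # target moved by (3, -5)
response, shift = detect(H, z)
H = gated_update(H, train_filter(z, gaussian_label(z.shape)), response.max())
```

Because training and detection reduce to element-wise operations on FFTs, the cost per frame is dominated by a few transforms of the feature map, which is why restricting the tracker to a single convolutional layer helps speed.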


Object tracking · Correlation filter · Convolutional features · Scale pyramid · Model update



This work is supported in part by the National Key R&D Program of China (2017YFC0821102) and in part by the North China University of Technology Students’ Technological Activity.



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • Runling Wang (1)
  • Jiancheng Zou (1)
  • Manqiang Che (2)
  • Changzhen Xiong (2)
  1. School of Sciences, North China University of Technology, Beijing, China
  2. Beijing Key Laboratory of Urban Intelligent Control Technology, North China University of Technology, Beijing, China
