Hierarchical convolutional features for end-to-end representation-based visual tracking

Special Issue Paper

Abstract

Deep learning has recently been applied widely in computer vision. In this paper, a novel yet simple deep-learning tracker is proposed to perform the tracking task. A fully convolutional Siamese network is applied to measure the similarity between frames. However, the detailed information from lower layers, which is also important for localizing the target object, is typically not incorporated into the tracking task. In the proposed method, detailed information from two lower layers is incorporated into the response map to improve performance without adding much computational cost. This leads to a more significant improvement in feature representation and in localization of the target object. Experimental results demonstrate that the proposed algorithm is efficient and robust compared with the baseline and state-of-the-art trackers.
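The two ideas in the abstract, a Siamese similarity comparison between an exemplar and a search region, and the fusion of response maps from several convolutional layers, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the feature shapes, the number of layers, and the weighted-sum fusion rule are illustrative assumptions.

```python
import numpy as np

def cross_correlation(exemplar, search):
    """Slide an exemplar feature map over a larger search feature map
    and record the inner-product similarity at every offset, yielding
    a 2-D response map (the core operation of a Siamese tracker)."""
    eh, ew, _ = exemplar.shape
    sh, sw, _ = search.shape
    response = np.zeros((sh - eh + 1, sw - ew + 1))
    for y in range(response.shape[0]):
        for x in range(response.shape[1]):
            response[y, x] = np.sum(exemplar * search[y:y + eh, x:x + ew])
    return response

def fused_response(layer_pairs, weights):
    """Compute one response map per (exemplar, search) feature pair --
    e.g. one deep semantic layer plus shallower, detail-rich layers --
    and fuse them by weighted summation (an assumed fusion rule)."""
    maps = [cross_correlation(ex, se) for ex, se in layer_pairs]
    return sum(w * m for w, m in zip(weights, maps))

# Toy data: plant the exemplar features inside a noisy search region
# at a known offset and check that the fused response peaks there.
rng = np.random.default_rng(0)
deep_ex = rng.standard_normal((3, 3, 8))          # "deep" layer features
deep_se = 0.1 * rng.standard_normal((8, 8, 8))
deep_se[2:5, 4:7] += deep_ex                      # target at offset (2, 4)
shallow_ex = rng.standard_normal((3, 3, 4))       # "lower" layer features
shallow_se = 0.1 * rng.standard_normal((8, 8, 4))
shallow_se[2:5, 4:7] += shallow_ex

fused = fused_response([(deep_ex, deep_se), (shallow_ex, shallow_se)],
                       weights=[0.7, 0.3])
peak = np.unravel_index(np.argmax(fused), fused.shape)
print(peak)  # the fused map peaks at the planted target offset
```

Because the lower layer contributes its own response map before fusion, fine spatial detail can sharpen the peak location at the cost of only one extra cross-correlation per added layer.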

Keywords

Tracking · Siamese network · Deep neural network · Deep learning · End-to-end · Hierarchical convolutional features

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grants 61472110, 61772161, 61532006, 61320106006 and 61601158, and by the Zhejiang Provincial Science Foundation under Grant LQ16F030004.

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Key Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
  2. State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an, China
