Visual Tracking Based on Convolutional Deep Belief Network

  • Conference paper
Advanced Parallel Processing Technologies (APPT 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9231)

Abstract

Visual tracking is an important task in computer vision. Recently, deep neural networks have gained significant attention thanks to their success in learning image features. However, the existing deep neural networks applied to visual tracking are complicated, fully connected architectures with a large number of redundant parameters, which makes them inefficient to learn. We tackle this problem with a novel convolutional deep belief network (CDBN) that uses convolution, weight sharing, and pooling to reduce the number of parameters to learn and to gain translation invariance, which benefits tracker performance. Theoretical analysis and experimental evaluations on an open tracker benchmark demonstrate that, compared to DLT, another well-known deep learning tracker, our CDBN-based tracker is more accurate, improving tracking success rate by 22.6 % and tracking precision by 62.8 % on average, while maintaining low computation cost by reducing the number of parameters to 44.4 %. Meanwhile, our tracker achieves real-time performance with a graphics processing unit (GPU) speedup of 2.61 times on average and up to 3.08 times.
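
To illustrate why weight sharing reduces the number of parameters, the following minimal sketch compares the parameter count of a fully connected layer with that of a convolutional layer. The layer sizes (a 32x32 grayscale patch, 2560 hidden units, 24 filters of size 5x5) and the helper names `dense_params` and `conv_params` are assumptions chosen only for illustration; they are not the architectures evaluated in the paper and do not reproduce its 44.4 % figure.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv_params(kernel: int, in_channels: int, n_filters: int) -> int:
    """Weights plus biases of a convolutional layer.

    Each filter is shared across all image positions, so the count
    does not depend on the input image size.
    """
    return kernel * kernel * in_channels * n_filters + n_filters


# Hypothetical sizes, for illustration only (not taken from the paper).
side = 32
n_pixels = side * side                      # inputs of a 32x32 grayscale patch
dense = dense_params(n_pixels, 2560)        # one fully connected hidden layer
conv = conv_params(kernel=5, in_channels=1, n_filters=24)

print(f"fully connected: {dense:,} parameters")
print(f"convolutional:   {conv:,} parameters")
print(f"ratio:           {conv / dense:.4%}")
```

Because a convolutional filter is reused at every image position, its parameter count depends on the kernel size and the number of filters rather than on the input resolution, which is the effect the abstract attributes to convolution and weight sharing.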


Notes

  1. http://winsty.net/dlt.html



Corresponding author

Correspondence to Junjie Wu.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Hu, D., Zhou, X., Wu, J. (2015). Visual Tracking Based on Convolutional Deep Belief Network. In: Chen, Y., Ienne, P., Ji, Q. (eds) Advanced Parallel Processing Technologies. APPT 2015. Lecture Notes in Computer Science, vol 9231. Springer, Cham. https://doi.org/10.1007/978-3-319-23216-4_8

  • DOI: https://doi.org/10.1007/978-3-319-23216-4_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23215-7

  • Online ISBN: 978-3-319-23216-4

  • eBook Packages: Computer Science (R0)
