Visual Tracking Based on Convolutional Deep Belief Network

Hu, Dan; Zhou, Xingshe; Wu, Junjie

doi:10.1007/978-3-319-23216-4_8


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9231)

Included in the following conference series:

International Workshop on Advanced Parallel Processing Technologies


Abstract

Visual tracking is an important task in computer vision. Recently, deep neural networks have gained significant attention thanks to their success in learning image features. However, the existing deep neural networks applied to visual tracking are complicated, fully connected architectures with a large number of redundant parameters, which makes them inefficient to learn. We tackle this problem with a novel convolutional deep belief network (CDBN) that uses convolution, weight sharing, and pooling to leave far fewer parameters to learn and to gain translation invariance, which benefits tracker performance. Theoretical analysis and experimental evaluations on an open tracker benchmark demonstrate that, compared to DLT, another well-known deep learning tracker, our CDBN-based tracker is more accurate, improving tracking success rate by 22.6 % and tracking precision by 62.8 % on average, while maintaining low computation cost by reducing the number of parameters to 44.4 %. Meanwhile, our tracker can achieve real-time performance with a graphics processing unit (GPU) speedup of 2.61 times on average and up to 3.08 times.
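The parameter argument in the abstract can be illustrated with a small Python sketch. It is not the authors' network: the layer sizes (a 32 x 32 input, 1024 hidden units, 24 filters of size 10 x 10) and the helper names are assumptions chosen only for illustration, and plain max pooling stands in for whatever pooling scheme the CDBN actually uses. The sketch counts the parameters of a fully connected layer against those of a weight-shared convolutional layer, then shows that pooling leaves the output unchanged for a small shift inside one pooling window.

```python
# Illustrative sizes only; these are NOT the layer sizes used in the paper.

def fully_connected_params(n_inputs: int, n_hidden: int) -> int:
    """One weight per (input, hidden) pair plus one bias per hidden unit."""
    return n_inputs * n_hidden + n_hidden


def convolutional_params(n_filters: int, filter_size: int, n_channels: int = 1) -> int:
    """Shared filter weights plus one bias per filter.

    The count does not depend on the image size, because each filter is
    reused at every image location (weight sharing).
    """
    return n_filters * (filter_size * filter_size * n_channels) + n_filters


def max_pool_1d(values, width):
    """Non-overlapping max pooling over a 1-D list of activations."""
    return [max(values[i:i + width]) for i in range(0, len(values) - width + 1, width)]


if __name__ == "__main__":
    side = 32  # hypothetical input patch of 32 x 32 pixels
    fc = fully_connected_params(side * side, 1024)
    conv = convolutional_params(n_filters=24, filter_size=10)
    print("fully connected parameters:", fc)
    print("convolutional parameters:  ", conv)

    # A one-position shift that stays inside the same pooling window
    # produces the same pooled output. A shift that crosses a window
    # boundary would change it, so the invariance is only local.
    activation = [0, 1, 0, 0, 0, 0, 0, 0]
    shifted = [0, 0, 1, 0, 0, 0, 0, 0]
    print(max_pool_1d(activation, 4) == max_pool_1d(shifted, 4))
```

With these assumed sizes the convolutional layer needs a few thousand parameters where the fully connected layer needs about a million. The 44.4 % figure reported in the abstract refers to the authors' complete tracker relative to DLT and cannot be derived from this toy comparison.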


Notes

1. http://winsty.net/dlt.html


Author information

Authors and Affiliations

School of Computer Science, Northwestern Polytechnical University, Xi’an, China
Dan Hu & Xingshe Zhou
State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha, China
Junjie Wu
Information and Navigation College, Air Force Engineering University, Xi’an, China
Dan Hu


Corresponding author

Correspondence to Junjie Wu.

Editor information

Editors and Affiliations

Chinese Academy of Sciences, Beijing, China
Yunji Chen
EPFL IC ISIM LAP, Lausanne, Switzerland
Paolo Ienne
Inspur, Shandong, China
Qing Ji


About this paper

Cite this paper

Hu, D., Zhou, X., Wu, J. (2015). Visual Tracking Based on Convolutional Deep Belief Network. In: Chen, Y., Ienne, P., Ji, Q. (eds) Advanced Parallel Processing Technologies. APPT 2015. Lecture Notes in Computer Science, vol 9231. Springer, Cham. https://doi.org/10.1007/978-3-319-23216-4_8


DOI: https://doi.org/10.1007/978-3-319-23216-4_8
Published: 15 August 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23215-7
Online ISBN: 978-3-319-23216-4
eBook Packages: Computer Science, Computer Science (R0)
