Journal of Real-Time Image Processing, Volume 16, Issue 2, pp 339–353

Heterogeneous CPU–GPU tracking–learning–detection (H-TLD) for real-time object tracking

  • Ilker Gurcan
  • Alptekin Temizel
Original Research Paper


The recently proposed tracking–learning–detection (TLD) method has become a popular visual tracking algorithm because it has been shown to provide promising long-term tracking results. However, the high computational cost of the algorithm prevents it from being used at higher resolutions and frame rates. In this paper, we describe the design and implementation of a heterogeneous CPU–GPU TLD (H-TLD) solution using OpenMP and CUDA. To exploit the heterogeneous architecture, serial parts run asynchronously on the CPU while the most computationally costly parts are parallelized and run on the GPU. The design keeps data transfers between the CPU and GPU to a minimum and, whenever such transfers are necessary, applies stream compaction and overlaps data transfer with computation. The workload is balanced to distribute work uniformly across the GPU multiprocessors. Results show a 10.25-fold speed-up over the baseline TLD at 1920 \(\times\) 1080 resolution. The source code has been made publicly available for download.
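The stream compaction step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it is a sequential C++ reference of the usual flag, exclusive-scan, and scatter pattern, and the `Window` record, the function names, and the score threshold are all hypothetical. On a GPU, each of the three steps would typically run as a parallel kernel or library primitive, so that only the surviving elements need to cross the CPU–GPU boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record for one detector scan window (not taken from the paper).
struct Window {
    int id;       // index of the scan window
    float score;  // classifier confidence for this window
};

// Step 1: flag each window that passes the threshold (1 = keep, 0 = drop).
std::vector<int> flag_survivors(const std::vector<Window>& in, float threshold) {
    std::vector<int> flags(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        flags[i] = (in[i].score >= threshold) ? 1 : 0;
    return flags;
}

// Step 2: exclusive prefix sum; offsets[i] is the output slot of element i.
std::vector<int> exclusive_scan(const std::vector<int>& flags) {
    std::vector<int> offsets(flags.size());
    int running = 0;
    for (std::size_t i = 0; i < flags.size(); ++i) {
        offsets[i] = running;
        running += flags[i];
    }
    return offsets;
}

// Step 3: scatter the flagged elements into a dense output array,
// preserving their original order.
std::vector<Window> compact(const std::vector<Window>& in, float threshold) {
    const std::vector<int> flags = flag_survivors(in, threshold);
    const std::vector<int> offsets = exclusive_scan(flags);
    const std::size_t kept =
        in.empty() ? 0 : static_cast<std::size_t>(offsets.back() + flags.back());

    std::vector<Window> out(kept);
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (flags[i]) {
            assert(static_cast<std::size_t>(offsets[i]) < kept);
            out[static_cast<std::size_t>(offsets[i])] = in[i];
        }
    }
    return out;
}
```

Calling `compact(windows, 0.5f)` on a list of scored windows returns only those with a score of at least 0.5, in their original order. The point of the pattern is that the transfer size then depends on the number of survivors rather than on the full number of candidates.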


Keywords: Object tracking, Heterogeneous CPU–GPU implementations, Real time, CUDA



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
