How to Make an RGBD Tracker?

  • Uğur Kart
  • Joni-Kristian Kämäräinen
  • Jiří Matas
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11129)


We propose a generic framework for converting an arbitrary short-term RGB tracker into an RGBD tracker. The framework has two mild requirements: the short-term tracker provides a bounding box, and its object model update can be stopped and resumed. The core of the framework is a depth-augmented foreground segmentation, formulated as an energy minimization problem and solved by graph cuts. The framework offers two levels of integration. Level-one integration requires that the RGB tracker can be stopped and resumed according to the decision on target visibility. Level-two integration additionally requires that the tracker accept an external mask (foreground region) in the target update. We integrate into the proposed framework the Discriminative Correlation Filter (DCF) and three state-of-the-art trackers: Efficient Convolution Operators for Tracking (ECOhc, ECOgpu) and Discriminative Correlation Filter with Channel and Spatial Reliability (CSR-DCF). Comprehensive experiments on the Princeton Tracking Benchmark (PTB) show that level-one integration improves the average rank of all trackers: DCF from 18th to 17th, ECOgpu from 16th to 10th, ECOhc from 15th to 5th, and CSR-DCF from 19th to 14th. CSR-DCF with level-two integration achieves the top rank by a clear margin on PTB. Our framework is particularly powerful in occlusion scenarios, where it provides a 13.5% average improvement and 26% for the best tracker (CSR-DCF).
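The two integration levels described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class and method names (`RGBDWrapper`, `localize`, `update`), the thresholds, and the visibility rule are all hypothetical, and a crude depth threshold stands in for the graph-cut segmentation used in the paper. The sketch only shows the control flow: the model update is skipped while the target is judged occluded (level one), and the foreground mask is optionally passed to the update (level two).

```python
from typing import List, Tuple

BBox = Tuple[int, int, int, int]  # (x, y, width, height), assumed inside the image
Mask = List[List[bool]]           # foreground mask cropped to the bounding box


def depth_threshold_mask(depth, bbox: BBox, target_depth: float,
                         tolerance: float) -> Mask:
    """Mark pixels whose depth is close to the target's depth.

    Assumption: this simple threshold replaces the paper's graph-cut
    segmentation purely for illustration.
    """
    x, y, w, h = bbox
    return [[abs(depth[r][c] - target_depth) <= tolerance
             for c in range(x, x + w)]
            for r in range(y, y + h)]


def foreground_ratio(mask: Mask) -> float:
    """Fraction of pixels in the box that are labelled foreground."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total if total else 0.0


class RGBDWrapper:
    """Wraps a short-term RGB tracker exposing localize() and update()."""

    def __init__(self, tracker, target_depth: float, tolerance: float = 0.3,
                 min_visible: float = 0.3, use_mask: bool = False):
        self.tracker = tracker
        self.target_depth = target_depth  # hypothetical: known target depth
        self.tolerance = tolerance        # hypothetical depth tolerance
        self.min_visible = min_visible    # hypothetical visibility threshold
        self.use_mask = use_mask          # True selects level-two integration

    def step(self, rgb, depth):
        bbox = self.tracker.localize(rgb)
        mask = depth_threshold_mask(depth, bbox, self.target_depth,
                                    self.tolerance)
        visible = foreground_ratio(mask) >= self.min_visible
        if visible:
            # Level one: update only while the target is judged visible.
            if self.use_mask:
                # Level two: also hand the foreground mask to the update.
                self.tracker.update(rgb, bbox, mask)
            else:
                self.tracker.update(rgb, bbox)
        return bbox, visible


class FixedBoxTracker:
    """Toy tracker for demonstration: fixed box, counts model updates."""

    def __init__(self, bbox: BBox):
        self.bbox = bbox
        self.updates = 0
        self.last_mask = None

    def localize(self, rgb) -> BBox:
        return self.bbox

    def update(self, rgb, bbox: BBox, mask=None) -> None:
        self.updates += 1
        self.last_mask = mask
```

With this structure, a frame in which an occluder covers the box leaves the wrapped tracker's model untouched, and updates resume once enough pixels in the box return to the target's depth.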


Keywords: Visual object tracking, RGBD tracking



Uğur Kart was supported by two projects: the Business Finland project “360 Video Intelligence - For Research Benefit” with Nokia, Lynx, JJ-Net, BigHill, and Leonidas, and the Business Finland FiDiPro project “Pocket-Sized Big Visual Data”. Jiří Matas was supported by the OP VVV MEYS project CZ.02.1.01/0.0/0.0/16_019/0000765 “Research Center for Informatics”.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Laboratory of Signal Processing, Tampere University of Technology, Tampere, Finland
  2. The Center for Machine Perception, Czech Technical University, Prague, Czech Republic
