Tracking Using Multilevel Quantizations

  • Zhibin Hong
  • Chaohui Wang
  • Xue Mei
  • Danil Prokhorov
  • Dacheng Tao
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8694)


Most object tracking methods exploit only a single quantization of the image space: pixels, superpixels, or bounding boxes, each of which has advantages and disadvantages. It is highly unlikely that a common optimal quantization level, suitable for tracking all objects in all environments, exists. We therefore propose a hierarchical appearance representation model for tracking, based on a graphical model that exploits shared information across multiple quantization levels. The tracker aims to find the most probable position of the target by jointly classifying the pixels and superpixels and obtaining the best configuration across all levels. The motion of the bounding box is taken into consideration, while Online Random Forests are used to provide pixel- and superpixel-level quantizations and are progressively updated on the fly. By appropriately considering the multilevel quantizations, our tracker exhibits not only excellent performance in handling non-rigid object deformation, but also robustness to occlusions. A quantitative evaluation is conducted on two benchmark datasets: a non-rigid object tracking dataset (11 sequences) and the CVPR2013 tracking benchmark (50 sequences). Experimental results show that our tracker overcomes various tracking challenges and is superior to a number of other popular tracking methods.
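The idea of choosing a target position by jointly labeling pixels and superpixels can be illustrated with a small toy example. The sketch below is not the model from the paper: it assumes foreground probabilities for pixels and superpixels are already given (in the paper these come from Online Random Forests), it uses a hypothetical energy with only unary, pixel-to-superpixel consistency, and pixel-to-box consistency terms, and it leaves out neighbor smoothness and bounding-box motion. All names (`box_energy`, `best_box`, `lam`, `mu`) are illustrative assumptions.

```python
import math

def neg_log(p, eps=1e-6):
    """Negative log of a probability, clamped away from 0 and 1."""
    return -math.log(min(max(p, eps), 1.0 - eps))

def unary(prob_fg, label):
    """Cost of assigning `label` (1 = target, 0 = background)."""
    return neg_log(prob_fg if label == 1 else 1.0 - prob_fg)

def box_energy(box, pixel_prob, sp_map, sp_prob, lam=0.5, mu=0.5):
    """Minimum toy energy over all pixel and superpixel labelings
    for one candidate box (x0, y0, x1, y1), bounds inclusive.

    lam penalizes a pixel disagreeing with its superpixel (assumed weight).
    mu penalizes a pixel label disagreeing with the box (assumed weight).
    """
    x0, y0, x1, y1 = box
    # Group pixel coordinates by superpixel id.
    members = {}
    for y, row in enumerate(sp_map):
        for x, s in enumerate(row):
            members.setdefault(s, []).append((x, y))
    total = 0.0
    # With the box fixed and no neighbor terms, the energy decomposes
    # per superpixel, so each one can be minimized exactly on its own.
    for s, pixels in members.items():
        best = math.inf
        for sp_label in (0, 1):
            cost = unary(sp_prob[s], sp_label)
            for x, y in pixels:
                in_box = 1 if (x0 <= x <= x1 and y0 <= y <= y1) else 0
                cost += min(
                    unary(pixel_prob[y][x], lab)
                    + lam * (lab != sp_label)
                    + mu * (lab != in_box)
                    for lab in (0, 1)
                )
            best = min(best, cost)
        total += best
    return total

def best_box(candidates, pixel_prob, sp_map, sp_prob):
    """Pick the candidate box with the lowest joint energy."""
    return min(candidates,
               key=lambda b: box_energy(b, pixel_prob, sp_map, sp_prob))
```

Because this toy energy has no terms linking neighboring pixels or superpixels, the exact minimum for each box is found by simple enumeration. A model with such smoothness terms would need a proper inference method instead.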


Tracking, Multilevel Quantizations, Online Random Forests, Non-rigid Object Tracking, Conditional Random Fields



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Zhibin Hong (1)
  • Chaohui Wang (2)
  • Xue Mei (3)
  • Danil Prokhorov (3)
  • Dacheng Tao (1)
  1. Centre for Quantum Computation and Intelligent Systems, Faculty of Engineering and Information Technology, University of Technology, Sydney, Australia
  2. Max Planck Institute for Intelligent Systems, Tübingen, Germany
  3. Toyota Research Institute, North America, Ann Arbor, USA
