Tracking with Dynamic Hidden-State Shape Models

  • Zheng Wu
  • Margrit Betke
  • Jingbin Wang
  • Vassilis Athitsos
  • Stan Sclaroff
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5302)


Hidden State Shape Models (HSSMs) were previously proposed to represent and detect objects in images that exhibit not just deformation of their shape but also variation in their structure. In this paper, we introduce Dynamic Hidden-State Shape Models (DHSSMs) to track and recognize the non-rigid motion of such objects, for example, human hands. Our recursive Bayesian filtering method, called DP-Tracking, combines an exhaustive local search for a match between image features and model states with a dynamic programming approach to find a global registration between the model and the object in the image. Our contribution is a technique to exploit the hierarchical structure of the dynamic programming approach that on average considerably speeds up the search for matches. We also propose to embed an online learning approach into the tracking mechanism that updates the DHSSM dynamically. The learning approach ensures that the DHSSM accurately represents the tracked object and distinguishes any clutter potentially present in the image. Our experiments show that our method can recognize the digits of a hand while the fingers are being moved and curled to various degrees. The method is robust to various illumination conditions, the presence of clutter, occlusions, and some types of self-occlusions. The experiments demonstrate a significant improvement in both efficiency and accuracy of recognition compared to the non-recursive way of frame-by-frame detection.
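The abstract does not spell out the DP-Tracking recurrences, so the following is only a generic Viterbi-style sketch of how dynamic programming can find a globally optimal registration between an ordered chain of model states and candidate image features. The function name `viterbi_register`, the `unary` cost table, and the `pairwise` transition cost are hypothetical names chosen for illustration; they are not taken from the paper, and the sketch omits the hierarchical speed-up and the online model update that the authors describe.

```python
def viterbi_register(unary, pairwise):
    """Generic chain dynamic programming (assumed setup, not the paper's method).

    unary[t][k]      : cost of matching model state t to candidate feature k
    pairwise(t, j, k): cost of moving from candidate j (state t-1)
                       to candidate k (state t)
    Returns (total_cost, path), where path[t] is the chosen candidate for state t.
    """
    n_states = len(unary)
    n_cands = len(unary[0])

    cost = [list(unary[0])]  # cost[t][k]: best cost of any path ending at (t, k)
    back = []                # back[t-1][k]: best predecessor of (t, k)

    for t in range(1, n_states):
        row_cost, row_back = [], []
        for k in range(n_cands):
            # Local search over all predecessors of candidate k.
            best_j = min(
                range(n_cands),
                key=lambda j: cost[t - 1][j] + pairwise(t, j, k),
            )
            row_cost.append(cost[t - 1][best_j] + pairwise(t, best_j, k) + unary[t][k])
            row_back.append(best_j)
        cost.append(row_cost)
        back.append(row_back)

    # Pick the cheapest final candidate, then trace back the global optimum.
    k = min(range(n_cands), key=lambda j: cost[-1][j])
    total = cost[-1][k]
    path = [k]
    for row_back in reversed(back):
        k = row_back[k]
        path.append(k)
    path.reverse()
    return total, path


# Toy example: three model states, two candidate features per state.
toy_unary = [[0, 5], [5, 0], [0, 5]]
toy_total, toy_path = viterbi_register(toy_unary, lambda t, j, k: abs(j - k))
```

In this toy setup the per-state costs pull the path toward candidates 0, 1, 0, while the pairwise term penalizes each switch; the dynamic program weighs both and returns the jointly cheapest assignment rather than the best match for each state in isolation.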


Keywords: Support Vector Machine; Dynamic Programming; Dynamic Time Warping; Edge Point; Dynamic Programming Approach

Supplementary material

978-3-540-88682-2_49_MOESM1_ESM.mpg (29.3 MB)



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Zheng Wu (1)
  • Margrit Betke (1)
  • Jingbin Wang (2)
  • Vassilis Athitsos (3)
  • Stan Sclaroff (1)

  1. Computer Science Department, Boston University, Boston, USA
  2. Google Inc., USA
  3. Computer Science and Engineering Department, University of Texas at Arlington, Arlington, USA
