Bootstrap Initialization of Nonparametric Texture Models for Tracking

  • Kentaro Toyama
  • Ying Wu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1843)


In bootstrap initialization for tracking, we exploit a weak prior model used to track a target in order to learn a stronger model, without manual intervention. We define a general formulation of this problem and present a simple taxonomy of such tasks.

The formulation is instantiated with algorithms for bootstrap initialization in two domains: In one, the goal is tracking the position of a face at a desktop; we learn color models of faces, using weak knowledge about the shape and movement of faces in video. In the other task, we seek coarse estimates of head orientation; we learn a person-specific ellipsoidal texture model for heads, given a generic model. For both tasks, we use nonparametric models of surface texture.
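To make the first task concrete, here is a minimal illustrative sketch (not the authors' implementation) of bootstrapping a nonparametric color model: a weak prior detector supplies noisy face-pixel labels, and those pixels are accumulated into a normalized 3-D color histogram that serves as the learned, stronger model. The `weak_detector` below is a hypothetical stand-in; in the paper the weak model uses shape and movement cues.

```python
import numpy as np

def weak_detector(frame):
    # Hypothetical stand-in for the weak prior model (the paper uses
    # shape and motion cues); here we simply assume the face occupies
    # the central region of the frame.
    h, w, _ = frame.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[h // 4 : 3 * h // 4, w // 4 : 3 * w // 4] = True
    return mask

def bootstrap_color_model(frames, bins=16):
    # Nonparametric color model: a normalized 3-D histogram over
    # quantized RGB values, accumulated from weakly labeled pixels.
    hist = np.zeros((bins, bins, bins))
    step = 256 // bins
    for frame in frames:
        pixels = frame[weak_detector(frame)]        # weakly labeled face pixels
        idx = np.clip(pixels // step, 0, bins - 1)  # quantize each RGB channel
        np.add.at(hist, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
    return hist / hist.sum()                        # approximates P(color | face)
```

Once learned, the histogram can score any pixel's face likelihood by a single lookup, giving a color-based tracker that no longer depends on the weak shape/motion cues.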

Experimental results verify that bootstrap initialization is feasible in both domains. We find that (1) independence assumptions in the learning process can be violated to a significant degree, if enough data is taken; (2) there are both domain-independent and domain-specific means to mitigate learning bias; and (3) repeated bootstrapping does not necessarily result in increasingly better models.


Keywords: Model Point, Dirichlet Distribution, Head Orientation, Texture Model, Acquisition Function



Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Kentaro Toyama (1)
  • Ying Wu (2)
  1. Microsoft Research, Redmond, USA
  2. University of Illinois (UIUC), Urbana, USA
