Bootstrap Initialization of Nonparametric Texture Models for Tracking
In bootstrap initialization for tracking, we exploit a weak prior model used to track a target to learn a stronger model, without manual intervention. We define a general formulation of this problem and present a simple taxonomy of such tasks.
The formulation is instantiated with algorithms for bootstrap initialization in two domains: In one, the goal is tracking the position of a face at a desktop; we learn color models of faces, using weak knowledge about the shape and movement of faces in video. In the other task, we seek coarse estimates of head orientation; we learn a person-specific ellipsoidal texture model for heads, given a generic model. For both tasks, we use nonparametric models of surface texture.
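The first task can be illustrated with a minimal sketch, which is not the authors' algorithm: a weak model labels pixels as target or background, and those labels are used to learn a nonparametric (histogram) color model. Everything named here is an assumption made for illustration: the `weak_tracker` callback stands in for the weak shape-and-motion prior, frames are represented as dictionaries from pixel coordinates to RGB triples, and the uniform pseudocount is a generic smoothing choice (it equals the posterior mean under a symmetric Dirichlet prior).

```python
from collections import Counter


def quantize(rgb, bins=4):
    """Map an (r, g, b) pixel with 0-255 channels to a coarse histogram bin."""
    step = 256 // bins
    return tuple(channel // step for channel in rgb)


def learn_histogram(pixels, bins=4, pseudocount=1.0):
    """Return a function giving the smoothed probability of a pixel's color bin.

    The pseudocount is added to every one of the bins**3 cells, so colors
    never seen in training still receive a small nonzero probability.
    """
    counts = Counter(quantize(p, bins) for p in pixels)
    total = len(pixels) + pseudocount * bins ** 3

    def prob(rgb):
        return (counts[quantize(rgb, bins)] + pseudocount) / total

    return prob


def bootstrap_color_model(frames, weak_tracker, bins=4):
    """Learn a target-vs-background color likelihood ratio from weak labels.

    frames:       iterable of dicts mapping (x, y) -> (r, g, b)  [assumed format]
    weak_tracker: hypothetical callback, (x, y) -> True if the weak prior
                  model places that pixel on the target
    """
    target, background = [], []
    for frame in frames:
        for xy, rgb in frame.items():
            (target if weak_tracker(xy) else background).append(rgb)

    p_target = learn_histogram(target, bins)
    p_background = learn_histogram(background, bins)
    # Ratios above 1 indicate colors more typical of the target.
    return lambda rgb: p_target(rgb) / p_background(rgb)
```

The returned function scores a color by how much more likely it is under the target histogram than under the background histogram. Once learned, such a ratio could drive a tracker on its own, without the weak model that produced the labels.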
Experimental results verify that bootstrap initialization is feasible in both domains. We find that (1) independence assumptions in the learning process can be violated to a significant degree, if enough data is taken; (2) there are both domain-independent and domain-specific means to mitigate learning bias; and (3) repeated bootstrapping does not necessarily result in increasingly better models.
Keywords: Model Point, Dirichlet Distribution, Head Orientation, Texture Model, Acquisition Function