Abstract
This paper describes a novel method for tracking complex non-rigid motions by learning the intrinsic object structure. The approach builds on and extends work on non-linear dimensionality reduction for object representation, object dynamics modeling, and particle-filter-style tracking. First, a dimensionality reduction and density estimation algorithm is derived for unsupervised learning of the object's intrinsic representation; the resulting non-rigid part of the object state has as few as 2–3 dimensions. Second, a dynamical model is derived and trained on this intrinsic representation. Third, the learned intrinsic object structure is integrated into a particle-filter-style tracker. The intrinsic object representation is shown to have some interesting properties, and the dynamical model built on it makes the particle-filter-style tracker more robust and reliable. Extensive experiments are carried out on challenging non-rigid motions such as fish twisting with self-occlusion, large inter-frame lip motion, and facial expressions with global head rotation. Quantitative results compare the proposed tracker with the existing tracker. The proposed method also has the potential to solve other types of tracking problems.
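To make the third step concrete, the following is a minimal sketch of a generic particle-filter-style tracker running in a 2-D state space, the size of the intrinsic representation mentioned above. It is not the authors' tracker: the random-walk dynamics, the Gaussian likelihood, and all names and parameters (`particle_filter_step`, `track`, `process_std`, `obs_std`) are assumptions chosen for illustration, standing in for the learned dynamical model and the image-based observation model.

```python
import math
import random


def resample(particles, weights, rng):
    """Draw len(particles) samples with probability proportional to weights."""
    return rng.choices(particles, weights=weights, k=len(particles))


def particle_filter_step(particles, observation, rng,
                         process_std=0.1, obs_std=0.2):
    """One predict / weight / resample cycle over 2-D states.

    The random-walk dynamics and Gaussian likelihood are placeholders,
    not the learned dynamical model or image likelihood of the paper.
    """
    # Predict: diffuse each particle with a random-walk dynamical model.
    predicted = [(x + rng.gauss(0.0, process_std),
                  y + rng.gauss(0.0, process_std)) for x, y in particles]
    # Weight: Gaussian likelihood of the observation given each particle.
    weights = [math.exp(-((x - observation[0]) ** 2 +
                          (y - observation[1]) ** 2) / (2.0 * obs_std ** 2))
               for x, y in predicted]
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
        total = float(len(predicted))
    weights = [w / total for w in weights]
    # Estimate: weighted mean of the predicted particles.
    estimate = (sum(w * x for w, (x, _) in zip(weights, predicted)),
                sum(w * y for w, (_, y) in zip(weights, predicted)))
    return resample(predicted, weights, rng), estimate


def track(observations, n_particles=500, seed=0):
    """Run the filter over a sequence of 2-D observations."""
    rng = random.Random(seed)
    # Initialise particles uniformly over an assumed state range.
    particles = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
                 for _ in range(n_particles)]
    estimates = []
    for obs in observations:
        particles, est = particle_filter_step(particles, obs, rng)
        estimates.append(est)
    return estimates
```

Calling `track` on a list of 2-D points returns one state estimate per frame. Because the state has only two dimensions, a few hundred particles cover it densely, which is one plausible reason a low-dimensional representation helps this style of tracker.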
Additional information
This work is supported by the National Natural Science Foundation of China (Grant Nos. 60273005 and 60332010).
Qiang Wang received the B.E. degree from Tsinghua University in 1998. He is now a Ph.D. candidate in the Department of Computer Science and Technology at Tsinghua University. His research interests lie in human-computer interaction, facial modeling and animation, and visual tracking.
Hai-Zhou Ai is a professor in the Dept. of Computer Science & Technology, Tsinghua University. He received his B.S., M.S., and Ph.D. degrees, all from Tsinghua University, in 1985, 1988, and 1991, respectively. From 1994 to 1996 he was a postdoctoral researcher at the Flexible Production System Laboratory of the University of Brussels, Belgium. His current research interests are face information processing, biometrics, and visual surveillance.
Guang-You Xu is a chair professor in the Dept. of Computer Science & Technology, Tsinghua University. He graduated from the Dept. of Automatic Control, Tsinghua University, in 1963. His research interests include computer vision, multimedia computing, and human-computer interaction.
About this article
Cite this article
Wang, Q., Ai, HZ. & Xu, GY. Learning-based tracking of complex non-rigid motion. J. Compt. Sci. & Technol. 19, 489–500 (2004). https://doi.org/10.1007/BF02944750
DOI: https://doi.org/10.1007/BF02944750