Spatio-temporal Embedding for Statistical Face Recognition from Video

  • Wei Liu
  • Zhifeng Li
  • Xiaoou Tang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3952)


This paper addresses the problem of learning an appropriate feature representation from video to benefit video-based face recognition. By simultaneously exploiting spatial and temporal information, the problem is posed as learning a Spatio-Temporal Embedding (STE) from raw video. The STE of a video sequence is defined as a condensed version of the sequence that captures the essence of its space-time characteristics. Relying on the co-occurrence statistics and supervised signatures provided by training videos, the STE preserves the intrinsic temporal structures hidden in the video volume while encoding discriminative cues in the spatial domain. To construct the STE, we propose two novel techniques, Bayesian keyframe learning and nonparametric discriminant embedding (NDE), for temporal and spatial learning, respectively. Using the learned STEs, we derive a statistical formulation of the recognition problem with a probabilistic fusion model. On a large face video database containing more than 200 training and testing sequences, our approach consistently outperforms state-of-the-art methods, achieving perfect recognition accuracy.



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Wei Liu (1)
  • Zhifeng Li (1)
  • Xiaoou Tang (1, 2)
  1. Department of Information Engineering, The Chinese University of Hong Kong, Hong Kong, China
  2. Microsoft Research Asia, Beijing, China
