Face Association across Unconstrained Video Frames Using Conditional Random Fields

  • Ming Du
  • Rama Chellappa
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7578)


Automatic face association across unconstrained video frames has many practical applications. Recent advances in object detection have made it possible to replace traditional tracking-based association approaches with more robust detection-based ones. However, association remains very challenging for real-world unconstrained videos, especially when the subjects are on a moving platform and at distances exceeding several tens of meters. In this paper, we present a novel solution based on a Conditional Random Field (CRF) framework. The CRF approach not only gives a probabilistic and systematic treatment of the problem, but also elegantly combines global and local features. When ambiguities in labels cannot be resolved using face appearance alone, our method relies on multiple contextual features to provide further evidence for association. Our algorithm works in an on-line mode and reliably handles real-world videos. Experimental results on challenging video data, together with comparisons with other methods, demonstrate the effectiveness of our method.
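To make the idea of combining appearance and contextual evidence concrete, the following is a minimal sketch of a CRF-style labeling step for the detections in a single frame. It is not the model from the paper: the energy terms, the hard mutual-exclusion constraint, the brute-force inference, and all names (`laplace_nll`, `map_labels`, `null_cost`, the distance matrices) are assumptions chosen for illustration. Each detection receives either the index of a known subject or a null label (for example, a false detection), and the labeling with the lowest total energy is returned.

```python
import itertools
import math

NULL = None  # hypothetical "null state": the detection matches no known subject


def laplace_nll(x, scale, loc=0.0):
    """Negative log-likelihood of x under a Laplace distribution."""
    return math.log(2.0 * scale) + abs(x - loc) / scale


def map_labels(appearance_dist, context_dist, null_cost,
               scale_app=1.0, scale_ctx=1.0):
    """Brute-force MAP labeling of the detections in one frame.

    appearance_dist[i][k] and context_dist[i][k] are assumed distances
    between detection i and known subject k (smaller means more similar).
    Returns (labels, energy), where labels[i] is a subject index or NULL.
    """
    n_det = len(appearance_dist)
    n_sub = len(appearance_dist[0]) if n_det else 0
    candidates = list(range(n_sub)) + [NULL]

    best, best_energy = (), math.inf
    for assignment in itertools.product(candidates, repeat=n_det):
        # Pairwise constraint: two detections in the same frame cannot
        # share a subject (treated here as an infinite-energy potential).
        real = [label for label in assignment if label is not NULL]
        if len(real) != len(set(real)):
            continue

        # Unary terms: appearance and context evidence, or a flat null cost.
        energy = 0.0
        for i, label in enumerate(assignment):
            if label is NULL:
                energy += null_cost
            else:
                energy += laplace_nll(appearance_dist[i][label], scale_app)
                energy += laplace_nll(context_dist[i][label], scale_ctx)

        if energy < best_energy:
            best, best_energy = assignment, energy

    return list(best), best_energy
```

In this sketch, a detection whose appearance distances to two subjects are equal can still be labeled when its context distances differ, and a detection that is far from every subject falls back to the null label once its best unary energy exceeds `null_cost`. Exhaustive enumeration is only practical for a handful of detections and subjects; a real system would use an approximate inference method.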


Keywords: False Detection; Conditional Random Field; Laplace Distribution; Face Appearance; Null State



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Ming Du, Center for Automation Research, University of Maryland, College Park, USA
  • Rama Chellappa, Center for Automation Research, University of Maryland, College Park, USA
