Jointly Learning Dictionaries and Subspace Structure for Video-Based Face Recognition

Zhang, Guangxiao; He, Ran; Davis, Larry S.

doi:10.1007/978-3-319-16811-1_7

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9005)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

Video-sharing websites and surveillance scenarios often produce large numbers of face videos. This paper proposes a joint dictionary learning and subspace segmentation method for video-based face recognition (VFR). We assume that the face images from one subject's video lie in a union of multiple linear subspaces, and that there exists a global dictionary that represents these images and segments them into their corresponding subspaces. This assumption results in a “chicken and egg” problem, in which subspace clustering and dictionary learning are mutually dependent. To solve this problem, we propose a joint optimization model with three parts. The first part seeks a low-rank representation for subspace segmentation; the second part encourages the dictionary to represent the data accurately while tolerating frame-wise corruption or outliers; and the third part is a regularization on the dictionary. An alternating minimization method is employed as an efficient solution to the proposed joint formulation. In each iteration, it alternately updates the subspace structure and the dictionary, so that each update refines the other. Experiments on three video-based face databases show that our approach consistently outperforms state-of-the-art methods.
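The alternating scheme described in the abstract can be illustrated with a minimal sketch. The objective used here is an assumption, not the paper's formulation: a nuclear norm on the representation Z (low-rank part), a plain Frobenius-norm fit term in place of the robust, outlier-tolerant loss the abstract mentions, and a Frobenius-norm penalty on the dictionary D. The function names, the weights `lam` and `gamma`, and the synthetic data are all hypothetical.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def objective(X, D, Z, lam, gamma):
    """Assumed objective: ||Z||_* + lam/2 ||X - DZ||_F^2 + gamma/2 ||D||_F^2."""
    nuclear = np.linalg.svd(Z, compute_uv=False).sum()
    fit = 0.5 * lam * np.linalg.norm(X - D @ Z, "fro") ** 2
    reg = 0.5 * gamma * np.linalg.norm(D, "fro") ** 2
    return nuclear + fit + reg

def alternate(X, n_atoms, lam=10.0, gamma=0.1, n_iter=30, seed=0):
    """Alternate between a low-rank update of Z and a ridge update of D."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], n_atoms))
    Z = np.zeros((n_atoms, X.shape[1]))
    history = [objective(X, D, Z, lam, gamma)]
    for _ in range(n_iter):
        # Z-step: one proximal-gradient step with D fixed.
        # lip is the Lipschitz constant of the gradient of the fit term.
        lip = lam * max(np.linalg.norm(D, 2) ** 2, 1e-12)
        grad = lam * D.T @ (D @ Z - X)
        Z = svt(Z - grad / lip, 1.0 / lip)
        # D-step: closed-form ridge solution with Z fixed,
        # D (lam Z Z^T + gamma I) = lam X Z^T.
        A = lam * Z @ Z.T + gamma * np.eye(n_atoms)
        D = np.linalg.solve(A, lam * Z @ X.T).T
        history.append(objective(X, D, Z, lam, gamma))
    return D, Z, history

def make_two_subspace_data(dim=20, rank=3, n_per=15, seed=1):
    """Synthetic stand-in for video frames: points from two linear subspaces."""
    rng = np.random.default_rng(seed)
    blocks = []
    for _ in range(2):
        basis, _ = np.linalg.qr(rng.standard_normal((dim, rank)))
        blocks.append(basis @ rng.standard_normal((rank, n_per)))
    return np.hstack(blocks)

X = make_two_subspace_data()
D, Z, history = alternate(X, n_atoms=8)
print(f"objective: {history[0]:.2f} -> {history[-1]:.2f}")
```

Because the Z-step is a proximal-gradient step with step size 1/lip and the D-step is an exact minimization, the assumed objective does not increase from one iteration to the next. A robust variant would replace the Frobenius fit term with a column-wise loss, which changes both updates.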

Notes

1. Here and for the rest of the paper, a variable with a superscript * denotes the optimal solution. This notation should not be confused with the Hermitian transpose symbol.

Acknowledgement

This work was supported by the Army Research Office MURI Grant W911NF-09-1-0383. We also thank Dr. Ruiping Wang for sharing the processed data.

Author information

Authors and Affiliations

Institute for Advanced Computer Studies, University of Maryland, College Park, MD, 20742, USA
Guangxiao Zhang
Institute of Automation, Chinese Academy of Sciences, 95 Zhongguancun East Road, P.O. Box 2728, Beijing, 100190, China
Ran He & Larry S. Davis

Corresponding author

Correspondence to Guangxiao Zhang.

Editor information

Editors and Affiliations

Technische Universität München, Garching, Bayern, Germany
Daniel Cremers
University of Adelaide, Adelaide, South Australia, Australia
Ian Reid
Keio University, Yokohama, Kanagawa, Japan
Hideo Saito
University of California at Merced, Merced, California, USA
Ming-Hsuan Yang

About this paper

Cite this paper

Zhang, G., He, R., Davis, L.S. (2015). Jointly Learning Dictionaries and Subspace Structure for Video-Based Face Recognition. In: Cremers, D., Reid, I., Saito, H., Yang, M.-H. (eds) Computer Vision – ACCV 2014. Lecture Notes in Computer Science, vol 9005. Springer, Cham. https://doi.org/10.1007/978-3-319-16811-1_7

DOI: https://doi.org/10.1007/978-3-319-16811-1_7
Published: 16 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16810-4
Online ISBN: 978-3-319-16811-1
eBook Packages: Computer Science, Computer Science (R0)
