VFSC: A Very Fast Sparse Clustering to Cluster Faces from Videos

Nguyen, Dinh-Luan; Tran, Minh-Triet

doi:10.1007/978-3-319-54427-4_31

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10117)

Included in the following conference series: Asian Conference on Computer Vision

Abstract

Face clustering is the task of partitioning facial images into disjoint clusters. In this paper, we investigate the specific problem of face clustering in videos. Unlike the traditional face clustering problem, which starts from a collection of images from multiple sources, our task deals with a set of face tracks annotated with frame IDs. Thus, we can exploit two kinds of prior knowledge about the temporal and spatial information in face tracks: the sequence of faces in the same track and the faces that co-occur in the same frame. We combine this prior knowledge with the characteristics of low-rank representation to introduce a new lightweight but effective method called Very Fast Sparse Clustering (VFSC). Owing to its superior speed, VFSC can be adapted to large-scale real-time applications. Experimental results on two public datasets (BF0502 and Notting-Hill) show that the proposed method significantly surpasses state-of-the-art algorithms in both speed and clustering accuracy (up to 250 times faster and 10% higher in accuracy).
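
The two kinds of prior knowledge named in the abstract can be read as pairwise constraints between faces. The sketch below is only an illustration of that reading, not the VFSC algorithm. It assumes that faces in the same track share an identity and that faces from different tracks appearing in the same frame belong to different people. The tuple layout `(face_id, track_id, frame_id)` and the function name `pairwise_constraints` are hypothetical and do not come from the paper.

```python
from itertools import combinations


def pairwise_constraints(faces):
    """Derive pairwise constraints from face-track annotations.

    faces: iterable of (face_id, track_id, frame_id) tuples.
           This layout is an assumption made for illustration.
    Returns (must_link, cannot_link), two sets of (face_id, face_id) pairs
    with the smaller id first.
    """
    must_link, cannot_link = set(), set()
    for (fa, ta, fra), (fb, tb, frb) in combinations(faces, 2):
        pair = (min(fa, fb), max(fa, fb))
        if ta == tb:
            # Same track: assumed to show the same person over time.
            must_link.add(pair)
        elif fra == frb:
            # Different tracks in one frame: assumed to be different people.
            cannot_link.add(pair)
    return must_link, cannot_link


# Toy input: track "A" spans frames 1-2, track "B" spans frames 2-3.
faces = [
    (0, "A", 1), (1, "A", 2),
    (2, "B", 2), (3, "B", 3),
]
must_link, cannot_link = pairwise_constraints(faces)
print(sorted(must_link))    # pairs taken from the same track
print(sorted(cannot_link))  # pairs co-occurring in one frame
```

In this toy input, faces 1 and 2 both appear in frame 2 but come from different tracks, so they form the only cannot-link pair. How such constraints would enter a low-rank representation is specific to the paper and is not shown here.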

Author information

Authors and Affiliations

University of Science, VNU-HCMC, Ho Chi Minh City, Vietnam
Dinh-Luan Nguyen & Minh-Triet Tran

Corresponding author

Correspondence to Dinh-Luan Nguyen.

Editor information

Editors and Affiliations

Institute of Information Science, Academia Sinica, Taipei, Taiwan
Chu-Song Chen
Tsinghua University, Beijing, China
Jiwen Lu
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
Kai-Kuang Ma

About this paper

Cite this paper

Nguyen, DL., Tran, MT. (2017). VFSC: A Very Fast Sparse Clustering to Cluster Faces from Videos. In: Chen, CS., Lu, J., Ma, KK. (eds) Computer Vision – ACCV 2016 Workshops. ACCV 2016. Lecture Notes in Computer Science, vol 10117. Springer, Cham. https://doi.org/10.1007/978-3-319-54427-4_31

DOI: https://doi.org/10.1007/978-3-319-54427-4_31
Published: 16 March 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54426-7
Online ISBN: 978-3-319-54427-4
eBook Packages: Computer Science, Computer Science (R0)