
Face-Specific Data Augmentation for Unconstrained Face Recognition

  • Iacopo Masi
  • Anh Tuấn Trần
  • Tal Hassner
  • Gozde Sahin
  • Gérard Medioni

Abstract

We identify two issues as key to developing effective face recognition systems: maximizing the appearance variations of training images and minimizing appearance variations in test images. The former is required to train the system for whatever appearance variations it will ultimately encounter and is often addressed by collecting massive training sets with millions of face images. The latter involves various forms of appearance normalization that remove distracting nuisance factors at test time and make test faces easier to compare. We describe novel, efficient face-specific data augmentation techniques and show them to be ideally suited for both purposes. By using knowledge of faces, their 3D shapes, and appearances, we show the following: (a) we can artificially enrich training data for face recognition with face-specific appearance variations; (b) this synthetic training data can be efficiently produced online, thereby reducing the massive storage requirements of large-scale training sets and simplifying training for many appearance variations; and (c) the same fast data augmentation techniques can be applied at test time to reduce appearance variations and improve face representations. Together with additional technical novelties, we describe a highly effective face recognition pipeline which, at the time of submission, obtains state-of-the-art results across multiple benchmarks. Portions of this paper were previously published by Masi et al. (2016b, 2017).

Keywords

Face recognition · Deep learning · Data augmentation

Notes

Acknowledgements

The authors wish to thank Jongmoo Choi for his help in this project. This research is based upon work supported in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via IARPA 2014-14071600011. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Moreover, we gratefully acknowledge the support of NVIDIA Corporation with the donation of the NVIDIA Titan X GPU used for this research.

Supplementary material

Supplementary material 1: 11263_2019_1178_MOESM1_ESM.pdf (PDF, 10270 KB)

References

  1. AbdAlmageed, W., Wu, Y., Rawls, S., Harel, S., Hassner, T., Masi, I., Choi, J., Leksut, J., Kim, J., Natarajan, P., Nevatia, R., & Medioni, G. (2016). Face recognition using deep multi-pose representations. In Winter conference on applications of computer vision.
  2. Baltrusaitis, T., Robinson, P., & Morency, L. P. (2013). Constrained local neural fields for robust facial landmark detection in the wild. In Proceedings of international conference on computer vision workshops.
  3. Bansal, A., Nanduri, A., Castillo, C. D., Ranjan, R., & Chellappa, R. (2017). UMDFaces: An annotated face dataset for training deep networks. In International joint conference on biometrics.
  4. Cao, K., Rong, Y., Li, C., Tang, X., & Change Loy, C. (2018). Pose-robust face recognition via deep residual equivariant mapping. In Proceedings of conference on computer vision pattern recognition, pp. 5187–5196.
  5. Chang, F., Tran, A., Hassner, T., Masi, I., Nevatia, R., & Medioni, G. (2017). FacePoseNet: Making a case for landmark-free face alignment. In 7th IEEE international workshop on analysis and modeling of faces and gestures, ICCV workshops.
  6. Chang, F. J., Tran, A. T., Hassner, T., Masi, I., Nevatia, R., & Medioni, G. (2018). Expnet: Landmark-free, deep, 3d facial expressions. In Automatic face and gesture recognition, pp. 122–129.
  7. Chang, F. J., Tran, A. T., Hassner, T., Masi, I., Nevatia, R., & Medioni, G. (2019). Deep, landmark-free FAME: Face alignment, modeling, and expression estimation. International Journal of Computer Vision. https://doi.org/10.1007/s11263-019-01151-x.
  8. Chatfield, K., Simonyan, K., Vedaldi, A., & Zisserman, A. (2014). Return of the devil in the details: Delving deep into convolutional nets. In Proceedings of British machine vision conference.
  9. Chen, J. C., Ranjan, R., Kumar, A., Chen, C. H., Patel, V. M., & Chellappa, R. (2015). An end-to-end system for unconstrained face verification with deep convolutional neural networks. In Proceedings of conference on computer vision pattern recognition workshops, pp. 118–126.
  10. Chen, J. C., Patel, V. M., & Chellappa, R. (2016). Unconstrained face verification using deep CNN features. In Winter conference on application of computer vision.
  11. Chowdhury, A. R., Lin, T. Y., Maji, S., & Learned-Miller, E. (2016). One-to-many face recognition with bilinear CNNs. In Winter conference on application of computer vision, IEEE, pp. 1–9.
  12. Crispell, D. E., Biris, O., Crosswhite, N., Byrne, J., & Mundy, J. L. (2016). Dataset augmentation for pose and lighting invariant face recognition. In Applied imagery pattern recognition workshop (AIPR).
  13. Crosswhite, N., Byrne, J., Stauffer, C., Parkhi, O., Cao, Q., & Zisserman, A. (2017). Template adaptation for face verification and identification. In International conference on automatic face and gesture recognition.
  14. Crosswhite, N., Byrne, J., Stauffer, C., Parkhi, O., Cao, Q., & Zisserman, A. (2018). Template adaptation for face verification and identification. Image and Vision Computing, 79, 35–48.
  15. Eigen, D., & Fergus, R. (2015). Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture. In Proceedings of international conference on computer vision, pp. 2650–2658.
  16. Ferrari, C., Lisanti, G., Berretti, S., & Del Bimbo, A. (2017). Investigating nuisance factors in face recognition with DCNN representation. In Proceedings of conference on computer vision pattern recognition workshops.
  17. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). Generative adversarial nets. In Neural information processing systems.
  18. Günther, M., Rozsa, A., & Boult, T. E. (2017). AFFACT: Alignment-free facial attribute classification technique. In International conference on automatic face and gesture recognition.
  19. Guo, Y., Zhang, L., Hu, Y., He, X., & Gao, J. (2016). MS-Celeb-1M: Challenge of recognizing one million celebrities in the real world. In Electronic imaging, Vol. 11.
  20. Hassner, T. (2013). Viewing real-world faces in 3d. In Proceedings of international conference on computer vision, pp. 3607–3614.
  21. Hassner, T., Assif, L., & Wolf, L. (2014). When standard RANSAC is not enough: Cross-media visual matching with hypothesis relevancy. Machine Vision and Applications, 25(4), 971–983. www.openu.ac.il/home/hassner/projects/poses.
  22. Hassner, T., Harel, S., Paz, E., & Enbar, R. (2015). Effective face frontalization in unconstrained images. In Proceedings of international conference on computer vision recognition.
  23. Hassner, T., Masi, I., Kim, J., Choi, J., Harel, S., Natarajan, P., & Medioni, G. (2016). Pooling faces: Template based face recognition with pooled face images. In Proceedings of international conference on computer vision recognition workshops.
  24. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of international conference on computer vision pattern recognition.
  25. Hu, J., Lu, J., & Tan, Y. P. (2014a). Discriminative deep metric learning for face verification in the wild. In Proceedings of international conference on computer vision pattern recognition, pp. 1875–1882.
  26. Hu, J., Lu, J., Yuan, J., & Tan, Y. P. (2014b). Large margin multi-metric learning for face and kinship verification in the wild. In Asian conference on computer vision, Springer, pp. 252–267.
  27. Huang, G. B., Ramesh, M., Berg, T., & Learned-Miller, E. (2007). Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical report 07-49, UMass, Amherst.
  28. Hughes, J. F., Van Dam, A., Foley, J. D., & Feiner, S. K. (2014). Computer graphics: Principles and practice. London: Pearson Education.
  29. Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., & Darrell, T. (2014). Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093.
  30. Kemelmacher-Shlizerman, I., Suwajanakorn, S., & Seitz, S. M. (2014). Illumination-aware age progression. In Proceedings of IEEE conference on computer vision pattern recognition, pp. 3334–3341.
  31. Kemelmacher-Shlizerman, I., Seitz, S. M., Miller, D., & Brossard, E. (2016). The MegaFace benchmark: 1 million faces for recognition at scale. In Proceedings of international conference on computer vision pattern recognition.
  32. Kim, K., Yang, Z., Masi, I., Nevatia, R., & Medioni, G. (2018). Face and body association for video-based face recognition. In Winter conference on application of computer vision, pp. 39–48.
  33. Klare, B. F., Klein, B., Taborsky, E., Blanton, A., Cheney, J., Allen, K., Grother, P., Mah, A., Burge, M., & Jain, A. K. (2015). Pushing the frontiers of unconstrained face detection and recognition: IARPA Janus Benchmark A. In Proceedings of international conference on computer vision pattern recognition, pp. 1931–1939.
  34. Klontz, J., Klare, B., Klum, S., Taborsky, E., Burge, M., & Jain, A. K. (2013). Open source biometric recognition. In International conference on biometrics: Theory, applications and systems.
  35. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Neural information processing systems, pp. 1097–1105.
  36. Lawrence, S., Giles, C. L., Tsoi, A. C., & Back, A. D. (1997). Face recognition: A convolutional neural-network approach. Transactions on Neural Networks, 8(1), 98–113.
  37. Levi, G., & Hassner, T. (2015). Age and gender classification using convolutional neural networks. In Proceedings of international conference on computer vision pattern recognition workshops. http://www.openu.ac.il/home/hassner/projects/cnn_agegender.
  38. Li, H., Hua, G., Shen, X., Lin, Z., & Brandt, J. (2014). Eigen-pep for video face recognition. In Asian conference on computer vision, Springer, pp. 17–33.
  39. Liu, W., Wen, Y., Yu, Z., Li, M., Raj, B., & Song, L. (2017a). Sphereface: Deep hypersphere embedding for face recognition. In Proceedings of international conference on computer vision pattern recognition.
  40. Liu, X., Li, S., Kan, M., Shan, S., & Chen, X. (2017b). Self-error-correcting convolutional neural network for learning with noisy labels. In International IEEE conference on automatic face and gesture recognition, pp. 111–117.
  41. Masi, I., Lisanti, G., Bagdanov, A., Pala, P., & Del Bimbo, A. (2013). Using 3D models to recognize 2D faces in the wild. In Proceedings of international conference on computer vision pattern recognition workshops.
  42. Masi, I., Rawls, S., Medioni, G., & Natarajan, P. (2016a). Pose-Aware Face Recognition in the Wild. In Proceedings of international conference on computer vision pattern recognition.
  43. Masi, I., Trần, A. T., Hassner, T., Leksut, J. T., & Medioni, G. (2016b). Do we really need to collect millions of faces for effective face recognition? In European conference on computer vision, Springer, pp. 579–596.
  44. Masi, I., Hassner, T., Trần, A. T., & Medioni, G. (2017). Rapid synthesis of massive face sets for improved face recognition. In International conference on automatic face and gesture recognition.
  45. Masi, I., Chang, F. J., Choi, J., Harel, S., Kim, J., Kim, K., et al. (2018). Learning pose-aware models for pose-invariant face recognition in the wild. Transactions on Pattern Analysis and Machine Intelligence, 99, 1–1.
  46. McLaughlin, N., Martinez Del Rincon, J., & Miller, P. (2015). Data-augmentation for reducing dataset bias in person re-identification. In International conference on advanced video and signal based surveillance, IEEE.
  47. Mokhayeri, F., Granger, E., & Bilodeau, G. A. (2018). Domain-specific face synthesis for video face recognition from a single sample per person. arXiv preprint arXiv:1801.01974.
  48. Neves, J., & Proença, H. (2019). “A leopard cannot change its spots”: Improving face recognition using 3d-based caricatures. IEEE Transactions on Information Forensics and Security, 14(1), 151–161.
  49. Nguyen, M. H., Lalonde, J. F., Efros, A. A., & De la Torre, F. (2008). Image-based shaving. Computer Graphics Forum, 27(2), 627–635.
  50. Nirkin, Y., Masi, I., Tran, A., Hassner, T., & Medioni, G. (2018). On face segmentation, face swapping, and face perception. In International conference on automatic face and gesture recognition.
  51. Parkhi, O. M., Simonyan, K., Vedaldi, A., & Zisserman, A. (2014). A compact and discriminative face track descriptor. In Proceedings of international conference on computer vision pattern recognition.
  52. Parkhi, O. M., Vedaldi, A., & Zisserman, A. (2015). Deep face recognition. In Proceedings of British machine vision conference.
  53. Paysan, P., Knothe, R., Amberg, B., Romdhani, S., & Vetter, T. (2009). A 3d face model for pose and illumination invariant face recognition. In Sixth IEEE international conference on advanced video and signal based surveillance, AVSS ’09, pp. 296–301.
  54. Ranjan, R., Sankaranarayanan, S., Castillo, C. D., & Chellappa, R. (2017). An all-in-one convolutional neural network for face analysis. In International IEEE conference on automatic face and gesture recognition, pp. 17–24.
  55. Ranjan, R., Bansal, A., Zheng, J., Xu, H., Gleason, J., Lu, B., Nanduri, A., Chen, J., Castillo, C. D., & Chellappa, R. (2018). A fast and accurate system for face detection, identification, and verification. CoRR arXiv:1809.07586.
  56. Rashedi, E., Barati, E., Nokleby, M., & Chen, X. (2019). “Stream loss”: Convnet learning for face verification using unlabeled videos in the wild. Neurocomputing, 329, 311–319.
  57. Sánchez, J., Perronnin, F., Mensink, T., & Verbeek, J. (2013). Image classification with the fisher vector: Theory and practice. International Journal of Computer Vision, 105(3), 222–245.
  58. Sankaranarayanan, S., Alavi, A., Castillo, C., & Chellappa, R. (2016a). Triplet probabilistic embedding for face verification and clustering. In International conference on biometrics: Theory, applications and systems.
  59. Sankaranarayanan, S., Alavi, A., & Chellappa, R. (2016b). Triplet similarity embedding for face verification. arXiv preprint arXiv:1602.03418.
  60. Schroff, F., Kalenichenko, D., & Philbin, J. (2015). Facenet: A unified embedding for face recognition and clustering. In Proceedings of conference on computer vision pattern recognition, pp. 815–823.
  61. Shen, Y., Luo, P., Yan, J., Wang, X., & Tang, X. (2018). Faceid-gan: Learning a symmetry three-player gan for identity-preserving face synthesis. In Proceedings of conference on computer vision pattern recognition, pp. 821–830.
  62. Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In International conference on learning representations.
  63. Sun, Y., Chen, Y., Wang, X., & Tang, X. (2014a). Deep learning face representation by joint identification-verification. In Neural information processing systems, pp. 1988–1996.
  64. Sun, Y., Wang, X., & Tang, X. (2014b). Deep learning face representation from predicting 10,000 classes. In Proceedings of conference on computer vision pattern recognition, IEEE.
  65. Sun, Y., Wang, X., & Tang, X. (2014c). Deeply learned face representations are sparse, selective, and robust. arXiv preprint arXiv:1412.1265.
  66. Sun, Y., Liang, D., Wang, X., & Tang, X. (2015). Deepid3: Face recognition with very deep neural networks. arXiv preprint arXiv:1502.00873.
  67. Szeliski, R. (2010). Computer vision: algorithms and applications. Berlin: Springer.
  68. Taigman, Y., Yang, M., Ranzato, M., & Wolf, L. (2014). Deepface: Closing the gap to human-level performance in face verification. In Proceedings IEEE conference on computer vision pattern recognition, pp. 1701–1708.
  69. Taigman, Y., Yang, M., Ranzato, M., & Wolf, L. (2015). Web-scale training for face identification. In Proceedings of conference computer vision pattern recognition.
  70. Tran, A. T., Hassner, T., Masi, I., & Medioni, G. (2017). Regressing robust and discriminative 3d morphable models with a very deep neural network. In Proceedings of conference on computer vision pattern recognition.
  71. Tran, A. T., Hassner, T., Masi, I., Paz, E., Nirkin, Y., & Medioni, G. (2018). Extreme 3d face reconstruction: Seeing through occlusions. In Proceedings of conference on computer vision pattern recognition.
  72. Wang, D., Otto, C., & Jain, A. K. (2015). Face search at scale: 80 million gallery. arXiv preprint arXiv:1507.07242.
  73. Wen, Y., Zhang, K., Li, Z., & Qiao, Y. (2016). A discriminative feature learning approach for deep face recognition. In European conference computer vision, Springer, pp. 499–515.
  74. Wen, Y., Zhang, K., Li, Z., & Qiao, Y. (2019). A comprehensive study on center loss for deep face recognition. International Journal of Computer Vision. https://doi.org/10.1007/s11263-018-01142-4.
  75. Whitelam, C., Taborsky, E., Blanton, A., Maze, B., Adams, J., Miller, T., Kalka, N., Jain, A. K., Duncan, J. A., Allen, K., Cheney, J., & Grother, P. (2017). Iarpa janus benchmark-b face dataset. In Proceedings of conference on computer vision pattern recognition workshops.
  76. Wolf, L., Hassner, T., & Maoz, I. (2011a). Face recognition in unconstrained videos with matched background similarity. In Proceedings of IEEE conference on computer vision pattern recognition, pp. 529–534.
  77. Wolf, L., Hassner, T., & Taigman, Y. (2011b). Effective unconstrained face recognition by combining multiple descriptors and learned background statistics. Transactions on Pattern Analysis and Machine Intelligence, 33(10), 1978–1990.
  78. Xie, S., & Tu, Z. (2015). Holistically-nested edge detection. In Proceedings of international conference on computer vision.
  79. Xie, S., Yang, T., Wang, X., & Lin, Y. (2015). Hyper-class augmented and regularized deep learning for fine-grained image classification. In Proceedings of conference on computer vision pattern recognition, pp. 2645–2654.
  80. Xie, L., Wang, J., Wei, Z., Wang, M., & Tian, Q. (2016). DisturbLabel: Regularizing CNN on the loss layer. In Proceedings of conference computer vision pattern recognition.
  81. Xiong, X., & De la Torre, F. (2013). Supervised descent method and its applications to face alignment. In Proceedings of conference on computer vision pattern recognition, IEEE.
  82. Xu, Z., Huang, S., Zhang, Y., & Tao, D. (2015). Augmenting strong supervision using web data for fine-grained categorization. In Proceedings of international conference on computer vision, pp. 2524–2532.
  83. Yager, N., & Dunstone, T. (2010). The biometric menagerie. Transactions on Pattern Analysis and Machine Intelligence, 32(2), 220–230.
  84. Yang, H., & Patras, I. (2015). Mirror, mirror on the wall, tell me, is the error small? In Proceedings of conference on computer vision pattern recognition.
  85. Yang, Z., & Nevatia, R. (2016). A multi-scale cascade fully convolutional network face detector. In ICPR, pp. 633–638.
  86. Yang, J., Ren, P., Zhang, D., Chen, D., Wen, F., Li, H., & Hua, G. (2017). Neural aggregation network for video face recognition. In Proceedings of conference on computer vision pattern recognition.
  87. Yi, D., Lei, Z., Liao, S., & Li, S. Z. (2014). Learning face representation from scratch. arXiv preprint arXiv:1411.7923. Available: http://www.cbsr.ia.ac.cn/english/CASIA-WebFace-Database.html.
  88. Yin, X., Yu, X., Sohn, K., Liu, X., & Chandraker, M. (2018). Feature transfer learning for deep face recognition with long-tail data. CoRR arXiv:1803.09014.
  89. Zhao, J., Xiong, L., Karlekar Jayashree, P., Li, J., Zhao, F., Wang, Z., Sugiri Pranata, P., Shengmei Shen, P., Yan, S., & Feng, J. (2017). Dual-agent GANs for photorealistic and identity preserving profile face synthesis. In Neural information processing systems, pp. 66–76.
  90. Zheng, Z., Zheng, L., & Yang, Y. (2017). Unlabeled samples generated by GAN improve the person re-identification baseline in vitro. In Proceedings of international conference on computer vision.
  91. Zhou, E., Cao, Z., & Yin, Q. (2015). Naive-deep face recognition: Touching the limit of LFW benchmark or not? arXiv preprint arXiv:1501.04690.
  92. Zhu, X., Lei, Z., Liu, X., Shi, H., & Li, S. (2016). Face alignment across large poses: A 3d solution. In Proceedings of IEEE computer vision and pattern recognition, Las Vegas, NV.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Iacopo Masi (1), corresponding author
  • Anh Tuấn Trần (2)
  • Tal Hassner (3)
  • Gozde Sahin (2)
  • Gérard Medioni (2)
  1. Information Sciences Institute (ISI), USC, Marina Del Rey, USA
  2. Institute for Robotics and Intelligent Systems, USC, Los Angeles, USA
  3. Open University of Israel, Ra’anana, Israel
