Abstract
Compared with traditional strategies such as information-theoretic learning (He et al. [19]), deep learning techniques have recently demonstrated their superiority in various computer vision tasks. A deep face representation is a compact and discriminative description of raw face data (e.g., face images) extracted by deep networks. It is a crucial step in a face recognition system, which is one of the fundamental tasks in face analysis. This chapter gives a brief introduction to basic deep learning concepts in face analysis and synthesis. (1) A coherent study of deep face representation is first presented. We start with a compact survey of CNN-based face recognition, followed by a discussion of unit activation functions in CNNs. Moreover, a concrete instance (namely Light CNN) is described in detail, serving as a representative CNN architecture for deep face representation. (2) As the most prominent deep generative models, the classical versions of GAN and VAE are briefly introduced. Additionally, we discuss their limitations as well as improved variants.
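The core activation of the Light CNN mentioned above is the Max-Feature-Map (MFM) operation of Wu et al. [48], which splits the channel dimension in half and keeps the element-wise maximum, acting as a competitive alternative to ReLU. A minimal numpy sketch is given below; it is illustrative only (the function name and single-image layout are our assumptions), not the authors' implementation.

```python
import numpy as np

def max_feature_map(x):
    """Max-Feature-Map (MFM) activation, as used in Light CNN.

    x: feature maps of shape (channels, height, width), channels even.
    Splits the channel axis into two halves and returns their
    element-wise maximum, halving the number of output channels.
    """
    c = x.shape[0]
    assert c % 2 == 0, "MFM requires an even number of channels"
    first_half, second_half = x[: c // 2], x[c // 2:]
    return np.maximum(first_half, second_half)
```

For example, applied to 2 input channels, MFM produces 1 output channel whose value at each spatial location is the larger of the two competing activations.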
Part of this chapter is reprinted from Wu et al. [48], with permission from IEEE.
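For the classical GAN of Goodfellow et al. [21], the generator and discriminator are trained on opposing binary cross-entropy objectives. The sketch below computes both losses from discriminator outputs, using the common non-saturating generator heuristic; the function name and interface are illustrative assumptions, not an API from the chapter.

```python
import numpy as np

def gan_losses(d_real, d_fake):
    """Loss terms for the classical GAN minimax game.

    d_real / d_fake: discriminator outputs in (0, 1) for real and
    generated samples. Returns (discriminator_loss, generator_loss),
    where the generator uses the non-saturating objective
    -log D(G(z)) instead of log(1 - D(G(z))).
    """
    eps = 1e-12  # numerical guard against log(0)
    d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
    g_loss = -np.mean(np.log(d_fake + eps))
    return d_loss, g_loss
```

At the game's equilibrium the discriminator outputs 0.5 everywhere, giving a discriminator loss of 2 log 2 and a generator loss of log 2.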
References
Amari, S.I.: Dynamics of pattern formation in lateral-inhibition type neural fields. Biol. Cybern. 27, 77–87 (1977)
Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein GAN (2017). arXiv:1701.07875
Bansal, A., Castillo, C., Ranjan, R., Chellappa, R.: The do’s and don’ts for cnn-based face verification. In: IEEE International Conference on Computer Vision Workshop (2017)
Bansal, A., Nanduri, A., Castillo, C., Ranjan, R., Chellappa, R.: UMDFaces: an annotated face dataset for training deep networks. In: International Joint Conference on Biometrics (2017)
Berthelot, D., Schumm, T., Metz, L.: BEGAN: boundary equilibrium generative adversarial networks (2017). arXiv:1703.10717
Best-Rowden, L., Han, H., Otto, C., Klare, B., Jain, A.K.: Unconstrained face recognition: identifying a person of interest from a media collection. IEEE Trans. Inf. Forensics Secur. 9(12), 2144–2157 (2014)
Cao, Q., Shen, L., Xie, W., Parkhi, O.M., Zisserman, A.: VGGFace2: a dataset for recognising faces across pose and age (2017). arXiv:1710.08092
Clevert, D., Unterthiner, T., Hochreiter, S.: Fast and accurate deep network learning by exponential linear units (elus). In: International Conference on Learning Representation (2016)
Denton, E.L., Chintala, S., Fergus, R. et al.: Deep generative image models using a laplacian pyramid of adversarial networks. In: NIPS, pp. 1486–1494 (2015)
Dinh, L., Sohl-Dickstein, J., Bengio, S.: Density estimation using real NVP. In: ICLR (2017)
Dosovitskiy, A., Brox, T.: Generating images with perceptual similarity metrics based on deep networks. In: NIPS, pp. 658–666 (2016)
Durugkar, I., Gemp, I., Mahadevan, S.: Generative multi-adversarial networks. In: ICLR (2017)
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: NIPS, pp. 2672–2680 (2014)
Goodfellow, I.J., Warde-Farley, D., Mirza, M., Courville, A.C., Bengio, Y.: Maxout networks. In: International Conference on Machine Learning (2013)
Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., Courville, A.C.: Improved training of wasserstein GANs. In: NIPS, pp. 5769–5779 (2017)
Guo, Y., Zhang, L., Hu, Y., He, X., Gao, J.: Ms-celeb-1m: a dataset and benchmark for large-scale face recognition. In: European Conference on Computer Vision (2016)
He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on imagenet classification. In: IEEE International Conference on Computer Vision (2015)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (2016)
He, R., Hu, B., Yuan, X., Wang, L.: Robust Recognition via Information Theoretic Learning. Springer, Berlin (2014)
Huang, H., Li, Z., He, R., Sun, Z., Tan, T.: IntroVAE: introspective variational autoencoders for photographic image synthesis. In: Advances in Neural Information Processing Systems, pp. 10236–10245 (2018)
Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality, stability, and variation. In: ICLR (2018)
Kingma, D.P., Salimans, T., Jozefowicz, R., Chen, X., Sutskever, I., Welling, M.: Improved variational inference with inverse autoregressive flow. In: NIPS, pp. 4743–4751 (2016)
Kingma, D.P., Welling, M.: Auto-encoding variational bayes. In: ICLR (2014)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems (2012)
Larsen, A.B.L., Sønderby, S.K., Larochelle, H., Winther, O.: Autoencoding beyond pixels using a learned similarity metric. In: ICML, pp. 1558–1566 (2016)
Lei, Z., Chu, R., He, R., Liao, S., Li, S.Z.: Face recognition by discriminant analysis with gabor tensor representation. In: International Conference on Biometrics (2007)
Li, Y., Swersky, K., Zemel, R.: Generative moment matching networks. In: ICML, pp. 1718–1727 (2015)
Lin, M., Chen, Q., Yan, S.: Network in network. In: International Conference on Learning Representation (2014)
Maas, A.L., Hannun, A.Y., Ng, A.Y.: Rectifier nonlinearities improve neural network acoustic models. In: International Conference on Machine Learning (2013)
Makhzani, A., Shlens, J., Jaitly, N., Goodfellow, I., Frey, B.: Adversarial autoencoders (2015). arXiv:1511.05644
Masi, I., Tran, A.T., Hassner, T., Leksut, J.T., Medioni, G.: Do we really need to collect millions of faces for effective face recognition? In: European Conference on Computer Vision (2016)
Nair, V., Hinton, G.E.: Rectified linear units improve restricted boltzmann machines. In: International Conference on Machine Learning (2010)
Nguyen, T., Le, T., Vu, H., Phung, D.: Dual discriminator generative adversarial nets. In: NIPS, pp. 2667–2677 (2017)
van den Oord, A., Kalchbrenner, N., Espeholt, L., Vinyals, O., Graves, A. et al.: Conditional image generation with pixelcnn decoders. In: NIPS, pp. 4790–4798 (2016)
Parkhi, O.M., Vedaldi, A., Zisserman, A.: Deep face recognition. In: British Machine Vision Conference (2015)
Redmon, J., Divvala, S.K., Girshick, R.B., Farhadi, A.: You only look once: unified, real-time object detection (2015). CoRR arXiv:1506.02640
Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., Chen, X.: Improved techniques for training GANs. In: NIPS, pp. 2234–2242 (2016)
Schroff, F., Kalenichenko, D., Philbin, J.: Facenet: a unified embedding for face recognition and clustering. In: IEEE Conference on Computer Vision and Pattern Recognition (2015)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition (2014). CoRR arXiv:1409.1556
Sønderby, C.K., Raiko, T., Maaløe, L., Sønderby, S.K., Winther, O.: Ladder variational autoencoders. In: NIPS, pp. 3738–3746 (2016)
Sun, Y., Chen, Y., Wang, X., Tang, X.: Deep learning face representation by joint identification-verification. In: Advances in Neural Information Processing Systems (2014)
Sun, Y., Wang, X., Tang, X.: Deeply learned face representations are sparse, selective, and robust. In: IEEE Conference on Computer Vision and Pattern Recognition (2015)
Sun, Z., Tan, T.: Ordinal measures for iris recognition. IEEE Trans. Pattern Anal. Mach. Intell. 31(12), 2211–2226 (2009)
Taigman, Y., Yang, M., Ranzato, M., Wolf, L.: Deepface: closing the gap to human-level performance in face verification. In: IEEE Conference on Computer Vision and Pattern Recognition (2014)
Taigman, Y., Yang, M., Ranzato, M.A., Wolf, L.: Web-scale training for face identification. In: IEEE Conference on Computer Vision and Pattern Recognition (2015)
van den Oord, A., Kalchbrenner, N., Kavukcuoglu, K.: Pixel recurrent neural networks. In: ICML, pp. 1747–1756 (2016)
Wang, T.C., Liu, M.Y., Zhu, J.Y., Tao, A., Kautz, J., Catanzaro, B.: High-resolution image synthesis and semantic manipulation with conditional GANs. In: CVPR (2018)
Wu, X., He, R., Sun, Z., Tan, T.: A light cnn for deep face representation with noisy labels. IEEE Trans. Inf. Forensics Secur. 13(11), 2884–2896 (2018)
Yi, D., Lei, Z., Liao, S., Li, S.Z.: Learning face representation from scratch (2014). CoRR arXiv:1411.7923
Zhang, H., Xu, T., Li, H., Zhang, S., Huang, X., Wang, X., Metaxas, D.: StackGAN: text to photo-realistic image synthesis with stacked generative adversarial networks. In: ICCV, pp. 5907–5915 (2017)
Zhang, H., Xu, T., Li, H., Zhang, S., Wang, X., Huang, X., Metaxas, D.: StackGAN++: realistic image synthesis with stacked generative adversarial networks (2017). arXiv:1710.10916v2
Zhang, Z., Xie, Y., Yang, L.: Photographic text-to-image synthesis with a hierarchically-nested adversarial network (2018). arXiv:1802.09178
Zhao, J., Mathieu, M., LeCun, Y.: Energy-based generative adversarial network. In: ICLR (2017)
© 2020 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Li, Y., Huang, H., He, R., Tan, T. (2020). Foundation. In: Heterogeneous Facial Analysis and Synthesis. SpringerBriefs in Computer Science. Springer, Singapore. https://doi.org/10.1007/978-981-13-9148-4_2
Print ISBN: 978-981-13-9147-7
Online ISBN: 978-981-13-9148-4