Abstract
In this work, we address occlusion- and low-resolution-robust facial gender classification. Inspired by trainable attention models built on deep architectures, and by the fact that the periocular region has proven to be the most salient region for gender classification, we design a progressive convolutional neural network training paradigm that enforces an attention shift during learning. The goal is to let the network attend to particular high-profile regions (e.g., the periocular region) without changing the network architecture itself. The network benefits from this attention shift and becomes more robust to occlusions and low-resolution degradations. With the progressively trained attention-shift convolutional neural network (PTAS-CNN) models, we achieve better gender classification results on the large-scale PCSO mugshot database of 400K images under occlusion and low-resolution settings than a network trained in the traditional way. In addition, the progressively trained network generalizes well: it is robust to occlusions of arbitrary type at arbitrary locations, as well as to low resolution. One way to further improve the robustness of the proposed gender classification algorithm is to invoke a generative approach to occluded-image recovery, such as deep convolutional generative adversarial networks (DCGAN). The facial occlusion degradation studied in this work is a missing-data challenge: for the occlusion problems, the missing-data locations are known, whether they are randomly scattered or contiguous. We show, on the PCSO mugshot database, that a deep generative model can very effectively recover such degradations, and that the recovered images yield significant performance improvements on gender classification tasks.
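As a concrete illustration of the two known-location missing-data patterns described above (randomly scattered pixels and a contiguous block), the masks can be sketched in NumPy; in the generative-recovery setting, such a mask is what marks the known missing-data locations. The image size, occlusion fraction, and block placement here are illustrative assumptions, not the chapter's exact protocol:

```python
import numpy as np

def scattered_mask(h, w, frac=0.25, seed=0):
    """Boolean mask with roughly `frac` of pixels marked missing at random."""
    rng = np.random.default_rng(seed)
    return rng.random((h, w)) < frac

def contiguous_mask(h, w, box_h, box_w, top, left):
    """Boolean mask with one contiguous missing block (e.g., an eye-region occluder)."""
    m = np.zeros((h, w), dtype=bool)
    m[top:top + box_h, left:left + box_w] = True
    return m

def occlude(img, mask, fill=0.0):
    """Apply a missing-data mask to a grayscale image in [0, 1]."""
    out = img.copy()
    out[mask] = fill
    return out

# Toy 64x64 "face" with both degradation types applied.
img = np.full((64, 64), 0.5)
m_scatter = scattered_mask(64, 64, frac=0.25)
m_block = contiguous_mask(64, 64, box_h=12, box_w=40, top=20, left=12)
occluded = occlude(img, m_scatter | m_block)
```

The mask union means the recovery model only ever sees pixels outside `m_scatter | m_block`, which is exactly the known-missing-data assumption stated in the abstract.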
Notes
- 1. A note on the legend: (1) symbols \(\mathcal {M}\) denote the trained models, with \(\mathcal {M}_F\) the model trained on full-face images (equivalent to \(\mathcal {M}_0\)), \(\mathcal {M}_P\) the one trained on periocular images only, and \(\mathcal {M}_k\), \(k \in \{1,\ldots ,6\}\), the incrementally trained models. (2) The tabular results report model performance on the original images in column 1 and on corrupted images in the remaining columns.
- 2. This setting can also model the dead-pixel/shot noise of a sensor, and the results can be used to accelerate inline gender detection on partially demosaiced images.
- 3. The effective resolution at a \(16\times\) zooming factor is around \(10 \times 13\) pixels, which is a quite challenging low-resolution setting.
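The effective resolution above follows from integer downsampling: assuming an input face crop of about \(160 \times 208\) pixels (an illustrative crop size, not stated in the note), a \(16\times\) factor leaves \(10 \times 13\) effective pixels. A minimal sketch of this degradation by block-averaging, then nearest-neighbor upsampling back to the classifier's input size:

```python
import numpy as np

def degrade(img, factor):
    """Simulate low resolution: block-average downsample by `factor`,
    then nearest-neighbor upsample back to the original size."""
    h, w = img.shape
    hh, ww = h // factor, w // factor
    # Each low-res pixel is the mean of a factor x factor block.
    small = img[:hh * factor, :ww * factor].reshape(hh, factor, ww, factor).mean(axis=(1, 3))
    # Nearest-neighbor upsample so the classifier sees a full-size input.
    restored = np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)
    return small, restored

face = np.linspace(0.0, 1.0, 160 * 208).reshape(160, 208)
small, restored = degrade(face, 16)
print(small.shape)  # (10, 13)
```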
© 2017 Springer International Publishing AG
Cite this chapter
Juefei-Xu, F., Verma, E., Savvides, M. (2017). DeepGender2: A Generative Approach Toward Occlusion and Low-Resolution Robust Facial Gender Classification via Progressively Trained Attention Shift Convolutional Neural Networks (PTAS-CNN) and Deep Convolutional Generative Adversarial Networks (DCGAN). In: Bhanu, B., Kumar, A. (eds) Deep Learning for Biometrics. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-319-61657-5_8
Print ISBN: 978-3-319-61656-8
Online ISBN: 978-3-319-61657-5