Abstract
In this work, we address occlusion- and low-resolution-robust facial gender classification. Inspired by trainable attention models built on deep architectures, and by the fact that the periocular region has proven to be the most salient region for gender classification, we design a progressive convolutional neural network training paradigm that enforces an attention shift during learning. The goal is to let the network attend to particular high-profile regions (e.g., the periocular region) without changing the network architecture itself. The network benefits from this attention shift and becomes more robust to occlusions and low-resolution degradations. With the progressively trained attention-shift convolutional neural network (PTAS-CNN) models, we achieve better gender classification results on the large-scale PCSO mugshot database of 400K images under occlusion and low-resolution settings than a network trained in the traditional way. In addition, the progressively trained network generalizes well: it is robust to occlusions of arbitrary type at arbitrary locations, as well as to low resolution. One way to further improve the robustness of the proposed gender classification algorithm is to invoke a generative approach to occluded-image recovery, such as deep convolutional generative adversarial networks (DCGAN). The facial occlusion degradation studied in this work is a missing-data challenge: for the occlusion problems, the missing-data locations are known, whether they are randomly scattered or contiguous. We show, on the PCSO mugshot database, that a deep generative model can very effectively recover such degradations, and that the recovered images yield significant performance improvements on gender classification tasks.
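As a concrete illustration of the two known-location missing-data patterns described above (randomly scattered pixels and a contiguous block), the masks can be sketched in NumPy; in the generative-recovery setting, such a mask is what marks the known missing-data locations. The image size, occlusion fraction, and block placement here are illustrative assumptions, not the chapter's exact protocol:

```python
import numpy as np

def scattered_mask(h, w, frac=0.25, seed=0):
    """Boolean mask with roughly `frac` of pixels marked missing at random."""
    rng = np.random.default_rng(seed)
    return rng.random((h, w)) < frac

def contiguous_mask(h, w, box_h, box_w, top, left):
    """Boolean mask with one contiguous missing block (e.g., an eye-region occluder)."""
    m = np.zeros((h, w), dtype=bool)
    m[top:top + box_h, left:left + box_w] = True
    return m

def occlude(img, mask, fill=0.0):
    """Apply a missing-data mask to a grayscale image in [0, 1]."""
    out = img.copy()
    out[mask] = fill
    return out

# Toy 64x64 "face" with both degradation types applied.
img = np.full((64, 64), 0.5)
m_scatter = scattered_mask(64, 64, frac=0.25)
m_block = contiguous_mask(64, 64, box_h=12, box_w=40, top=20, left=12)
occluded = occlude(img, m_scatter | m_block)
```

The mask union means the recovery model only ever sees pixels outside `m_scatter | m_block`, which is exactly the known-missing-data assumption stated in the abstract.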
Notes
- 1. A note on the legend: (1) symbols \(\mathcal {M}\) denote the trained models, with \(\mathcal {M}_F\) the model trained on full-face images (equivalent to \(\mathcal {M}_0\)), \(\mathcal {M}_P\) the one trained on periocular images only, and \(\mathcal {M}_k\), \(k \in \{1,\ldots ,6\}\), the incrementally trained models. (2) The tabular results report model performance on the original images in column 1 and on corrupted images in the remaining columns.
- 2. This setting can also model the dead-pixel/shot noise of a sensor, and the results can be used to accelerate inline gender detection on partially demosaiced images.
- 3. The effective resolution at a \(16\times\) zooming factor is around \(10 \times 13\) pixels, which is a quite challenging low-resolution setting.
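The effective resolution above follows from integer downsampling: assuming an input face crop of about \(160 \times 208\) pixels (an illustrative crop size, not stated in the note), a \(16\times\) factor leaves \(10 \times 13\) effective pixels. A minimal sketch of this degradation by block-averaging, then nearest-neighbor upsampling back to the classifier's input size:

```python
import numpy as np

def degrade(img, factor):
    """Simulate low resolution: block-average downsample by `factor`,
    then nearest-neighbor upsample back to the original size."""
    h, w = img.shape
    hh, ww = h // factor, w // factor
    # Each low-res pixel is the mean of a factor x factor block.
    small = img[:hh * factor, :ww * factor].reshape(hh, factor, ww, factor).mean(axis=(1, 3))
    # Nearest-neighbor upsample so the classifier sees a full-size input.
    restored = np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)
    return small, restored

face = np.linspace(0.0, 1.0, 160 * 208).reshape(160, 208)
small, restored = degrade(face, 16)
print(small.shape)  # (10, 13)
```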
© 2017 Springer International Publishing AG
Cite this chapter
Juefei-Xu, F., Verma, E., Savvides, M. (2017). DeepGender2: A Generative Approach Toward Occlusion and Low-Resolution Robust Facial Gender Classification via Progressively Trained Attention Shift Convolutional Neural Networks (PTAS-CNN) and Deep Convolutional Generative Adversarial Networks (DCGAN). In: Bhanu, B., Kumar, A. (eds) Deep Learning for Biometrics. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-319-61657-5_8
Print ISBN: 978-3-319-61656-8
Online ISBN: 978-3-319-61657-5