Voxel-based 3D face reconstruction and its application to face recognition using sequential deep learning

Abstract

This paper proposes a novel 3D face reconstruction technique together with a sequential deep learning framework for face recognition. The reconstruction operates on the voxels produced by voxelization and applies the reflection principle about the mid-face plane to generate the reconstructed points in 3D. The sequential deep learning framework then uses the reconstructed face to recognize gender, emotion, occlusion, and person identity. It combines variational autoencoders, bidirectional long short-term memory, and triplet loss training to extract and refine deep features from the reconstructed voxels, and a support vector machine is applied to these deep features for the final prediction. The proposed 3D face recognition system is compared with three well-known deep learning approaches on three occluded datasets. Experimental results show that the proposed technique is invariant to occlusion and facial expression. On the Bosphorus, UMBDB, and KinectFaceDB datasets respectively, it recognizes gender with accuracies of 97.28%, 92.12%, and 94.44%; emotion with 94.57%, 87.78%, and 89.95%; occlusion with 94.02%, 81.26%, and 89.85%; and person identity with 90.01%, 78.21%, and 85.68%. The proposed framework outperforms state-of-the-art approaches in both computational time and face recognition accuracy.
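
The reflection step described in the abstract can be illustrated with a small sketch. The preview does not give the paper's actual algorithm, so everything below is an assumption made for illustration: the mid-face plane is taken as already known and written as normal . x = offset, the voxel grid is axis-aligned with its origin at (0, 0, 0), and the function names (reflect_point, voxelize, complete_by_symmetry) are hypothetical rather than taken from the paper.

```python
import math
from typing import Iterable, List, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def reflect_point(p: Point, normal: Point, offset: float) -> Point:
    """Mirror p across the plane {x : normal . x = offset}."""
    nn = sum(c * c for c in normal)
    if nn == 0.0:
        raise ValueError("plane normal must be non-zero")
    # Signed distance from p to the plane, divided by |normal|.
    # Dividing by |normal|^2 here lets the normal be non-unit.
    t = (sum(a * b for a, b in zip(p, normal)) - offset) / nn
    return tuple(a - 2.0 * t * b for a, b in zip(p, normal))


def voxelize(points: Iterable[Point], voxel_size: float) -> Set[Voxel]:
    """Map each point to the integer index of the cubic voxel containing it."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    return {tuple(math.floor(c / voxel_size) for c in p) for p in points}


def complete_by_symmetry(
    points: Iterable[Point], normal: Point, offset: float, voxel_size: float
) -> Set[Voxel]:
    """Voxelize the observed points together with their mirror images.

    Assumption: the face is roughly symmetric about the given plane, so
    mirrored points from the visible half can stand in for occluded ones.
    """
    observed: List[Point] = list(points)
    mirrored = [reflect_point(p, normal, offset) for p in observed]
    return voxelize(observed + mirrored, voxel_size)
```

For example, with the plane x = 0 (normal (1, 0, 0), offset 0), reflect_point((1.0, 2.0, 3.0), (1.0, 0.0, 0.0), 0.0) gives (-1.0, 2.0, 3.0). How the mid-face plane is estimated and how the reconstructed voxels feed the recognition network are left to the full paper.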

Author information

Correspondence to Sahil Sharma.

About this article

Cite this article

Sharma, S., Kumar, V. Voxel-based 3D face reconstruction and its application to face recognition using sequential deep learning. Multimed Tools Appl (2020). https://doi.org/10.1007/s11042-020-08688-x

Keywords

  • Face reconstruction
  • Voxel
  • Sequential deep learning
  • Face recognition
  • Gender
  • Emotion
  • Occlusion