Dependable Person Recognition by Means of Local Descriptors of Dynamic Facial Features

  • Aniello Castiglione
  • Giampiero Grazioli
  • Simone Iengo
  • Michele Nappi
  • Stefano Ricciardi
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1123)

Abstract

This work proposes a complementary approach that adds a dynamic component to face biometrics. The dynamic appearance and the time-dependent local features that characterize an individual's face during speech utterance are considered in both their spatial and temporal components. The aim is to capture, represent and compare facial patterns related to speech utterance, improving the dependability of the biometric system through a descriptor that is intrinsically difficult to forge. The proposed approach applies the concept of dynamic texture to person identification: dynamic facial patterns are modeled by means of the Volume Local Binary Pattern (VLBP) descriptor, which combines local features and motion. To improve the efficiency of this technique, only the occurrences of Local Binary Patterns on Three Orthogonal Planes (LBP-TOP) are considered. A deep feed-forward network has been trained and optimized on video samples from the XM2VTS database in which subjects utter a given sentence. The results of the recognition task on test video sequences confirm that the proposed approach achieves state-of-the-art accuracy and robustness of identification.
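
As an illustration of the LBP-TOP idea mentioned in the abstract, the following minimal sketch computes one normalised 256-bin LBP histogram for each of the three orthogonal planes (XY, XT, YT) of a small grey-level video volume and concatenates them. It is not the authors' implementation. The radius-1, 8-neighbour square sampling, the `>=` thresholding, the single global histogram per plane (no face-region partition) and the names `lbp_code` and `lbp_top_histogram` are all assumptions made for this example; the abstract does not state the parameters actually used.

```python
# Neighbour offsets (d_row, d_col) for a radius-1, 8-point LBP.
# Assumption: square neighbourhood, no circular interpolation.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

# In-plane axes of each orthogonal plane; volume axes are (t, y, x).
PLANES = {"XY": (1, 2), "XT": (0, 2), "YT": (0, 1)}


def lbp_code(volume, pos, axes):
    """Return the 8-bit LBP code of voxel `pos` within the plane spanned by `axes`."""
    center = volume[pos[0]][pos[1]][pos[2]]
    code = 0
    for bit, (da, db) in enumerate(OFFSETS):
        q = list(pos)
        q[axes[0]] += da
        q[axes[1]] += db
        # A neighbour at least as bright as the centre sets its bit.
        if volume[q[0]][q[1]][q[2]] >= center:
            code |= 1 << bit
    return code


def lbp_top_histogram(volume):
    """Concatenate the normalised XY, XT and YT LBP histograms (3 x 256 values).

    `volume[t][y][x]` is the grey level of pixel (x, y) in frame t.
    Only voxels with a full neighbourhood in all three planes are counted.
    """
    n_t, n_y, n_x = len(volume), len(volume[0]), len(volume[0][0])
    if min(n_t, n_y, n_x) < 3:
        raise ValueError("each dimension needs at least 3 samples")
    n_voxels = (n_t - 2) * (n_y - 2) * (n_x - 2)
    feature = []
    for axes in PLANES.values():
        hist = [0] * 256
        for t in range(1, n_t - 1):
            for y in range(1, n_y - 1):
                for x in range(1, n_x - 1):
                    hist[lbp_code(volume, (t, y, x), axes)] += 1
        feature.extend(count / n_voxels for count in hist)
    return feature
```

Calling `lbp_top_histogram` on a cropped grey-level clip (a list of frames, each a list of rows) returns a 768-value vector. In a pipeline like the one described above, such a vector could serve as the input to a classifier. A practical implementation would normally use an array library rather than nested Python loops.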

Keywords

Biometrics, Face recognition, Image analysis, Face biometrics, Dependability

Notes

Acknowledgments

We gratefully acknowledge the work done by D. Iengo and D. Vanore in implementing and testing the proposed architecture. This work has been partially supported by the Italian National Research Project PRIN 2015 (201548C5NT), entitled “COntactlesS Multibiometric mObile System in the wild: COSMOS”.

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Department of Science and Technology, University of Naples Parthenope, Naples, Italy
  2. Department of Computer Science, University of Salerno, Fisciano, Italy
  3. Department of Biosciences, University of Molise, Campobasso, Italy
