Dependable Person Recognition by Means of Local Descriptors of Dynamic Facial Features

  • Aniello Castiglione
  • Giampiero Grazioli
  • Simone Iengo
  • Michele Nappi
  • Stefano Ricciardi (corresponding author)
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1123)


Abstract

This work proposes a complementary approach that adds a dynamic component to face biometrics. The dynamic appearance and the time-dependent local features that characterize an individual's face during speech utterance are considered in both their spatial and temporal components. The aim is to capture, represent and compare the facial patterns produced by speech utterance, improving the dependability of the biometric system through a descriptor that is intrinsically difficult to forge. The approach applies the concept of dynamic texture to person identification: dynamic facial patterns are modeled with the Volume Local Binary Pattern (VLBP) descriptor, which combines local features and movement. To improve efficiency, only the occurrences of Local Binary Patterns on Three Orthogonal Planes (LBP-TOP) are considered. A deep feed-forward network has been trained and optimized on video samples from the XM2VTS database in which subjects utter a given sentence. The results of the recognition task on test video sequences indicate that the proposed approach achieves state-of-the-art accuracy and robustness in identification.
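To make the LBP-TOP idea concrete, the following is a minimal, dependency-free sketch of how a descriptor of this kind can be computed from a grayscale clip. It is not the authors' implementation: the radius-1, 8-neighbour operator, the 256-bin histograms, and the names `lbp_code`, `plane_histogram` and `lbp_top` are assumptions made for illustration, and face or mouth localization and the feed-forward classifier are left out.

```python
# Illustrative LBP-TOP sketch (assumed parameters: radius 1, 8 neighbours).
# A clip is a nested list indexed as volume[t][y][x] with grayscale values.

# Neighbour offsets (row, column) on the 8-pixel ring, clockwise from top-left.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(plane, r, c):
    """Basic 8-bit LBP code of the interior pixel at (r, c) of a 2D plane."""
    center = plane[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # A neighbour at least as bright as the centre sets its bit.
        if plane[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def plane_histogram(planes):
    """Normalised 256-bin histogram of LBP codes over a list of 2D planes."""
    counts = [0] * 256
    for plane in planes:
        for r in range(1, len(plane) - 1):
            for c in range(1, len(plane[0]) - 1):
                counts[lbp_code(plane, r, c)] += 1
    total = sum(counts) or 1  # avoid division by zero on degenerate input
    return [n / total for n in counts]


def lbp_top(volume):
    """Concatenate the XY, XT and YT histograms of a clip (768 values)."""
    n_frames, height, width = len(volume), len(volume[0]), len(volume[0][0])
    # Only interior slices are used, so every coded voxel has all neighbours.
    xy = [volume[t] for t in range(1, n_frames - 1)]
    xt = [[volume[t][y] for t in range(n_frames)] for y in range(1, height - 1)]
    yt = [[[volume[t][y][x] for y in range(height)] for t in range(n_frames)]
          for x in range(1, width - 1)]
    return plane_histogram(xy) + plane_histogram(xt) + plane_histogram(yt)
```

The XY histogram summarizes spatial appearance, while the XT and YT histograms summarize how rows and columns change over time. Concatenating three 2D histograms keeps the descriptor at 3 × 256 bins, which is far smaller than a histogram over full volumetric patterns. In a pipeline like the one described above, a vector of this kind would be the input to the classifier.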


Keywords

Biometrics, Face recognition, Image analysis, Face biometrics, Dependability



Acknowledgments

We gratefully acknowledge the work done by D. Iengo and D. Vanore in implementing and testing the proposed architecture. This work has been partially supported by the Italian National Research Project PRIN 2015 (201548C5NT) entitled “COntactlesS Multibiometric mObile System in the wild: COSMOS”.



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Department of Science and Technology, University of Naples Parthenope, Naples, Italy
  2. Department of Computer Science, University of Salerno, Fisciano, Italy
  3. Department of Biosciences, University of Molise, Campobasso, Italy
