Comparing Methods for Assessment of Facial Dynamics in Patients with Major Neurocognitive Disorders

  • Yaohui Wang
  • Antitza Dantcheva
  • Jean-Claude Broutart
  • Philippe Robert
  • Francois Bremond
  • Piotr Bilinski
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11134)


Assessing facial dynamics in patients with major neurocognitive disorders, and specifically with Alzheimer’s disease (AD), has proven to be highly challenging. Classically, such assessment is performed by clinical staff, who evaluate the verbal and non-verbal language of AD patients, since these patients have lost a substantial amount of their cognitive capacity and hence their communication ability. In addition, patients need to communicate important messages, such as discomfort or pain. Automated methods would support the current healthcare system by allowing for telemedicine, i.e., examination that is less costly and logistically less inconvenient. In this work we compare methods for assessing facial dynamics, namely talking, singing, neutral and smiling, in AD patients, captured during music mnemotherapy sessions. Specifically, we compare 3D ConvNets, Very Deep Neural Network based Two-Stream ConvNets, and Improved Dense Trajectories. We have adapted these methods from prominent action recognition methods, and our promising results suggest that they generalize well to the context of facial dynamics. The Two-Stream ConvNets in combination with ResNet-152 obtain the best performance on our dataset, capturing even minor facial dynamics well, and have thus sparked high interest in the medical community.
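As a rough illustration of how a two-stream model arrives at one of the four facial-dynamics classes, the sketch below fuses per-class scores from a spatial (appearance) stream and a temporal (optical-flow) stream by weighted averaging of their softmax outputs. The abstract does not state the fusion scheme used, so late fusion by weighted averaging, the `temporal_weight` parameter, the function names and the example logits are all assumptions made for illustration only.

```python
import math

# Class names taken from the abstract; the ordering is an arbitrary assumption.
CLASSES = ["talking", "singing", "neutral", "smiling"]


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_two_stream(spatial_logits, temporal_logits, temporal_weight=0.5):
    """Late fusion: weighted average of the two streams' softmax scores.

    temporal_weight is a hypothetical knob, not a value from the paper.
    """
    p_spatial = softmax(spatial_logits)
    p_temporal = softmax(temporal_logits)
    w = temporal_weight
    return [(1.0 - w) * ps + w * pt for ps, pt in zip(p_spatial, p_temporal)]


def predict(spatial_logits, temporal_logits, temporal_weight=0.5):
    """Return the winning class label and the fused probability vector."""
    fused = fuse_two_stream(spatial_logits, temporal_logits, temporal_weight)
    best = max(range(len(fused)), key=fused.__getitem__)
    return CLASSES[best], fused


if __name__ == "__main__":
    # Made-up logits: appearance favours "neutral", motion favours "talking".
    label, scores = predict([0.1, 0.1, 2.0, 0.1], [3.0, 0.1, 0.1, 0.1])
    print(label, [round(s, 3) for s in scores])
```

With equal weights, the more confident motion stream decides the label in this made-up example; setting `temporal_weight=0.0` falls back to the appearance stream alone.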


Keywords: Facial dynamics · Facial expressions · Neurocognitive disorders · Alzheimer’s disease



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Yaohui Wang (1, corresponding author)
  • Antitza Dantcheva (1, 3)
  • Jean-Claude Broutart (2)
  • Philippe Robert (3)
  • Francois Bremond (1, 3)
  • Piotr Bilinski (4)

  1. INRIA Sophia-Antipolis, STARS, Sophia Antipolis, France
  2. GSF Noisiez, Biot, France
  3. EA CoBTeK-University Cote d’Azur, Nice, France
  4. University of Oxford, Oxford, UK
