Abstract
How to improve the social interaction ability of children with autism spectrum disorder (ASD) has long been a challenge for researchers and therapists. Recent research indicates that computer-assisted approaches may be effective in addressing this issue. This study aimed to understand children’s behaviors and then provide appropriate support to improve their social interaction ability. We built an intelligent system in which a child can freely play interactive social skills games with virtual characters. The virtual characters adjust their own behaviors by adapting to the child’s cognitive state (e.g., focus of attention) and affective state (e.g., happiness or surprise). The child’s behavior is identified through recognition of social signals, which includes head pose and eye gaze estimation, gesture detection, and affective state detection, supported by a series of algorithms proposed in this study. Furthermore, the system operates nonintrusively, using a novel multicamera surveillance approach, so that the child can interact with it naturally. Experimental results indicate that our system can accurately estimate a user’s head pose and detect the user’s eye gaze with a correctness rate of 96%. An expression recognition test was performed on the CK+ database and on live videos, with recognition rates of 91.5% and 87.3%, respectively. These results suggest that the proposed methods have strong potential as alternative means of sensing human behavior and providing appropriate support for children with ASD.
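The abstract names head pose estimation as one of the recognized social signals. As an illustrative sketch only (not the chapter’s actual algorithm), head pose is often reported as Euler angles recovered from the rotation matrix that a pose solver produces; the decomposition below uses a common ZYX convention, and the function name and axis conventions are assumptions for illustration:

```python
import numpy as np

def rotation_to_euler(R):
    """Convert a 3x3 rotation matrix to (pitch, yaw, roll) in degrees.

    Uses a common ZYX (yaw-pitch-roll) decomposition found in many
    head-pose pipelines; axis conventions vary across systems.
    """
    sy = np.hypot(R[0, 0], R[1, 0])
    if sy > 1e-6:                       # non-degenerate case
        pitch = np.arctan2(-R[2, 0], sy)
        yaw = np.arctan2(R[1, 0], R[0, 0])
        roll = np.arctan2(R[2, 1], R[2, 2])
    else:                               # gimbal lock: pitch near +/-90 deg
        pitch = np.arctan2(-R[2, 0], sy)
        yaw = 0.0
        roll = np.arctan2(-R[1, 2], R[1, 1])
    return tuple(np.degrees([pitch, yaw, roll]))

# A 30-degree rotation about this convention's yaw axis:
theta = np.radians(30.0)
R_yaw = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
pitch, yaw, roll = rotation_to_euler(R_yaw)
print(f"yaw={yaw:.1f} deg")  # yaw=30.0 deg
```

In a full pipeline, the rotation matrix would typically come from a PnP solver fitting detected facial landmarks to a 3D face model; the conversion above is only the final reporting step.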
References
Ahn, B., Choi, D. G., Park, J., & Kweon, I. S. (2018). Real-time head pose estimation using multi-task deep neural network. Robotics and Autonomous Systems, 103, 1–12.
American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Washington, DC: American Psychiatric Association.
Asteriadis, S., Tzouveli, P., Karpouzis, K., & Kollias, S. (2009). Estimation of behavioral user state based on eye gaze and head pose—Application in an e-learning environment. Multimedia Tools and Applications, 41(3), 469–493.
Bardins, S., Poitschke, T., & Kohlbecher, S. (2008). Gaze-based interaction in various environments. In 1st ACM Workshop Vis. Netw. Beh. Anal., Vancouver, BC (pp. 47–54). https://doi.org/10.1145/1461893.1461903
Batista, J. C., Albiero, V., Bellon, O. R., & Silva, L. (2017). Aumpnet: Simultaneous action units detection and intensity estimation on multipose facial images using a single convolutional neural network. In 2017 12th IEEE international conference on automatic face & gesture recognition (FG 2017) (pp. 866–871). IEEE.
Callahan, E. H., Gillis, J. M., Romanczyk, R. G., & Mattson, R. E. (2011). The behavioral assessment of social interactions in young children: An examination of convergent and incremental validity. Research in Autism Spectrum Disorders, 5(2), 768–774.
Chen, J., Chen, D., Li, X., & Zhang, K. (2014). Towards improving social communication skills with multimodal sensory information. IEEE Transactions on Industrial Informatics, 10(1), 323–330.
Chen, J., Luo, N., Liu, Y., Liu, L., Zhang, K., & Kolodziej, J. (2016). A hybrid intelligence-aided approach to affect-sensitive e-learning. Computing, 98(1–2), 215–233.
Chen, J., Wang, G., Zhang, K., Wang, G., & Liu, L. (2019). A pilot study on evaluating children with autism spectrum disorder using computer games. Computers in Human Behavior, 90, 204–214.
Duffield, T. C., Parsons, T. D., Landry, A., Karam, S., & Hall, T. A. (2017). Virtual environments as an assessment modality with pediatric ASD populations: A brief report. Child Neuropsychology, 24(8), 1–8.
Ekman, P., & Friesen, W. V. (1978). Facial action coding system: Investigator’s guide. Palo Alto, CA: Consulting Psychologists Press.
Foster, M. E., Avramides, K., Bernardini, S., Chen, J., Frauenberger, C., Lemon, O., et al. (2010). Supporting children’s social interaction ability through interactive narratives with virtual characters. In Proceedings of the international conference on multimedia, Florence, Italy (pp. 1111–1114). New York: ACM.
Ji, Q., & Zhu, Z. (2003). Non-intrusive eye and gaze tracking for natural human computer interaction. MMI-Interaktiv Journal, 6.
Jordan, R., & Jones, G. (1999). Meeting the needs of children with autistic spectrum disorder. London, U.K.: Fulton.
Jyoti, V., & Lahiri, U. (2020). Human-computer interaction based joint attention cues: Implications on functional and physiological measures for children with autism spectrum disorder. Computers in Human Behavior, 104, 106163. https://doi.org/10.1016/j.chb.2019.106163
Li, S., & Deng, W. (2018). Deep facial expression recognition: A survey. arXiv preprint arXiv:1804.08348.
Liao, C. T., Chuang, H. J., & Lai, S. H. (2012, March). Learning expression kernels for facial expression intensity estimation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 2217–2220). IEEE.
Liu, H. (2011). Exploring human hand capabilities into embedded multifingered object manipulation. IEEE Transactions on Industrial Informatics, 7(3), 389–398.
Liu, M., Li, S., Shan, S., Wang, R., & Chen, X. (2014). Deeply learning deformable facial action parts model for dynamic expression analysis. In Asian conference on computer vision (pp. 143–157). Cham, Switzerland: Springer.
Liu, Y., Wang, L., & Li, W. (2017). Emotion analysis based on facial expression recognition in virtual learning environment. International Journal of Computer and Communication Engineering, 6(1), 49.
Louwerse, A., Van der Geest, J. N., Tulen, J. H. M., et al. (2013). Effects of eye gaze directions of facial images on looking behaviour and autonomic responses in adolescents with autism spectrum disorders. Research in Autism Spectrum Disorders, 7(9), 1043–1053.
Lucey, P., Cohn, J. F., Kanade, T., Saragih, J., & Matthews, I. (2010). The extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression. In 2010 IEEE Computer Society conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Piscataway, NJ: IEEE.
Lucey, P., Cohn, J. F., Prkachin, K. M., & Solomon, P. E. (2011). Painful data: The UNBC-McMaster shoulder pain expression archive database. In IEEE international conference on automatic face & gesture recognition. New York: IEEE Computer Society.
Ranjan, R., Patel, V. M., & Chellappa, R. (2017). Hyperface: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence. New York: IEEE Computer Society.
Ruddarraju, R., & Essa, I. A. (2003). Fast multiple camera head pose tracking. In Proceedings 16th vision interface (p. 6). Halifax, Canada.
Shen, W. Y., & Lin, H. T. (2013). Active sampling of pairs and points for large-scale linear bipartite ranking. In Asian Conference on Machine Learning, Proceedings of Machine Learning Research, 29, 388–403.
Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
Wainer, A. L., & Ingersoll, B. R. (2011). The use of innovative computer technology for teaching social communication to individuals with autism spectrum disorders. Research in Autism Spectrum Disorders, 5(1), 96–107.
Xu, R., Chen, J., Han, J., Tan, L., & Xu, L. (2019). Towards emotion-sensitive learning cognitive state analysis of big data in education: Deep learning-based facial expression analysis using ordinal information. Computing, 1, 1–16.
Yang, P., Liu, Q., & Metaxas, D. N. (2009). RankBoost with l1 regularization for facial expression recognition and intensity estimation. In IEEE 12th international conference on computer vision, ICCV 2009, Kyoto, Japan, September 27–October 4, 2009. Piscataway, NJ: IEEE.
Yun, W. H., Lee, D., Park, C., & Kim, J. (2015). Automatic engagement level estimation of kids in a learning environment. International Journal of Machine Learning and Computing, 5(2), 148.
Zhang, K., Zhang, Z., Li, Z., & Qiao, Y. (2016). Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters, 23(10), 1499–1503.
Zhang, Z. (2000). A flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11), 1330–1334.
Zhao, X., Liang, X., Liu, L., Li, T., Han, Y., Vasconcelos, N., et al. (2016). Peak-piloted deep network for facial expression recognition. In European conference on computer vision (pp. 425–442). Cham, Switzerland: Springer.
Acknowledgments
This study builds on our previous work (Chen et al., 2014). We thank our colleagues at the National Engineering Research Center for E-Learning of Central China Normal University who participated in this study. We also appreciate the children and their teachers and families. This work was supported by the National Natural Science Foundation under Grant 61977027, the Hubei Province Technological Innovation Major Project under Grant 2019AAA044, and the Research Funds of CCNU from the Colleges’ Basic Research and Operation of MOE under Grants CCNU19Z02002 and CCNU18KFY02.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Chen, J., Wang, G., Zhang, K., Xu, R., Chen, D., Li, X. (2020). Toward Improving Social Interaction Ability for Children with Autism Spectrum Disorder Using Social Signals. In: Pinkwart, N., Liu, S. (eds) Artificial Intelligence Supported Educational Technologies. Advances in Analytics for Learning and Teaching. Springer, Cham. https://doi.org/10.1007/978-3-030-41099-5_9
Print ISBN: 978-3-030-41098-8
Online ISBN: 978-3-030-41099-5