Conversational Interaction Recognition Based on Bodily and Facial Movement

Deng, Jingjing; Xie, Xianghua; Zhou, Shangming

doi:10.1007/978-3-319-11758-4_26

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8814)

Included in the following conference series:

International Conference Image Analysis and Recognition

Abstract

We examine whether 3D pose and face features can be used to both learn and recognize different conversational interactions. We believe this to be among the first works devoted to this subject, and we show that the task is indeed possible with a promising degree of accuracy using features derived from both pose and face. To extract 3D pose we use the Kinect sensor, and we use a combined local and global model to extract face features from ordinary RGB cameras. We show that although both types of feature are contaminated with noise, they can still be used to train classifiers effectively. The differences in interaction among the scenarios in our data set are extremely subtle. Both generative and discriminative methods are investigated, and a subject-specific supervised learning approach is employed to classify the testing sequences into seven different conversational scenarios.

Author information

Authors and Affiliations

Department of Computer Science, Swansea University, Swansea, UK
Jingjing Deng, Xianghua Xie & Shangming Zhou

Corresponding author

Correspondence to Xianghua Xie.

Editor information

Editors and Affiliations

Faculty of Engineering, University of Porto, Porto, Portugal
Aurélio Campilho
Dept. of Electrical and Computer Eng., University of Waterloo, Waterloo, Ontario, Canada
Mohamed Kamel

About this paper

Cite this paper

Deng, J., Xie, X., Zhou, S. (2014). Conversational Interaction Recognition Based on Bodily and Facial Movement. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2014. Lecture Notes in Computer Science, vol 8814. Springer, Cham. https://doi.org/10.1007/978-3-319-11758-4_26

DOI: https://doi.org/10.1007/978-3-319-11758-4_26
Published: 10 October 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11757-7
Online ISBN: 978-3-319-11758-4
eBook Packages: Computer Science, Computer Science (R0)
