Abstract
One type of experiential learning in the medical domain is chat interactions with a virtual human. These virtual humans play the role of a patient and allow students to practice skills such as communication and empathy in a safe, but realistic sandbox. These interactions last 10–15 min, and the typical virtual human has approximately 200 responses. Part of the realism of the virtual human’s response is the associated animation. These animations can be time consuming to create and associate with each response.
We turned to crowdsourcing to assist with this problem. We decomposed the process of creating basic animations into a simple task that nonexpert workers can complete. We provided workers with a set of predefined basic animations: six focused on head animation and nine focused on body animation. These animations could be mixed and matched for each question/response pair. Then, we used this unsupervised process to create a machine learning model for animation prediction: one for head animation and one for body animation. Multiple models were evaluated and their performance was assessed.
In an experiment, we evaluated participant perception of multiple versions of a virtual human suffering from dyspepsia (heartburn-like symptoms). For the version of the virtual human that utilized our machine learning approach, participants rated the character’s animation on par with a commercial expert. Head animation specifically was rated more natural and typically expected than other versions. Additionally, analysis of time and cost show the machine learning approach to be quicker and cheaper than an expert alternative.
References
Adda G, Mariani J, Besacier L, Gelas H (2013) Economic and ethical background of crowdsourcing for speech. In: Crowdsourcing for speech processing: applications to data collection, pp 303–334
Borish M, Lok B (2016) Rapid low-cost virtual human bootstrapping via the crowd. Trans Intell Syst Technol 7(4):47
Brand M, Hertzmann A (2000) Style machines. In: 27th SIGGRAPH, pp 183–192
Callison-Burch C, Dredze M (2010) Creating speech and language data with Amazons Mechanical Turk. In: Proceedings of the NAACL HLT 2010 workshop on creating speech and language data with Amazon’s Mechanical Turk, number June, pp 1–12
Cassell J, Thorisson KR (1999) The power of a nod and a glance: envelope vs. emotional feedback in animated conversational agents. Appl Artif Intell 13(4–5):519–538
Deng Z, Gu Q, Li Q (2009) Perceptually consistent example-based human motion retrieval. In: Interactive 3D graphics and games, vol 1, pp 191–198
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software. ACM SIGKDD Explor Newsl 11(1):10
Ho C-C, MacDorman KF (2010) Revisiting the uncanny valley theory: developing and validating an alternative to the Godspeed indices. Comput Hum Behav 26(6):1508–1518
Hoon LN, Chai WY, Aidil K, Abd A (2014) Development of real-time lip sync animation framework based on viseme human speech. Arch Des Res 27(4):19–29
Hoque ME, Courgeon M, Mutlu B, Picard RW, Link C, Martin JC (2013) MACH: My Automated Conversation coacH. In: Pervasive and ubiquitous computing, pp 697–706
Huang L, Morency LP, Gratch J (2011) Virtual rapport 2.0. In: Intelligent virtual agents, pp 68–79
Levine S, Theobalt C (2009) Real-time prosody-driven synthesis of body language. ACM Trans Graph 28(5):17
Madirolas G, de Polavieja G (2014) Wisdom of the confident: using social interactions to eliminate the bias in wisdom of the crowds. In: Collective intelligence, pp 2012–2015
Marsella S, Lhommet M, Feng A (2013) Virtual character performance from speech. In: 12th SIGGRAPH/Eurographics symposium on computer animation, pp 25–35
Mason W, Street W, Watts DJ (2009) Financial incentives and the performance of crowds. SIGKDD 11(2):100–108
Mcclendon JL, Mack NA, Hodges LF (2014) The use of paraphrase identification in the retrieval of appropriate responses for script based conversational agents. In: Twenty-seventh international flairs conference, pp 19–201
Min J, Chai J (2012) Motion graphs++. ACM Trans Graph 31(6):153
Rossen B, Lok B (2012) A crowdsourcing method to develop virtual human conversational agents. IJHCS 70(4):301–319
Rossen B, Cendan J, Lok B (2010) Using virtual humans to bootstrap the creation of other virtual humans. In: Intelligent virtual agents, pp 392–398
Sargin ME, Aran O, Karpov A, Ofli F, Yasinnik Y, Wilson S, Erzin E, Yemez Y, Tekalp AM (2006) Combined gesture-speech analysis and speech driven gesture synthesis. In: Multimedia and Expo, number Jan 2016, pp 893–896
Socher R, Perelygin A, Wu JY, Chuang J, Manning CD, Ng AY, Potts C (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In: EMNLP, p 1642
Toshinori C, Ishiguro H, Hagita N (2014) Analysis of relationship between head motion events and speech in dialogue conversations. Speech Comm 57:233–243
Triola MM, Campion N, Mcgee JB, Albright S, Greene P, Smothers V, Ellaway R (2007) An XML standard for virtual patients: exchanging case-based simulations in medical education. In: AMIA, pp 741–745
Triola MM, Huwendiek S, Levinson AJ, Cook DA (2012) New directions in e-learning research in health professions education: report of two symposia. Med Teach 34(1):15–20
Wang L, Cardie C (2014) Improving agreement and disagreement identification in online discussions with a socially-tuned sentiment lexicon. In: ACL, vol 97, p 97
Xu Y, Pelachaud C, Marsella S (2014) Compound gesture generation: a model based on ideational units. In: IVA, pp 477–491
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this entry
Cite this entry
Borish, M., Lok, B. (2016). Utilizing Unsupervised Crowdsourcing to Develop a Machine Learning Model for Virtual Human Animation Prediction. In: Müller, B., et al. Handbook of Human Motion. Springer, Cham. https://doi.org/10.1007/978-3-319-30808-1_21-1
Download citation
DOI: https://doi.org/10.1007/978-3-319-30808-1_21-1
Received:
Accepted:
Published:
Publisher Name: Springer, Cham
Online ISBN: 978-3-319-30808-1
eBook Packages: Springer Reference EngineeringReference Module Computer Science and Engineering