Abstract
In this study, we propose a methodology for investigating possible prosody and voice quality correlates of social signals, and test it on annotated naturalistic recordings of scenario meetings. The core method computes a set of prosody and voice quality measures, then applies Principal Components Analysis (PCA) and Support Vector Machine (SVM) classification to identify the core factors that predict the associated social signal or related annotation. We apply the methodology to controlled data and to two types of annotation in the AMI meeting corpus that are relevant for social signalling: dialogue acts and speaker roles.
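The PCA-then-SVM step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the feature set, the number of components, or the SVM kernel, so the synthetic data, the helper names `make_synthetic_features` and `build_pipeline`, the choice of four components, and the linear kernel are all assumptions made for illustration, using scikit-learn.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_features(n_per_class=100, n_features=12, seed=0):
    """Hypothetical stand-in for per-utterance prosody/voice quality measures.

    Two classes are drawn from unit-variance Gaussians whose means differ
    only in the first three features (an assumption, not real corpus data).
    """
    rng = np.random.default_rng(seed)
    shift = np.zeros(n_features)
    shift[:3] = 3.0
    class_a = rng.normal(0.0, 1.0, size=(n_per_class, n_features))
    class_b = rng.normal(0.0, 1.0, size=(n_per_class, n_features)) + shift
    X = np.vstack([class_a, class_b])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y


def build_pipeline(n_components=4):
    """Standardize the measures, project them with PCA, classify with an SVM.

    The component count and the linear kernel are illustrative choices.
    """
    return make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components),
        SVC(kernel="linear"),
    )


X, y = make_synthetic_features()

# Simple interleaved split: even rows for training, odd rows for testing.
train = np.arange(len(y)) % 2 == 0
model = build_pipeline().fit(X[train], y[train])
accuracy = model.score(X[~train], y[~train])

# Each row of `loadings` gives the weight of every input measure in one
# principal component, which is one way to inspect contributing measures.
loadings = model.named_steps["pca"].components_
print(f"held-out accuracy: {accuracy:.2f}")
print("loadings shape:", loadings.shape)
```

With real data, `X` would hold one row of acoustic measures per annotated segment and `y` the corresponding label, such as a dialogue act or speaker role.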
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Charfuelan, M., Schröder, M. (2011). Investigating the Prosody and Voice Quality of Social Signals in Scenario Meetings. In: D’Mello, S., Graesser, A., Schuller, B., Martin, JC. (eds) Affective Computing and Intelligent Interaction. ACII 2011. Lecture Notes in Computer Science, vol 6974. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24600-5_8
DOI: https://doi.org/10.1007/978-3-642-24600-5_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24599-2
Online ISBN: 978-3-642-24600-5
eBook Packages: Computer Science (R0)