Abstract
While the previous chapter outlined approaches to personality assessment in psychology, this chapter summarizes the work and insights of researchers from the speech community. Many of these researchers are linguists or computer scientists, so the aim of approaching an individual’s personality translates into the aim of modeling or experimenting with personality. Essentially, the assessment of perceivable manifestations of personality is the basis for any experimentation. When personality is analyzed in terms of speech, the scope of interest narrows from overall personality, which might be judged from prior knowledge of a person’s actions or past incidents, to perceivable characteristics. Here, perceivable means perceivable at the very point in time at which the conversation or experiment takes place, and comprehensible to any person, including persons with no prior knowledge of the speaker. The limitations and divisions that result from this perspective are addressed in the present chapter.
Voices are not merely a handy means to transmit information to the user. All voices—natural, recorded, or synthetic—activate automatic judgments about personality.
Nass et al. (1995)
Notes
1. The terms “outer” and “inner” were chosen by the authors.
2. Kreiman’s categories are the basis for the current work, which extends the proposed categories.
3. The term translates to sonorous, pompous.
4. Note that, in terms of speech synthesis, the strictly correct term is not pitch but the acoustic correlate of perceived pitch, i.e., F0. For simplicity and comparability within the literature review, the term pitch is retained throughout this chapter. For details on how to obtain acoustic measurements of perceived pitch, and on the respective terminology, please refer to Sect. 5.3.2.
5. In his experiment, Apple re-synthesized recordings after altering the speech using the LPC method of Atal and Hanauer (1971). LPC stands for linear predictive coding and is one of many methods used in speech synthesis.
6. More details on measurements are given in Sect. 5.7. For now, the F-measure can be seen as an accuracy-related measure that accounts for a single class in a multi-class classification task and is less biased by class imbalance. A value of \(0.8\) corresponds to good classification success.
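Note 6 describes the F-measure only informally. As a rough illustration, and assuming the balanced F1 variant (the harmonic mean of precision and recall; the chapter's own definition is deferred to Sect. 5.7 and may differ), the sketch below computes it for a single class of a multi-class task and contrasts it with plain accuracy on an imbalanced toy set. The function names and the labels `low` and `high` are hypothetical and are not taken from the chapter.

```python
def accuracy(y_true, y_pred):
    """Fraction of items whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def per_class_f1(y_true, y_pred, target):
    """Balanced F-measure (F1) for one class, treating all other classes as 'rest'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)  # true positives
    fp = sum(1 for t, p in pairs if t != target and p == target)  # false positives
    fn = sum(1 for t, p in pairs if t == target and p != target)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (assumed): 8 of 10 items belong to the majority class "low".
y_true = ["low"] * 8 + ["high"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["low"] * 10

print("accuracy:", accuracy(y_true, y_pred))
print("F1 for 'high':", per_class_f1(y_true, y_pred, "high"))
print("F1 for 'low':", per_class_f1(y_true, y_pred, "low"))
```

In this toy case the accuracy looks respectable although the minority class is never recognized, while the per-class F1 for `high` drops to zero. This is the sense in which a per-class F-measure is less sensitive to class imbalance than overall accuracy.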
References
Addington DW (1968) The relationship of selected vocal characteristics to personality perceptions. Speech Monogr 35(4):492–503
Allport GW, Cantril H (1934) Judging personality from voice. J Soc Psychol 5(1):37–55
Apple W, Streeter LA, Krauss RM (1979) Effects of pitch and speech rate on personal attributions. J Personality Soc Psychol 37(5):715–727
Aronovitch CD (1976) The voice of personality: stereotyped judgments and their relation to voice quality and sex of speaker. J Soc Psychol 99(2):207–220
Atal BS, Hanauer SL (1971) Speech analysis and synthesis by linear prediction of the speech wave. J Acoust Soc Am 50(2B):637–655
Ball D, Hill L, Freeman B, Eley TC, Strelau J, Riemann R, Spinath FM, Angleitner A, Plomin R (1997) The serotonin transporter gene and peer-rated neuroticism. NeuroReport 8(5):1301–1304
Berry DS (1990) Vocal attractiveness and vocal babyishness: effects on stranger, self, and friend impressions. J Nonverbal Behav 14(3):141–153
van Bezooijen R (1995) Sociocultural aspects of pitch differences between Japanese and Dutch women. Lang Speech 38:253–265
Bickmore T, Cassell J (2004) Natural intelligent and effective interaction with multimodal dialogue systems. Kluwer Academic, New York
Breese J, Ball G (1998) Modeling emotional state and personality for conversational agents. Technical Report MSR-TR-98-41, Microsoft Research
Brown B, Strong W, Rencher A (1974) Fifty-four voices from two: the effects of simultaneous manipulations of rate, mean fundamental frequency, and variance of fundamental frequency on ratings of personality from speech. J Acoust Soc Am 55:313–318
Buller DB, Aune RK (1988) The effects of vocalics and nonverbal sensitivity on compliance: a speech accommodation theory explanation. Hum Commun Res 14:548–568
Burkhardt F, Ballegooy Mv, Engelbrecht K-P, Polzehl T, Stegmann J (2009a) Emotion detection in dialog systems: applications, strategies and challenges. In: Proceedings of international conference on affective computing and intelligent interaction (ACII 2009), vol 1. IEEE, Amsterdam, Netherlands
Burkhardt F, Polzehl T, Stegmann J, Metze F, Huber R (2009b) Detecting real life anger. In: Proceedings of international conference on acoustics, speech, and signal processing (ICASSP 2009), vol 1. Taipei, Taiwan, IEEE, pp 4761–4764
Cantril H, Allport G (1935) The psychology of radio. Harper and Brothers, New York
Cassell J, Sullivan J, Prevost S, Churchill E (eds) (2000) Embodied conversational agents. The MIT Press, Cambridge
Cassell J, Bickmore T (2003) Negotiated collusion: modeling social language and its relationship effects in intelligent agents. User Model User Adapt Interact 13(1–2):89–132
Catrambone R, Stasko J, Xiao J (2002) Anthropomorphic agents as a user interface paradigm: experimental findings and a framework for research. In: 24th annual conference of the cognitive science society, pp 166–171
Chen Y, Naveed A, Porzel R (2010) Behavior and preference in minimal personality: a study on embodied conversational agents. In: International conference on multimodal interfaces and the workshop on machine learning for multimodal interaction, ICMI-MLMI’ 10, 49:1–49:4, New York, NY, USA. ACM
Dix A, Finlay J, Abowd G, Beale R (2004) Human-computer interaction, 3rd edn. Prentice-Hall, Upper Saddle River
Enos F, Benus S, Cautin RL, Graciarena M, Hirschberg J, Shriberg E (2006) Personality factors in human deception detection: comparing human to machine performance, ISCA, pp 813–816
Eyben F, Wöllmer M, Schuller B (2010) OpenSMILE—The Munich versatile and fast open-source audio feature extractor, 1459. ACM Press, New York
Gill AJ, French RM (2007) Level of representation and semantic distance: rating author personality from texts. In: Proceedings of the second european cognitive science conference (EuroCogsci07), Delphi, Greece
Gosling S (2003) A very brief measure of the Big-Five personality domains. J Res Pers 37(6):504–528
Hunt RG, Lin TK (1967) Accuracy of judgments of personal attributes from speech. J Personality Soc Psychol 6(4):450–453
Ivanov AV, Riccardi G, Sporka AJ, Franc J (2011) Recognition of personality traits from human spoken conversations. Most (August) pp 1549–1552
John OP, Srivastava S (1999) The Big Five trait taxonomy: history, measurement, and theoretical perspectives. J Personality 2(2):102–138
Kreiman J, Sidtis D (2011) Foundations of voice studies, an interdisciplinary approach to voice production and perception. Wiley-Blackwell, West Sussex
Mairesse F, Walker MA, Mehl MR, Moore RK (2007) Using linguistic cues for the automatic recognition of personality in conversation and text. J Artif Intell Res 30:457–500
Mairesse F, Walker M (2007) PERSONAGE: personality generation for dialogue, Association for computational linguistics
Mairesse F, Walker MA (2011) Controlling user perceptions of linguistic style: trainable generation of personality traits. Comput Linguistics 37(January 2009):1–34
Mallory EB, Miller VR (1958) A possible basis for the association of voice characteristics and personality traits. Speech Monogr 25:255–260
Mehl MR, Pennebaker JW, Crow DM, Dabbs J, Price JH (2001) The electronically activated recorder (EAR): a device for sampling naturalistic daily activities and conversations. Behavior Res Methods Instrum Comput J Psychon Soc Inc 33(4):517–523
Mehl MR, Gosling SD, Pennebaker JW (2006) Personality in its natural habitat: manifestations and implicit folk theories of personality in daily life. J Person Soc Psychol 90(5):862–877
Metze F, Batliner A, Eyben F, Polzehl T, Schuller B, Steidl S (2010) Emotion recognition using imperfect speech recognition. In: Proceedings of the annual conference of the international speech communication association (Interspeech 2009), Makuhari, Japan, IEEE, pp 1–6
Metze F, Black A, Polzehl T (2011) A review of personality in voice-based man machine interaction. In: Human-Computer Interaction. Interaction techniques and environments—14th international conference, HCI International 2011, Springer, pp 358–367
Metze F, Polzehl T, Wagner M (2009) Fusion of acoustic and linguistic speech features for emotion detection. In: Proceedings of international conference on semantic computing (ICSC 2009) vol 1. Berkeley, CA, USA, IEEE
Miller N, Maruyama G, Beaber RJ, Valone K (1976) Speed of speech and persuasion. J Pers Soc Psychol 34(4):615–624
Mohammadi G, Mortillaro M, Vinciarelli A (2010) The voice of personality: mapping nonverbal vocal behavior into trait attributions. In: Proceedings of the international workshop on social signal processing, pp 17–20
Mohammadi G, Vinciarelli A (2011) Humans as feature extractors: combining prosody and personality perception for improved speaking style recognition. In: Proceedings of IEEE international conference on systems, man and cybernetics, pp 363–366
Moore W (1939) Personality traits and voice quality deficiencies. J Speech Hear Disord 4:33–36
Nass C, Moon Y, Fogg B, Reeves B, Dryer DC (1995) Can computer personalities be human personalities? Int J Hum Comput Stud 43:223–239
Nass C, Brave S (2005) Wired for speech: how voice activates and advances the human-computer relationship. The MIT Press, Cambridge
Nass C, Lee KM (2001) Does computer-synthesized speech manifest personality? Experimental tests of recognition, similarity-attraction, and consistency-attraction. J Exp Psychol, pp 171–181
Oberlander J, Gill A (2004) Individual differences and implicit language: personality, parts-of-speech and pervasiveness. In: Cognitive Science Society, Chicago, IL, USA
Paeschke A (2003) Prosodische analyse emotionaler sprechweise—dissertation TU berlin, vol 1. Reihe Mündliche Kommunikation. Logos Verlag, Berlin
Pear T (1931) Voice and personality. Chapman & Hall, London
Peng Y, Zebrowitz L, Lee H (1993) The impact of cultural background and cross-cultural experience on impressions of American and Korean male speakers. J Cross-Cultural Psychol 24(2):203–220
Pianesi F, Mana N, Cappelletti A, Lepri B, Zancanaro M (2008) Multimodal recognition of personality traits in social interactions. In: Proceedings of the 10th international conference on multimodal interfaces ICMI 08, 53
Pierrehumbert J (1979) The perception of fundamental frequency declination. J Acoust Soc Am 66:363–369
Pittam J (1994) Voice in social interaction: an interdisciplinary approach. Sage Publications Inc., Baldwin City
Polzehl T (2006) Automatische klassifizierung emotionaler sprechweisen. In: Tagungsband 1.Kongress Multimediatechnik, Wismar, Germany
Polzehl T, Metze F (2008) Using prosodic features to prioritize voice messages. In: SIGIR, Singapore proceedings of speech search workshop at SIGIR
Polzehl T, Schmitt A, Metze F (2009a) Comparing features for acoustic anger classification in German and English IVR systems. In: Proceedings of international workshop of spoken dialogue systems (IWSDS 2009) vol 1. University of Ulm, Ulm, Germany
Polzehl T, Sundaram S, Ketabdar H, Wagner M, Metze F (2009b) Emotion classification in children’s speech using fusion of acoustic and linguistic features. In: Proceedings of the annual conference of the international speech communication association (Interspeech 2009), Brighton, England. ISCA, pp 340–343
Polzehl T, Metze F, Schmitt A (2010a) Linguistic and prosodic emotion recognition. Deutsche Jahrestagung für Akustik (DAGA), pp 1–2
Polzehl T, Möller S, Metze F (2010b) Automatically assessing acoustic manifestations of personality in speech. In: Workshop on spoken language technology, Berkeley, USA IEEE
Polzehl T, Möller S, Metze F (2010c) Automatically assessing personality from speech. In: Proceedings of international conference on semantic computing (ICSC 2010), IEEE, pp 1–6
Polzehl T, Schmitt A, Metze F (2010d) Approaching multi-lingual emotion recognition from speech: on language dependency of acoustic/prosodic features for anger detection. In: SpeechProsody, Chicago, IL, USA. University of Illinois, pp 1–6
Polzehl T, Schmitt A, Metze F (2010e) Salient Features for Anger Recognition in German and English IVR Portals. In: Spoken dialogue systems technology and design. Springer, Berlin, Germany, pp 81–110
Polzehl T, Möller S, Metze F (2011a) Modeling speaker personality using voice. In: Proceedings of the annual conference of the international speech communication association (Interspeech 2011), Florence, Italy. ISCA
Polzehl T, Schmitt A, Metze F, Wagner M (2011b) Anger recognition in speech using acoustic and linguistic cues. Speech communication, special issue: sensing emotion and affect—facing realism in speech processing
Polzehl T, Schoenenberg K, Möller S, Metze F, Mohammadi G, Vinciarelli A (2012) On speaker-independent personality perception and prediction from speech. In: Proceedings of INTERSPEECH 2012
Rammstedt B, John O (2007) Measuring personality in one minute or less: a 10-item short version of the Big Five Inventory in English and German. J Res Pers 41(1):203–212
Reeves B, Nass C (1996) The media equation: how people treat computers, television, and new media like real people and places. Cambridge University Press, Cambridge
Sanford FH (1942) Speech and personality: a comparative case study. J Personality 10:169–198
Schapire RE, Singer Y (2000) BoosTexter: a boosting-based system for text categorization. In: Machine Learning, 135–168
Scherer KR (1974) Voice quality analysis of American and German speakers. J Psycholinguistic Res 3:281–298. doi:10.1007/BF01069244
Scherer KR (1979) Personality markers in speech, Cambridge University Press, Cambridge, pp 147–209
Scherer KR, Scherer U (1981) Speech behavior and personality. Speech Evaluation Psychiatry, 115–135
Scherer KR (1977) Effect of stress on fundamental frequency of the voice. J Acoust Soc Am 62:25–26
Schmitt A, Pieraccini R, Polzehl T (2010a) For heavens sake, gimme a live person! designing emotion-detection customer care voice applications in automated call centers. Advances in speech recognition. Springer, US, Berlin, Germany, pp 81–110
Schmitt A, Polzehl T, Minker W (2010b) Facing reality: simulating deployment of anger—recognition in IVR systems. In: Spoken dialogue systems for ambient environments—lecture notes in computer science, vol V. 6392, Springer, Makuhari, Japan, pp 23–48
Schmitt A, Polzehl T, Minker W, Liscombe J (2010c) The influence of the utterance length on the recognition of aged voices. In Calzolari N, Choukri K, Maegaard B, Mariani JOJ, Piperidis S, Rosner M, Tapias D (eds) Proceedings of the seventh conference on international language resources and evaluation (LREC’10), Valletta, Malta. European Language Resources Association (ELRA), pp 1–6
Schmitt A, Polzehl T, Minker W (2010d) Modeling a-priori likelihoods for angry user turns with hidden Markov models. In: SpeechProsody, Chicago, IL, USA. University of Illinois, pp 1–6
Schröder M, Trouvain J (2003) The German text-to-speech synthesis system MARY: a tool for research, development and teaching. Int J Speech Technol 6(4):365–377
Schröder M, Grice M (2003) Expressing vocal effort in concatenative synthesis, 2589–2592. Citeseer
Schuller B, Metze F, Steidl S, Batliner A, Eyben F, Polzehl T (2009) Late fusion of individual engines for improved recognition of negative emotion in speech: learning vs. democratic vote. In: International conference on acoustics, speech and signal processing (ICASSP). IEEE
Smith R, Parker E, Noble E (1975) Alcohol’s effect on some formal aspects of verbal social communication. Archives Gen Psychiatry 32(11):1394–1398
Stagner R (1936) Judgments of voice and personality. J Educational Psychol 27(4):272–277
Taylor HC (1934) Social agreement on personality traits as judged from speech. J Soc Psychol 5:244–248
Trouvain J, Schmidt S, Schröder M, Schmitz M, Barry WJ (2006) Modelling personality features by changing prosody in synthetic speech. Number Table 2. ISCA, pp 4–7
Tsalikis J, DeShields OJ, LaTour M (1991) The role of accent on the credibility and effectiveness of the salesman. J Pers Sell Sales Management 11:31–41
Tusing K (2000) The sounds of dominance: vocal precursors of perceived dominance during interpersonal influence. Hum Commun Res 26(1):148–171
Walker MA, Cahn JE, Whittaker SJ (1997) Improvising linguistic style: social and affective bases for agent personality. In: Proceedings of autonomous agents, p 10
Winkler R (2003) Merkmale junger und alter Stimmen: Analyse ausgewählter Parameter im Kontext von Wahrnehmung und Klassifikation, vol 6. Reihe Mündliche Kommunikation. Logos Verlag, Berlin
Zen G, Lepri B, Ricci E, Lanz O (2010) Space speaks: towards socially and personality aware visual surveillance. In: Proceedings of the 1st ACM international workshop on multimodal pervasive video analysis, pp 37–42
Zuckerman M, Driver RE (1989) What sounds beautiful is good: the vocal attractiveness stereotype. J Nonverbal Behav 13(2):67–82
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Polzehl, T. (2015). Speech-Based Personality Assessment. In: Personality in Speech. T-Labs Series in Telecommunication Services. Springer, Cham. https://doi.org/10.1007/978-3-319-09516-5_2
DOI: https://doi.org/10.1007/978-3-319-09516-5_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09515-8
Online ISBN: 978-3-319-09516-5
eBook Packages: Engineering, Engineering (R0)