Speech-Based Personality Assessment

Polzehl, Tim

doi:10.1007/978-3-319-09516-5_2

Part of the book series: T-Labs Series in Telecommunication Services (TLABS)

Abstract

While the previous chapter outlined approaches to personality assessment in psychology, this chapter summarizes the work and insights of researchers from the speech community. Many of these researchers are linguists or computer scientists; hence, the aim of approaching an individual’s personality translates into the aim of modeling or experimenting with personality. Essentially, the assessment of perceivable manifestations of personality is the basis for any experimentation. When analyzing personality in terms of speech, the scope of interest narrows from overall personality, which might be judged from prior knowledge of a person’s actions or past incidents, to perceivable characteristics. Here, perceivable means perceivable at the very point in time the conversation or experiment occurs, and comprehensible to any person, including persons with no prior knowledge of the speaker. The resulting limitations and cleavages of this perspective are addressed in the present chapter.

Voices are not merely a handy means to transmit information to the user. All voices—natural, recorded, or synthetic—activate automatic judgments about personality.

Nass et al. (1995)

Notes

1. The terms “outer” and “inner” were chosen by the authors.
2. Kreiman’s categories are the basis for the current work, which extends them.
3. The term translates to sonorous, pompous.
4. Note that, in terms of speech synthesis, the strictly correct term is the acoustic correlate of perceived pitch, i.e., F0. For simplicity and comparability in the literature review, the term pitch is retained throughout this chapter. For details on how to obtain acoustic measurements of perceived pitch and the respective terminology, please refer to Sect. 5.3.2.
5. In his experiment, Apple re-synthesized recordings after altering the speech using the LPC method of Atal and Hanauer (1971). LPC stands for linear predictive coding and is one of many methods in speech synthesis.
6. More details on measurements are given in Sect. 5.7. For now, the F-measure can be seen as an accuracy-related measure that accounts for a single class out of a multi-class classification task and is less biased by class distribution imbalance. A value of 0.8 corresponds to good classification success.
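
The per-class F-measure described in note 6 can be illustrated with a small sketch. It assumes the balanced F1 variant (the harmonic mean of precision and recall), which the note does not specify. The function name `per_class_f_measure` and the toy labels are hypothetical and not taken from the chapter.

```python
def per_class_f_measure(y_true, y_pred, target):
    """Balanced F-measure (F1) for one class of a multi-class task.

    Assumption: beta = 1, i.e. precision and recall are weighted equally.
    """
    pairs = list(zip(y_true, y_pred))
    # Count outcomes with respect to the single class of interest.
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical, imbalanced toy data: 8 "low" speakers and 2 "high" speakers.
y_true = ["low"] * 8 + ["high"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["low"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("accuracy:", accuracy)
print("F-measure (low):", per_class_f_measure(y_true, y_pred, "low"))
print("F-measure (high):", per_class_f_measure(y_true, y_pred, "high"))
```

In this toy example the overall accuracy looks acceptable only because the majority class dominates, whereas the F-measure of the minority class is zero. That is the sense in which a per-class F-measure is less biased by class imbalance.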

References

Addington DW (1968) The relationship of selected vocal characteristics to personality perceptions. Speech Monogr 35(4):492–503
Allport GW, Cantril H (1934) Judging personality from voice. J Soc Psychol 5(1):37–55
Apple W, Streeter LA, Krauss RM (1979) Effects of pitch and speech rate on personal attributions. J Personality Soc Psychol 37(5):715–727
Aronovitch CD (1976) The voice of personality: stereotyped judgments and their relation to voice quality and sex of speaker. J Soc Psychol 99(2):207–220
Atal BS, Hanauer SL (1971) Speech analysis and synthesis by linear prediction of the speech wave. J Acoust Soc Am 50(2B):637–655
Ball D, Hill L, Freeman B, Eley TC, Strelau J, Riemann R, Spinath FM, Angleitner A, Plomin R (1997) The serotonin transporter gene and peer-rated neuroticism. NeuroReport 8(5):1301–1304
Berry DS (1990) Vocal attractiveness and vocal babyishness: effects on stranger, self, and friend impressions. J Nonverbal Behav 14(3):141–153
van Bezooijen R (1995) Sociocultural aspects of pitch differences between Japanese and Dutch women. Lang Speech 38:253–265
Bickmore T, Cassell J (2004) Natural intelligent and effective interaction with multimodal dialogue systems. Kluwer Academic, New York
Breese J, Ball G (1998) Modeling emotional state and personality for conversational agents. Technical Report MSR-TR-98-41, Microsoft Research
Brown B, Strong W, Rencher A (1974) Fifty-four voices from two: the effects of simultaneous manipulations of rate, mean fundamental frequency, and variance of fundamental frequency on ratings of personality from speech. J Acoust Soc Am 55:313–318
Buller DB, Aune RK (1988) The effects of vocalics and nonverbal sensitivity on compliance: a speech accommodation theory explanation. Hum Commun Res 14:548–568
Burkhardt F, Ballegooy Mv, Engelbrecht K-P, Polzehl T, Stegmann J (2009a) Emotion detection in dialog systems: applications, strategies and challenges. In: Proceedings of international conference on affective computing and intelligent interaction (ACII 2009) vol 1. IEEE Netherlands, Amsterdam
Burkhardt F, Polzehl T, Stegmann J, Metze F, Huber R (2009b) Detecting real life anger. In: Proceedings of international conference on acoustics, speech, and signal processing (ICASSP 2009) vol 1. Taipei, Taiwan, IEEE, pp 4761–4764
Cantril H, Allport G (1935) The psychology of radio. Harper and Brothers, New York
Cassell J, Sullivan J, Prevost S, Churchill E (eds) (2000) Embodied conversational agents. The MIT Press, Cambridge
Cassell J, Bickmore T (2003) Negotiated collusion: modeling social language and its relationship effects in intelligent agents. User Model User Adapt Interact 13(1–2):89–132
Catrambone R, Stasko J, Xiao J (2002) Anthropomorphic agents as a user interface paradigm: experimental findings and a framework for research. In: 24th annual conference of the cognitive science society, pp 166–171
Chen Y, Naveed A, Porzel R (2010) Behavior and preference in minimal personality: a study on embodied conversational agents. In: International conference on multimodal interfaces and the workshop on machine learning for multimodal interaction, ICMI-MLMI’ 10, 49:1–49:4, New York, NY, USA. ACM
Dix A, Finlay J, Abowd G, Beale R (2004) Human-computer interaction, 3rd edn. Prentice-Hall, Upper Saddle River
Enos F, Benus S, Cautin RL, Graciarena M, Hirschberg J, Shriberg E (2006) Personality factors in human deception detection: comparing human to machine performance, ISCA, pp 813–816
Eyben F, Wöllmer M, Schuller B (2010) OpenSMILE—The Munich versatile and fast open-source audio feature extractor, 1459. ACM Press, New York
Gill AJ, French RM (2007) Level of representation and semantic distance: rating author personality from texts. In: Proceedings of the second European cognitive science conference (EuroCogsci07), Delphi, Greece
Gosling S (2003) A very brief measure of the Big-Five personality domains. J Res Pers 37(6):504–528
Hunt RG, Lin TK (1967) Accuracy of judgments of personal attributes from speech. J Personality Soc Psychol 6(4):450–453
Ivanov AV, Riccardi G, Sporka AJ, Franc J (2011) Recognition of personality traits from human spoken conversations. Most (August) pp 1549–1552
John OP, Srivastava S (1999) The Big Five trait taxonomy: history, measurement, and theoretical perspectives. J Personality 2(2):102–138
Kreiman J, Sidtis D (2011) Foundations of voice studies, an interdisciplinary approach to voice production and perception. Wiley-Blackwell, West Sussex
Mairesse F, Walker MA, Mehl MR, Moore RK (2007) Using linguistic cues for the automatic recognition of personality in conversation and text. J Artif Intell Res 30:457–500
Mairesse F, Walker M (2007) PERSONAGE: personality generation for dialogue, Association for computational linguistics
Mairesse F, Walker MA (2011) Controlling user perceptions of linguistic style: trainable generation of personality traits. Comput Linguistics 37(January 2009):1–34
Mallory EB, Miller VR (1958) A possible basis for the association of voice characteristics and personality traits. Speech Monogr 25:255–260
Mehl MR, Pennebaker JW, Crow DM, Dabbs J, Price JH (2001) The electronically activated recorder (EAR): a device for sampling naturalistic daily activities and conversations. Behavior Res Methods Instrum Comput J Psychon Soc Inc 33(4):517–523
Mehl MR, Gosling SD, Pennebaker JW (2006) Personality in its natural habitat: manifestations and implicit folk theories of personality in daily life. J Person Soc Psychol 90(5):862–877
Metze F, Batliner A, Eyben F, Polzehl T, Schuller B, Steidl S (2010) Emotion recognition using imperfect speech recognition. In: Proceedings of the annual conference of the international speech communication association (Interspeech 2010), Makuhari, Japan, IEEE, pp 1–6
Metze F, Black A, Polzehl T (2011) A review of personality in voice-based man machine interaction. In: Human-Computer Interaction. Interaction techniques and environments—14th international conference, HCI International 2011, Springer, pp 358–367
Metze F, Polzehl T, Wagner M (2009) Fusion of acoustic and linguistic speech features for emotion detection. In: Proceedings of international conference on semantic computing (ICSC 2009) vol 1. Berkeley, CA, USA, IEEE
Miller N, Maruyama G, Beaber RJ, Valone K (1976) Speed of speech and persuasion. J Pers Soc Psychol 34(4):615–624
Mohammadi G, Mortillaro M, Vinciarelli A (2010) The voice of personality: mapping nonverbal vocal behavior into trait attributions. In: Proceedings of the international workshop on social signal processing, pp 17–20
Mohammadi G, Vinciarelli A (2011) Humans as feature extractors: combining prosody and personality perception for improved speaking style recognition. In: Proceedings of IEEE international conference on systems, man and cybernetics, pp 363–366
Moore W (1939) Personality traits and voice quality deficiencies. J Speech Hear Disord 4:33–36
Nass C, Moon Y, Fogg B, Reeves B, Dryer DC (1995) Can computer personalities be human personalities? Int J Hum Comput Stud 43:223–239
Nass C, Brave S (2005) Wired for speech: how voice activates and advances the human-computer relationship. The MIT Press, Cambridge
Nass C, Lee KM (2001) Does computer-synthesized speech manifest personality? Experimental tests of recognition, similarity-attraction, and consistency-attraction. J Exp Psychol, pp 171–181
Oberlander J, Gill A (2004) Individual differences and implicit language: personality, parts-of-speech and pervasiveness. In: Cognitive Science Society, Chicago, IL, USA
Paeschke A (2003) Prosodische analyse emotionaler sprechweise—dissertation TU berlin, vol 1. Reihe Mündliche Kommunikation. Logos Verlag, Berlin
Pear T (1931) Voice and personality. Chapman & Hall, London
Peng Y, Zebrowitz L, Lee H (1993) The impact of cultural background and cross-cultural experience on impressions of American and Korean male speakers. J Cross-Cultural Psychol 24(2):203–220
Pianesi F, Mana N, Cappelletti A, Lepri B, Zancanaro M (2008) Multimodal recognition of personality traits in social interactions. In: Proceedings of the 10th international conference on multimodal interfaces ICMI 08, 53
Pierrehumbert J (1979) The perception of fundamental frequency declination. J Acoust Soc Am 66:363–369
Pittam J (1994) Voice in social interaction: an interdisciplinary approach. Sage Publications Inc., Baldwin City
Polzehl T (2006) Automatische klassifizierung emotionaler sprechweisen. In: Tagungsband 1.Kongress Multimediatechnik, Wismar, Germany
Polzehl T, Metze F (2008) Using prosodic features to prioritize voice messages. In: SIGIR, Singapore proceedings of speech search workshop at SIGIR
Polzehl T, Schmitt A, Metze F (2009a) Comparing features for acoustic anger classification in German and English IVR systems. In: Proceedings of international workshop of spoken dialogue systems (IWSDS 2009) vol 1. University of Ulm, Germany, Ulm
Polzehl T, Sundaram S, Ketabdar H, Wagner M, Metze F (2009b) Emotion classification in children’s speech using fusion of acoustic and linguistic features. In: Proceedings of the annual conference of the international speech communication association (Interspeech 2009), Brighton, England. ISCA, pp 340–343
Polzehl T, Metze F, Schmitt A (2010a) Linguistic and prosodic emotion recognition. Deutsche Jahrestagung für Akustik (DAGA), pp 1–2
Polzehl T, Möller S, Metze F (2010b) Automatically assessing acoustic manifestations of personality in speech. In: Workshop on spoken language technology, Berkeley, USA IEEE
Polzehl T, Möller S, Metze F (2010c) Automatically assessing personality from speech. In: Proceedings of international conference on semantic computing (ICSC 2010), IEEE, pp 1–6
Polzehl T, Schmitt A, Metze F (2010d) Approaching multi-lingual emotion recognition from speech—on language dependency of acoustic/prosodic features for anger detection. In: SpeechProsody, Chicago, IL, USA. University of Illinois, pp 1–6
Polzehl T, Schmitt A, Metze F (2010e) Salient Features for Anger Recognition in German and English IVR Portals. In: Spoken dialogue systems technology and design. Springer, Berlin, Germany, pp 81–110
Polzehl T, Möller S, Metze F (2011a) Modeling speaker personality using voice. In: Proceedings of the annual conference of the international speech communication association (Interspeech 2011), Florence, Italy. ISCA
Polzehl T, Schmitt A, Metze F, Wagner M (2011b) Anger recognition in speech using acoustic and linguistic cues. Speech communication, special issue: sensing emotion and affect—facing realism in speech processing
Polzehl T, Schoenenberg K, Möller S, Metze F, Mohammadi G, Vinciarelli A (2012) On speaker-independent personality perception and prediction from speech. In: Proceedings of INTERSPEECH 2012
Rammstedt B, John O (2007) Measuring personality in one minute or less: a 10-item short version of the big five inventory in English and German. J Res Pers 41(1):203–212
Reeves B, Nass C (1996) The media equation: how people treat computers, television, and new media like real people and places. Cambridge University Press, Cambridge
Sanford FH (1942) Speech and personality: a comparative case study. J Personality 10:169–198
Schapire RE, Singer Y (2000) BoosTexter: a boosting-based system for text categorization. In: Machine Learning, 135–168
Scherer KR (1974) Voice quality analysis of American and German speakers. J Psycholinguistic Res 3:281–298. doi:10.1007/BF01069244
Scherer KR (1979) Personality markers in speech, Cambridge University Press, Cambridge, pp 147–209
Scherer KR, Scherer U (1981) Speech behavior and personality. Speech Evaluation Psychiatry, 115–135
Scherer KR (1977) Effect of stress on fundamental frequency of the voice. J Acoust Soc Am 62:25–26
Schmitt A, Pieraccini R, Polzehl T (2010a) For heavens sake, gimme a live person! designing emotion-detection customer care voice applications in automated call centers. Advances in speech recognition. Springer, US, Berlin, Germany, pp 81–110
Schmitt A, Polzehl T, Minker W (2010b) Facing reality: simulating deployment of anger—recognition in IVR systems. In: Spoken dialogue systems for ambient environments—lecture notes in computer science, vol V. 6392, Springer, Makuhari, Japan, pp 23–48
Schmitt A, Polzehl T, Minker W, Liscombe J (2010c) The influence of the utterance length on the recognition of aged voices. In Calzolari N, Choukri K, Maegaard B, Mariani JOJ, Piperidis S, Rosner M, Tapias D (eds) Proceedings of the seventh conference on international language resources and evaluation (LREC’10), Valletta, Malta. European Language Resources Association (ELRA), pp 1–6
Schmitt A, Polzehl T, Minker W (2010d) Modeling a-priori likelihoods for angry user turns with hidden markov models. In: SpeechProsody, Chicago, IL, USA. University of Illinois, pp 1–6
Schröder M, Trouvain J (2003) The German text-to-speech synthesis system MARY: a tool for research, development and teaching. Int J Speech Technol 6(4):365–377
Schröder M, Grice M (2003) Expressing vocal effort in concatenative synthesis, 2589–2592. Citeseer
Schuller B, Metze F, Steidl S, Batliner A, Eyben F, Polzehl T (2009) Late fusion of individual engines for improved recognition of negative emotion in speech—learning vs. democratic vote. In: International conference on acoustics, speech and signal processing (ICASSP). IEEE
Smith R, Parker E, Noble E (1975) Alcohol’s effect on some formal aspects of verbal social communication. Archives Gen Psychiatry 32(11):1394–1398
Stagner R (1936) Judgments of voice and personality. J Educational Psychol 27(4):272–277
Taylor HC (1934) Social agreement on personality traits as judged from speech. J Soc Psychol 5:244–248
Trouvain J, Schmidt S, Schröder M, Schmitz M, Barry WJ (2006) Modelling personality features by changing prosody in synthetic speech. Number Table 2. ISCA, pp 4–7
Tsalikis J, DeShields OJ, LaTour M (1991) The role of accent on the credibility and effectiveness of the salesman. J Pers Sell Sales Management 11:31–41
Tusing K (2000) The sounds of dominance. vocal precursors of perceived dominance during interpersonal influence. Hum Commun Res 26(1):148–171
Walker MA, Cahn JE, Whittaker SJ (1997) Improvising linguistic style: social and affective bases for agent personality. In: Proceedings of autonomous agents, p 10
Winkler R (2003) Merkmale junger und alter stimmen: analyse ausgewählter parameter im kontext von wahrnehmung und klassifikation, vol 6. Reihe Mündliche Kommunikation. Logos Verlag, Berlin
Zen G, Lepri B, Ricci E, Lanz O (2010) Space speaks: towards socially and personality aware visual surveillance. In: Proceedings of the 1st ACM international workshop on multimodal pervasive video analysis, pp 37–42
Zuckerman M, Driver RE (1989) What sounds beautiful is good: the vocal attractiveness stereotype. J Nonverbal Behav 13(2):67–82

Author information

Authors and Affiliations

Quality and Usability Lab, Telekom Innovation Laboratories, TU Berlin, Berlin, Germany
Tim Polzehl

Corresponding author

Correspondence to Tim Polzehl.

About this chapter

Cite this chapter

Polzehl, T. (2015). Speech-Based Personality Assessment. In: Personality in Speech. T-Labs Series in Telecommunication Services. Springer, Cham. https://doi.org/10.1007/978-3-319-09516-5_2

DOI: https://doi.org/10.1007/978-3-319-09516-5_2
Published: 31 August 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09515-8
Online ISBN: 978-3-319-09516-5
eBook Packages: Engineering, Engineering (R0)
