Abstract
The paper presents voice recognition EER (Equal Error Rate) scores for speakers in basic emotional states. The database of Polish emotional speech used in the tests includes recordings of six acted emotional states (anger, sadness, happiness, fear, disgust, surprise) and the neutral state, produced by 13 amateur speakers (2118 utterances). The voice recognition procedure was carried out with MFCC features and GMM classifiers. The EER scores depend distinctly on the speakers' emotional states, even though the database is simulated. Mean EER scores become only slightly less sensitive to emotional state when speech covering various kinds of emotional arousal is included in the training set.
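The EER metric reported above is the operating point at which the false-acceptance rate (FAR) equals the false-rejection rate (FRR) of a verification system. The following is a minimal illustrative sketch of how EER can be computed from verification scores; the synthetic Gaussian scores are an assumption for demonstration only and are not the paper's evaluation data or code.

```python
import numpy as np

def eer(genuine, impostor):
    """Equal Error Rate: sweep thresholds over all observed scores and
    return the error rate where FAR and FRR cross."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    # FAR: fraction of impostor trials accepted (score >= threshold)
    far = np.array([(impostor >= t).mean() for t in thresholds])
    # FRR: fraction of genuine trials rejected (score < threshold)
    frr = np.array([(genuine < t).mean() for t in thresholds])
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2

# Hypothetical score distributions (e.g. GMM log-likelihood ratios)
rng = np.random.default_rng(0)
genuine = rng.normal(2.0, 1.0, 1000)   # true-speaker trial scores
impostor = rng.normal(0.0, 1.0, 1000)  # impostor trial scores
print(f"EER = {eer(genuine, impostor):.3f}")
```

With the two unit-variance score distributions separated by two standard deviations, the EER lands near the theoretical value of about 0.16; in the paper's setting, the genuine/impostor scores would instead come from GMMs trained on MFCC features.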
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Staroniewicz, P. (2011). Influence of Speakers’ Emotional States on Voice Recognition Scores. In: Esposito, A., Vinciarelli, A., Vicsi, K., Pelachaud, C., Nijholt, A. (eds) Analysis of Verbal and Nonverbal Communication and Enactment. The Processing Issues. Lecture Notes in Computer Science, vol 6800. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25775-9_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25774-2
Online ISBN: 978-3-642-25775-9
eBook Packages: Computer Science (R0)