Abstract
This work investigates whether vocal expressions of full-blown discrete emotions can be recognized cross-lingually. The study aims to provide more information about the nature and function of emotion, and to support the development of a generalized vocal emotion recognition system that would make human-machine interaction systems more efficient. An emotional speech database was created with 140 simulated utterances per speaker (20 per emotion), consisting of short sentences expressing six full-blown discrete basic emotions and one 'no-emotion' (i.e. neutral) state, in five native languages (not dialects) of Assam. A new feature set is proposed, based on the Eigenvalues of the Autocorrelation Matrix (EVAM) of each frame of an utterance. A Gaussian Mixture Model is used as the classifier. The performance of the EVAM feature set is compared at two sampling frequencies (44.1 kHz and 8.1 kHz) and under additive white noise at signal-to-noise ratios of 0 dB, 5 dB, 10 dB and 20 dB.
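The abstract does not give the details of the EVAM computation, so the following is only a minimal sketch of how per-frame eigenvalues of an autocorrelation matrix might be obtained, together with white noise added at a target signal-to-noise ratio. The function names, the frame length, the hop size, the matrix order, the biased autocorrelation estimator and the symmetric Toeplitz structure are all assumptions made for illustration, not the authors' settings.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def evam_features(frame, order):
    """Eigenvalues of an order x order autocorrelation matrix of one frame.

    Assumption: biased autocorrelation estimates arranged as a symmetric
    Toeplitz matrix; the paper may define the matrix differently.
    """
    n = len(frame)
    # Biased autocorrelation estimates for lags 0 .. order-1.
    r = np.array([np.dot(frame[:n - k], frame[k:]) / n for k in range(order)])
    # Symmetric Toeplitz matrix with entries R[i, j] = r[|i - j|].
    lags = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    R = r[lags]
    # eigvalsh handles symmetric matrices and returns ascending eigenvalues.
    return np.linalg.eigvalsh(R)

def add_white_noise(x, snr_db, rng):
    """Add zero-mean white Gaussian noise at the requested SNR in dB."""
    signal_power = np.mean(x ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=x.shape)
    return x + noise

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fs = 8000  # hypothetical sampling rate for this demo only
    t = np.arange(fs) / fs
    clean = np.sin(2 * np.pi * 200 * t)  # synthetic stand-in for speech
    noisy = add_white_noise(clean, snr_db=10.0, rng=rng)
    frames = frame_signal(noisy, frame_len=400, hop=200)
    features = np.array([evam_features(f, order=10) for f in frames])
    print(features.shape)  # one row of eigenvalues per frame
```

In a pipeline of the kind the abstract describes, such per-frame vectors would then be passed to a Gaussian Mixture Model classifier; that step is omitted here because the abstract does not specify the model configuration.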
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kandali, A.B., Routray, A., Basu, T.K. (2009). Cross-Lingual Vocal Emotion Recognition in Five Native Languages of Assam Using Eigenvalue Decomposition. In: Chaudhury, S., Mitra, S., Murthy, C.A., Sastry, P.S., Pal, S.K. (eds) Pattern Recognition and Machine Intelligence. PReMI 2009. Lecture Notes in Computer Science, vol 5909. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11164-8_84
DOI: https://doi.org/10.1007/978-3-642-11164-8_84
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11163-1
Online ISBN: 978-3-642-11164-8
eBook Packages: Computer Science (R0)