Application of Vector Quantization in Emotion Recognition from Human Speech
Recognition of emotions from speech is a complex task, further complicated by the fact that there is no unambiguous answer as to what the “correct” emotion is for a given speech sample. In this paper, we discuss emotion classification on a well-known German database covering six basic emotional states (sadness, boredom, neutral, fear, happiness, and anger) using Mel-frequency cepstral coefficients (MFCCs). A concern with MFCCs is the large number of features they produce. We discuss the use of the LBG-VQ algorithm to reduce the amount of data to be handled. Finally, emotion classification is performed using the Euclidean, Manhattan, and Chebyshev distances between the codebook of the neutral state and the codebooks of the other emotional states for the same sample.
Keywords: Emotion recognition, Mel frequency cepstral coefficient, vector quantization, German database
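The abstract names the LBG-VQ algorithm and three distance measures without giving details. The following is a minimal sketch of how these pieces might fit together, not the authors' implementation. It assumes the MFCC frames are already available as a NumPy array with one row per frame. The function names, the multiplicative splitting factor `eps`, the stopping tolerance `tol`, and the rule used in `codebook_distance` to compare two codebooks (mean distance from each codeword to its nearest counterpart) are all illustrative assumptions.

```python
import numpy as np

def lbg_codebook(features, size=4, eps=0.01, tol=1e-4, max_iter=100):
    """Build a codebook of `size` codewords (a power of two) by LBG splitting."""
    features = np.asarray(features, dtype=float)
    # Start from a single codeword: the mean of all feature vectors.
    codebook = features.mean(axis=0, keepdims=True)
    while codebook.shape[0] < size:
        # Split every codeword into a slightly perturbed pair (assumed eps).
        codebook = np.vstack([codebook * (1 + eps), codebook * (1 - eps)])
        prev = np.inf
        for _ in range(max_iter):
            # Squared Euclidean distance from every frame to every codeword.
            d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
            nearest = d.argmin(axis=1)
            distortion = d[np.arange(len(features)), nearest].mean()
            # Move each codeword to the centroid of the frames assigned to it.
            for k in range(codebook.shape[0]):
                members = features[nearest == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
            # Stop once the average distortion no longer improves noticeably.
            if prev - distortion <= tol * distortion:
                break
            prev = distortion
    return codebook

def euclidean(a, b):
    return float(np.sqrt(((np.asarray(a) - np.asarray(b)) ** 2).sum()))

def manhattan(a, b):
    return float(np.abs(np.asarray(a) - np.asarray(b)).sum())

def chebyshev(a, b):
    return float(np.abs(np.asarray(a) - np.asarray(b)).max())

def codebook_distance(cb_a, cb_b, metric):
    """Assumed comparison rule: mean distance from each codeword in cb_a
    to its nearest codeword in cb_b under the given metric."""
    return float(np.mean([min(metric(a, b) for b in cb_b) for a in cb_a]))
```

Under these assumptions, one codebook would be built from the neutral recording of a sample and one from each emotional recording of the same sample, and `codebook_distance` would then be evaluated with each of the three metrics to compare the neutral codebook against the others.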