Application of Vector Quantization in Emotion Recognition from Human Speech

  • Preeti Khanna
  • M. Sasi Kumar
Part of the Communications in Computer and Information Science book series (CCIS, volume 141)


Recognition of emotions from speech is a complex task, further complicated by the fact that there is no unambiguous answer to what the “correct” emotion is for a given speech sample. In this paper, we discuss emotion classification on a well-known German database covering six basic emotions (sadness, boredom, neutral, fear, happiness, and anger) using Mel-frequency cepstral coefficients (MFCCs). A concern with MFCCs is the large number of features. We discuss the use of the LBG-VQ algorithm to reduce the amount of data to be handled. Finally, emotion classification is performed using the Euclidean, Manhattan, and Chebyshev distances between the codebooks of the neutral state and of the other emotional states for the same sample.
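
The codebook idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`lbg`, `codebook_distance`), the additive splitting offset `eps`, and the fixed iteration count are illustrative assumptions, and MFCC extraction is omitted, so the input is simply a list of feature vectors. The abstract does not say how two codebooks are compared; the sketch assumes one plausible reading, namely averaging each codeword's distance to its nearest codeword in the other codebook.

```python
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def nearest(vec, codebook):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, codebook[i])))

def lbg(vectors, size, eps=0.01, iters=20):
    """LBG-style codebook training; `size` is assumed to be a power of two."""
    codebook = [centroid(vectors)]
    while len(codebook) < size:
        # Split every codeword into two slightly offset copies.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Assign each vector to its nearest codeword, then re-center.
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest(v, codebook)].append(v)
            # An empty cell keeps its previous codeword (assumed policy).
            codebook = [centroid(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook

def codebook_distance(cb_a, cb_b, metric):
    """Assumed comparison: mean distance from each codeword in cb_a
    to its nearest codeword in cb_b under the given metric."""
    return sum(min(metric(a, b) for b in cb_b) for a in cb_a) / len(cb_a)

# Toy stand-ins for MFCC frames of a neutral and an emotional utterance.
neutral_frames = [[0, 0], [0, 1], [10, 10], [10, 11]]
emotional_frames = [[1, 0], [1, 1], [12, 10], [12, 11]]
cb_neutral = lbg(neutral_frames, 2)
cb_emotional = lbg(emotional_frames, 2)
for metric in (euclidean, manhattan, chebyshev):
    print(metric.__name__, codebook_distance(cb_neutral, cb_emotional, metric))
```

In this sketch each utterance is reduced from many frames to a small codebook, and the distance of an emotional codebook from the neutral one is what a classifier would then threshold or compare across emotions.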


Emotion recognition; Mel frequency cepstral coefficient; vector quantization; German database





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Preeti Khanna (1)
  • M. Sasi Kumar (2)
  1. SBM, SVKM’s NMIMS, Mumbai, India
  2. CDAC, Kharghar, Navi Mumbai, India
