International Journal of Speech Technology, Volume 14, Issue 1, pp 35–48

Two stage emotion recognition based on speaking rate

  • Shashidhar G. Koolagudi
  • Rao Sreenivasa Krothapalli


This paper proposes a two-stage speech emotion recognition approach based on speaking rate. The emotions considered in this study are anger, disgust, fear, happy, neutral, sadness, sarcastic and surprise. In the first stage, based on speaking rate, the eight emotions are categorized into three broad groups, namely active (fast), normal and passive (slow). In the second stage, these three broad groups are further classified into individual emotions using vocal tract characteristics. Gaussian mixture models (GMMs) are used to develop the emotion models. Emotion classification performance at the broader level, based on speaking rate, is found to be around 99% for speaker- and text-dependent cases. Overall emotion classification performance is observed to improve with the proposed two-stage approach. Along with spectral features, formant features are explored in the second stage to achieve robust emotion recognition performance in speaker-, gender- and text-independent cases.
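The two-stage pipeline described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the synthetic features, the `GROUPS` mapping of emotions to speaking-rate categories, and all model settings are hypothetical placeholders; the paper's actual systems are trained on prosodic and spectral features extracted from emotional speech.

```python
# Illustrative two-stage GMM emotion classifier (sketch, not the authors' code).
# Stage 1: classify speaking-rate features into broad groups (active/normal/passive).
# Stage 2: classify spectral features into individual emotions within the chosen group.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical grouping of the eight emotions by speaking rate.
GROUPS = {
    "active": ["anger", "fear", "happy"],
    "normal": ["neutral", "sarcastic", "surprise"],
    "passive": ["disgust", "sadness"],
}

def train_gmms(data_by_label, n_components=2):
    """Fit one GMM per label; data_by_label maps label -> (N, D) feature array."""
    return {lab: GaussianMixture(n_components, random_state=0).fit(x)
            for lab, x in data_by_label.items()}

def classify(gmms, x):
    """Return the label whose GMM assigns the highest log-likelihood to x."""
    return max(gmms, key=lambda lab: gmms[lab].score(x.reshape(1, -1)))

# Synthetic stand-ins: 1-D speaking-rate features (stage 1) and
# 4-D spectral features (stage 2), one well-separated cluster per class.
rate_data = {g: rng.normal(loc=i * 5.0, scale=0.5, size=(50, 1))
             for i, g in enumerate(GROUPS)}
spec_data = {g: {e: rng.normal(loc=j * 5.0, scale=0.5, size=(50, 4))
                 for j, e in enumerate(emos)}
             for g, emos in GROUPS.items()}

stage1 = train_gmms(rate_data)                              # broad-group models
stage2 = {g: train_gmms(d) for g, d in spec_data.items()}   # per-group emotion models

def recognize(rate_feat, spec_feat):
    """Two-stage decision: pick the broad group first, then the emotion inside it."""
    group = classify(stage1, rate_feat)
    return group, classify(stage2[group], spec_feat)

group, emotion = recognize(np.array([10.0]), np.full(4, 5.0))
print(group, emotion)
```

The design point the paper exploits is that stage 1 is a much easier problem (three well-separated rate categories, reported around 99% accuracy), so stage 2 only has to discriminate among the two or three emotions inside one group rather than among all eight at once.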


Active emotions · Emotion categorization · Fast speech · Gaussian mixture models · Formant features · Normal speech · Normal emotions · Passive emotions · Prosodic features · Slow speech · Speaking rate





Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Shashidhar G. Koolagudi (1)
  • Rao Sreenivasa Krothapalli (1)
  1. School of Information Technology, Indian Institute of Technology Kharagpur, Kharagpur, India
