Emotion Classification Based on Speaking Rate

Koolagudi, Shashidhar G.; Ray, Sudhin; Sreenivasa Rao, K.

doi:10.1007/978-3-642-14834-7_30

Part of the book series: Communications in Computer and Information Science (CCIS, volume 94)

Included in the following conference series:

International Conference on Contemporary Computing

Abstract

In this paper, vocal tract characteristics related to speaking rate are explored to categorise emotions. The emotions considered are anger, disgust, fear, happiness, neutral, sadness, sarcasm and surprise. These emotions are grouped into three broad categories, namely normal, fast and slow, based on speaking rate. Mel frequency cepstral coefficients (MFCCs) are used as features, and Gaussian mixture models (GMMs) are used to develop the emotion classification models. The basic hypothesis is that the sequence of vocal tract shapes in producing the speech for a given utterance is unique with respect to the speaking rate. The overall classification performance of emotions using speaking rate is observed to be 91% for utterances of a single female speaker.
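
The classification scheme outlined in the abstract (one Gaussian mixture model per speaking-rate category, with an utterance assigned to the category whose model gives its feature frames the highest likelihood) might be sketched as follows. This is a minimal illustration and not the authors' implementation: the `DiagGMM` class, the helper names, the diagonal covariances, the two mixture components and the 4-dimensional synthetic vectors standing in for MFCC frames are all assumptions made for the example.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def logsumexp(vals):
    """Numerically stable log(sum(exp(v)))."""
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))


class DiagGMM:
    """Hypothetical diagonal-covariance GMM trained with plain EM."""

    def __init__(self, n_components=2, n_iter=20, seed=0, var_floor=1e-3):
        self.k, self.n_iter = n_components, n_iter
        self.seed, self.var_floor = seed, var_floor

    def _frame_logps(self, f):
        # Per-component log(weight) + log density for one frame.
        return [math.log(w) + log_gauss_diag(f, m, v)
                for w, m, v in zip(self.weights, self.means, self.vars)]

    def fit(self, frames):
        n, dim = len(frames), len(frames[0])
        rng = random.Random(self.seed)
        # Initialise means at random frames, variances at the global variance.
        self.means = [list(f) for f in rng.sample(frames, self.k)]
        gmean = [sum(f[d] for f in frames) / n for d in range(dim)]
        gvar = [max(sum((f[d] - gmean[d]) ** 2 for f in frames) / n,
                    self.var_floor) for d in range(dim)]
        self.vars = [list(gvar) for _ in range(self.k)]
        self.weights = [1.0 / self.k] * self.k
        for _ in range(self.n_iter):
            # E-step: responsibility of each component for each frame.
            resp = []
            for f in frames:
                logp = self._frame_logps(f)
                norm = logsumexp(logp)
                resp.append([math.exp(lp - norm) for lp in logp])
            # M-step: re-estimate weights, means and (floored) variances.
            for j in range(self.k):
                nj = sum(r[j] for r in resp) + 1e-10
                self.weights[j] = nj / n
                self.means[j] = [sum(r[j] * f[d] for r, f in zip(resp, frames)) / nj
                                 for d in range(dim)]
                self.vars[j] = [max(sum(r[j] * (f[d] - self.means[j][d]) ** 2
                                        for r, f in zip(resp, frames)) / nj,
                                    self.var_floor) for d in range(dim)]
        return self

    def score(self, frames):
        """Average per-frame log-likelihood of an utterance."""
        return sum(logsumexp(self._frame_logps(f)) for f in frames) / len(frames)


def classify(frames, models):
    """Pick the category whose GMM scores the utterance highest."""
    return max(models, key=lambda label: models[label].score(frames))


def synthetic_frames(center, n, rng, dim=4):
    """Placeholder for MFCC extraction: Gaussian vectors around `center`."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(42)
# Assumed, well-separated cluster centres; real MFCCs would overlap far more.
centers = {"slow": -4.0, "normal": 0.0, "fast": 4.0}
models = {label: DiagGMM().fit(synthetic_frames(c, 200, rng))
          for label, c in centers.items()}

test_utterance = synthetic_frames(4.0, 50, rng)
predicted = classify(test_utterance, models)
print(predicted)
```

In practice the synthetic frames would be replaced by MFCC vectors computed from recorded speech, and the number of mixture components would be tuned on held-out data. The abstract does not state either choice, so both are left open here.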

Author information

Authors and Affiliations

School of Information Technology, Indian Institute of Technology Kharagpur, Kharagpur, 721302, West Bengal, India
Shashidhar G. Koolagudi, Sudhin Ray & K. Sreenivasa Rao

Editor information

Editors and Affiliations

Dept. of Computer Sciences, University of Florida, 32611, Gainesville, FL, USA
Sanjay Ranka
University of Florida, Gainesville, FL, USA
Arunava Banerjee
Department of Computer Science and Engineering, Indian Institute of Technology, 110016, New Delhi, India
Kanad Kishore Biswas
Computer Science, College of Engineering and Science, Louisiana Tech University, LA 71272, Ruston, USA
Sumeet Dua
University of Florida, Gainesville, FL, USA
Prabhat Mishra
Department of Computer Science & Engineering, Indian Institute of Technology, 208016, Kanpur, India
Rajat Moona
National Tsing Hua University, Hsin-Chu, Taiwan, R.O.C.
Sheung-Hung Poon
Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong
Cho-Li Wang

About this paper

Cite this paper

Koolagudi, S.G., Ray, S., Sreenivasa Rao, K. (2010). Emotion Classification Based on Speaking Rate. In: Ranka, S., et al. Contemporary Computing. IC3 2010. Communications in Computer and Information Science, vol 94. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14834-7_30

DOI: https://doi.org/10.1007/978-3-642-14834-7_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14833-0
Online ISBN: 978-3-642-14834-7
eBook Packages: Computer Science, Computer Science (R0)
