Abstract
This paper explores vocal tract characteristics related to speaking rate as a basis for categorising emotions. The emotions considered are anger, disgust, fear, happiness, neutral, sadness, sarcasm and surprise. Based on speaking rate, these emotions are grouped into three broad categories: normal, fast and slow. Mel frequency cepstral coefficients (MFCCs) are used as features, and Gaussian mixture models (GMMs) are used to build the emotion classification models. The underlying hypothesis is that the sequence of vocal tract shapes used to produce the speech for a given utterance is unique to the speaking rate. The overall emotion classification performance based on speaking rate is observed to be 91% in the case of utterances from a single female speaker.
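The classification scheme summarised above (one Gaussian mixture model per speaking-rate category, with an utterance assigned to the category whose model gives the highest likelihood) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not state the number of mixture components, the MFCC dimension or the training procedure, so the values below are assumptions. Synthetic 3-dimensional vectors stand in for real MFCC frames, and the helper names (`fit_gmm`, `avg_log_likelihood`, `classify`, `synthetic_frames`) are hypothetical.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(
        math.log(2 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def logsumexp(vals):
    """Numerically stable log(sum(exp(v) for v in vals))."""
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))


def frame_log_terms(x, gmm):
    """Per-component log(weight * density) for one feature vector."""
    weights, means, variances = gmm
    return [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]


def fit_gmm(frames, n_components=2, n_iter=25, seed=0, var_floor=1e-3):
    """Fit a diagonal-covariance GMM with EM (component count is assumed)."""
    rng = random.Random(seed)
    dim = len(frames[0])
    means = [list(f) for f in rng.sample(frames, n_components)]
    variances = [[1.0] * dim for _ in range(n_components)]
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each frame.
        resp = []
        for x in frames:
            logs = frame_log_terms(x, (weights, means, variances))
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means and (floored) variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-10  # avoids a zero weight
            weights[k] = nk / len(frames)
            means[k] = [sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                        for d in range(dim)]
            variances[k] = [
                max(var_floor,
                    sum(r[k] * (x[d] - means[k][d]) ** 2
                        for r, x in zip(resp, frames)) / nk)
                for d in range(dim)
            ]
    return weights, means, variances


def avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of an utterance under one GMM."""
    return sum(logsumexp(frame_log_terms(x, gmm)) for x in frames) / len(frames)


def classify(frames, models):
    """Return the category whose GMM scores the utterance highest."""
    return max(models, key=lambda label: avg_log_likelihood(frames, models[label]))


def synthetic_frames(center, n, rng):
    """Synthetic stand-in for MFCC frames: unit-variance noise around a centre."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]


rng = random.Random(42)
# Assumed, made-up class centres; real MFCCs would come from speech audio.
centers = {
    "slow": [-4.0, 0.0, 2.0],
    "normal": [0.0, 0.0, 0.0],
    "fast": [4.0, 1.0, -2.0],
}
models = {label: fit_gmm(synthetic_frames(c, 200, rng))
          for label, c in centers.items()}

test_utterance = synthetic_frames(centers["fast"], 50, rng)
predicted = classify(test_utterance, models)
print(predicted)
```

With real data, each utterance would first be converted to a sequence of MFCC vectors, and the category models would be trained on frames pooled from the emotions assigned to that speaking-rate group.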
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Koolagudi, S.G., Ray, S., Sreenivasa Rao, K. (2010). Emotion Classification Based on Speaking Rate. In: Ranka, S., et al. Contemporary Computing. IC3 2010. Communications in Computer and Information Science, vol 94. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14834-7_30
DOI: https://doi.org/10.1007/978-3-642-14834-7_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14833-0
Online ISBN: 978-3-642-14834-7
eBook Packages: Computer Science (R0)