Abstract
This paper explores the dynamics of speech prosody to characterize and classify emotional states in a speech signal. The local, fine variations that describe the prosodic dynamics are combined with the static prosodic parameters with the aim of improving emotional speech recognition (ESR) accuracy. A vector quantization (VQ) clustering algorithm is applied to compress the static and dynamic parameters before further processing in a radial basis function neural network (RBFNN). Results show an ESR accuracy of 86.05% when both static and dynamic prosodic features are used, compared with 84.92% when the combination of static prosodic features is used alone.
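The feature pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the prosodic dynamics can be approximated by first-order frame-to-frame differences, that the VQ step behaves like a plain k-means codebook, and that z-score scaling is applied before clustering. The pitch and energy contours are synthetic, the names `delta` and `vq_codebook` are hypothetical, and the RBFNN classifier is left out.

```python
import numpy as np

def delta(contour):
    """First-order frame difference; an assumed stand-in for prosodic dynamics."""
    return np.diff(contour, prepend=contour[0])

def vq_codebook(vectors, k, iters=20, seed=0):
    """Plain k-means (Lloyd) codebook; an assumed stand-in for the VQ step."""
    rng = np.random.default_rng(seed)
    # Initialise the codebook with k distinct training vectors.
    codebook = vectors[rng.choice(len(vectors), size=k, replace=False)].copy()
    for _ in range(iters):
        # Distance from every vector to every codeword, shape (n, k).
        dist = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = vectors[labels == j]
            if len(members):  # keep the old codeword if a cell is empty
                codebook[j] = members.mean(axis=0)
    return codebook

# Synthetic contours (illustrative only, not data from the paper).
rng = np.random.default_rng(1)
frames = 200
pitch = 120 + 20 * np.sin(np.linspace(0, 6, frames)) + rng.normal(0, 2, frames)
energy = 0.5 + 0.1 * np.cos(np.linspace(0, 4, frames)) + rng.normal(0, 0.01, frames)

static = np.column_stack([pitch, energy])                 # static prosodic features
dynamic = np.column_stack([delta(pitch), delta(energy)])  # their frame-level dynamics
features = np.hstack([static, dynamic])                   # shape (frames, 4)

# Assumed preprocessing: z-score each column so pitch does not dominate distances.
features = (features - features.mean(axis=0)) / features.std(axis=0)

codebook = vq_codebook(features, k=8)  # shape (8, 4)
compressed = codebook.ravel()          # fixed-length vector for a classifier
```

In this sketch an utterance of any length is reduced to a fixed 32-value vector, which is the kind of compact input a classifier such as an RBFNN could consume.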
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Palo, H.K., Mohanty, M.N. (2020). Analysis of Speech Emotions Using Dynamics of Prosodic Parameters. In: Mallick, P., Balas, V., Bhoi, A., Chae, GS. (eds) Cognitive Informatics and Soft Computing. Advances in Intelligent Systems and Computing, vol 1040. Springer, Singapore. https://doi.org/10.1007/978-981-15-1451-7_36
DOI: https://doi.org/10.1007/978-981-15-1451-7_36
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1450-0
Online ISBN: 978-981-15-1451-7
eBook Packages: Intelligent Technologies and Robotics (R0)