Abstract
Emotion detection is an important and interesting part of speech analysis. The analysis can rely on a single effective parameter or on a combination of several parameters to reach a higher level of accuracy. Combining several parameters gives a more reliable solution and higher accuracy than any single parameter alone. Energy, MFCCs, pitch values, timbre, and vocal tract frequencies are found to be effective parameters for improving detection accuracy. Results obtained for one language are observed to be proportional to those for other languages, which indicates that emotion detection is largely independent of language. Similarly, adding an effective classifier such as a neural network can raise recognition accuracy further, to nearly 100%. This work attempts to show that combining the results of the individual parameters improves detection accuracy.
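The abstract does not state how the per-parameter results are combined. As one possible illustration, the sketch below assumes score-level (late) fusion: each parameter's classifier outputs a score per emotion, and the scores are combined by a weighted average. The emotion labels, the function names `fuse_scores` and `decide`, and all numeric scores are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import Dict, List, Optional

# Hypothetical emotion classes; the paper's actual label set is not given here.
EMOTIONS: List[str] = ["anger", "happiness", "sadness", "neutral"]


def fuse_scores(
    scores: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Weighted average of per-parameter score vectors (late fusion)."""
    if not scores:
        raise ValueError("at least one parameter is required")
    if weights is None:
        # Assumption: equal trust in every parameter.
        weights = {name: 1.0 for name in scores}
    total = sum(weights[name] for name in scores)
    fused = [0.0] * len(EMOTIONS)
    for name, vector in scores.items():
        if len(vector) != len(EMOTIONS):
            raise ValueError(f"score vector for {name!r} has the wrong length")
        for i, score in enumerate(vector):
            fused[i] += weights[name] * score / total
    return fused


def decide(vector: List[float]) -> str:
    """Return the emotion with the highest score."""
    best = max(range(len(vector)), key=vector.__getitem__)
    return EMOTIONS[best]


# Made-up scores for one utterance, one vector per speech parameter.
example_scores = {
    "energy": [0.40, 0.35, 0.10, 0.15],
    "mfcc": [0.30, 0.45, 0.10, 0.15],
    "pitch": [0.45, 0.30, 0.05, 0.20],
}

for name, vector in example_scores.items():
    print(name, "->", decide(vector))
print("fused ->", decide(fuse_scores(example_scores)))
```

In this toy case the individual parameters disagree, and the fused decision follows the overall weight of evidence rather than any one parameter. A trained classifier such as a neural network could replace the fixed weights, in line with the abstract's remark about adding an effective classifier.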
Copyright information
© 2017 Springer Science+Business Media Singapore
About this paper
Cite this paper
Darekar, R.V., Dhande, A.P. (2017). Toward Improved Performance of Emotion Detection: Multimodal Approach. In: Satapathy, S., Bhateja, V., Joshi, A. (eds) Proceedings of the International Conference on Data Engineering and Communication Technology. Advances in Intelligent Systems and Computing, vol 469. Springer, Singapore. https://doi.org/10.1007/978-981-10-1678-3_42
DOI: https://doi.org/10.1007/978-981-10-1678-3_42
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-1677-6
Online ISBN: 978-981-10-1678-3
eBook Packages: Engineering, Engineering (R0)