Abstract
Recognizing emotion from speech remains a challenging research problem. This work proposes Singular Value Decomposition (SVD) as a dimensionality-reduction method for feature extraction. Its novelty lies in combining SVD features with a Support Vector Machine (SVM) classifier, which yields strong results. The proposed features are evaluated on the task of emotion classification by simulation, with the SVM designed to classify unseen emotional speech. The classifier with these features is shown to outperform other methods substantially, reaching an accuracy of approximately 90%, a step toward automatic emotion recognition.
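As an illustration only, the pipeline described in the abstract (SVD-based feature reduction followed by SVM classification) might be sketched as below. The abstract does not specify how the SVD features are built, so this sketch assumes each utterance is a frames-by-coefficients matrix and uses its largest singular values as the feature vector. The helper names `svd_features` and `make_utterance`, the synthetic data, and all parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)


def svd_features(frames, k=8):
    """Return the k largest singular values of a (n_frames x n_coeffs) matrix.

    Assumption: singular values serve as the reduced-dimension feature vector.
    """
    singular_values = np.linalg.svd(frames, compute_uv=False)  # sorted descending
    return singular_values[:k]


def make_utterance(label, n_frames=40, n_coeffs=12):
    """Synthetic stand-in for one utterance; NOT real speech data."""
    scale = 1.0 + 0.5 * label  # each hypothetical class gets a different energy scale
    return scale * rng.standard_normal((n_frames, n_coeffs))


# Three hypothetical emotion classes, 60 synthetic utterances each.
labels = np.repeat(np.arange(3), 60)
X = np.vstack([svd_features(make_utterance(c)) for c in labels])

# Hold out a quarter of the utterances to stand in for "unseen" speech.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0, stratify=labels
)

clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The accuracy printed here reflects only the synthetic data and says nothing about the roughly 90% figure reported in the abstract; the kernel and hyperparameters are placeholder choices.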
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Mohanty, M.N., Routray, A. (2015). Machine Learning Approach for Emotional Speech Classification. In: Panigrahi, B., Suganthan, P., Das, S. (eds) Swarm, Evolutionary, and Memetic Computing. SEMCCO 2014. Lecture Notes in Computer Science, vol 8947. Springer, Cham. https://doi.org/10.1007/978-3-319-20294-5_43
DOI: https://doi.org/10.1007/978-3-319-20294-5_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20293-8
Online ISBN: 978-3-319-20294-5
eBook Packages: Computer Science, Computer Science (R0)