Machine Learning Approach for Emotional Speech Classification

Mohanty, Mihir Narayan; Routray, Aurobinda

doi:10.1007/978-3-319-20294-5_43


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8947)

Included in the following conference series:

International Conference on Swarm, Evolutionary, and Memetic Computing

Abstract

Recognition of emotion from speech remains an extremely challenging research task. This work proposes Singular Value Decomposition (SVD) as a dimension-reduction method for feature extraction. The novelty of the work is classification using Support Vector Machines (SVM) with SVD features, which gives excellent results. The proposed features are evaluated on the emotion classification task through simulation, with the SVM designed to classify unseen emotions in speech. The classifier with these features substantially outperforms the methods it is compared against and reaches an accuracy of approximately 90 %, a step towards automatic recognition.
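The pipeline outlined in the abstract (SVD to reduce the feature dimension, then an SVM on the reduced features) can be illustrated with a minimal sketch. It does not reproduce the chapter's method. Everything specific here is an assumption made for illustration: the data are synthetic six-dimensional vectors for two hypothetical emotion classes, the SVD step is approximated by power iteration on the Gram matrix, and the classifier is a plain linear soft-margin SVM trained by stochastic subgradient descent. All function names and parameter values are hypothetical.

```python
import random

def make_utterances(n_per_class, rng):
    """Synthetic 6-dimensional vectors for two hypothetical emotion classes."""
    rows, labels = [], []
    for label in (+1, -1):
        for _ in range(n_per_class):
            # Assumption: class information sits in the first two dimensions.
            mean = [2.0 * label, 2.0 * label, 0.0, 0.0, 0.0, 0.0]
            rows.append([m + rng.gauss(0.0, 0.3) for m in mean])
            labels.append(label)
    return rows, labels

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_right_singular_vectors(rows, k, iters=100, seed=0):
    """Approximate the top-k right singular vectors of the data matrix X
    by power iteration with deflation on the Gram matrix X^T X."""
    d = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    rng = random.Random(seed)
    basis = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            w = [dot(gram_row, v) for gram_row in gram]
            for u in basis:  # remove components along directions already found
                c = dot(w, u)
                w = [wi - c * ui for wi, ui in zip(w, u)]
            norm = dot(w, w) ** 0.5
            v = [wi / norm for wi in w]
        basis.append(v)
    return basis

def project(rows, basis):
    """Reduced features: coordinates of each row along the singular directions."""
    return [[dot(r, u) for u in basis] for r in rows]

def train_linear_svm(feats, labels, lam=0.01, eta=0.01, epochs=50, seed=0):
    """Linear soft-margin SVM: subgradient descent on the L2-regularised hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(feats[0]), 0.0
    order = list(range(len(feats)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = feats[i], labels[i]
            violated = y * (dot(w, x) + b) < 1.0  # inside the margin or misclassified
            w = [wi * (1.0 - eta * lam) for wi in w]  # regularisation shrinkage
            if violated:
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b

def predict(w, b, x):
    return 1 if dot(w, x) + b >= 0.0 else -1

rng = random.Random(42)
train_rows, train_labels = make_utterances(40, rng)
test_rows, test_labels = make_utterances(20, rng)

basis = top_right_singular_vectors(train_rows, k=2)  # 6 dimensions -> 2
train_feats = project(train_rows, basis)
test_feats = project(test_rows, basis)  # reuse the directions learned on training data

w, b = train_linear_svm(train_feats, train_labels)
correct = sum(predict(w, b, x) == y for x, y in zip(test_feats, test_labels))
accuracy = correct / len(test_labels)
print(f"test accuracy on synthetic data: {accuracy:.2f}")
```

The singular directions are estimated from the training rows only and then reused for the held-out rows, so the test set plays the role of unseen data. The accuracy printed here reflects the easy synthetic data and says nothing about the roughly 90 % reported for real emotional speech.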

Author information

Authors and Affiliations

ITER, Siksha ‘O’ Anusandhan University, Bhubaneswar, 751030, Odisha, India
Mihir Narayan Mohanty
Department of Electrical Engineering, IIT, Kharagpur, WB, India
Aurobinda Routray

Corresponding author

Correspondence to Mihir Narayan Mohanty.

Editor information

Editors and Affiliations

Department of Electrical Engineering, IIT, New Delhi, India
Bijaya Ketan Panigrahi
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
Ponnuthurai Nagaratnam Suganthan
Electronics and Communication Sciences Unit, Indian Statistical Institute, Kolkata, India
Swagatam Das

About this paper

Cite this paper

Mohanty, M.N., Routray, A. (2015). Machine Learning Approach for Emotional Speech Classification. In: Panigrahi, B., Suganthan, P., Das, S. (eds) Swarm, Evolutionary, and Memetic Computing. SEMCCO 2014. Lecture Notes in Computer Science, vol 8947. Springer, Cham. https://doi.org/10.1007/978-3-319-20294-5_43

DOI: https://doi.org/10.1007/978-3-319-20294-5_43
Published: 16 July 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20293-8
Online ISBN: 978-3-319-20294-5
eBook Packages: Computer Science, Computer Science (R0)
