
Abstract

This chapter examines current approaches to speech-based emotion recognition. Following a brief introduction to the most widely used approaches to building such systems, it groups the components commonly involved in emotion recognition systems by function (i.e., feature extraction, normalisation, classification, etc.) to give a broad view of the landscape. The next section then explains in more detail the components found in the most current systems. The chapter also presents a broad overview of how phonetic and speaker variability are handled in emotion recognition systems. Finally, it presents the authors' views on current and future research challenges in the field.
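The component breakdown the abstract describes (feature extraction, normalisation, classification) can be illustrated with a minimal sketch. This is an assumed toy pipeline, not the chapter's method: it uses two simple frame-level descriptors (log energy and zero-crossing rate) in place of the richer acoustic features surveyed in the chapter, and a nearest-centroid decision in place of GMM/SVM/HMM back-ends.

```python
import numpy as np

def extract_features(signal, frame_len=400, hop=160):
    # Frame the signal and compute two toy per-frame descriptors:
    # log energy and zero-crossing rate. Real systems use far richer
    # features (MFCCs, pitch, voice quality, statistical functionals).
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        log_energy = np.log(np.sum(frame ** 2) + 1e-10)
        zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0
        feats.append([log_energy, zcr])
    return np.array(feats)

def normalise(feats, mean, std):
    # Z-score normalisation; for speaker normalisation the statistics
    # would be estimated per speaker rather than globally.
    return (feats - mean) / (std + 1e-10)

def classify(utterance_feats, centroids):
    # Nearest-centroid decision on the utterance-level mean feature
    # vector, a stand-in for trained classifier back-ends.
    mean_vec = utterance_feats.mean(axis=0)
    return min(centroids,
               key=lambda lab: np.linalg.norm(mean_vec - centroids[lab]))

# Toy usage on one second of synthetic 16 kHz "speech"
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)
feats = extract_features(signal)
feats = normalise(feats, feats.mean(axis=0), feats.std(axis=0))
label = classify(feats, {"neutral": np.zeros(2), "angry": np.full(2, 5.0)})
```

The centroid labels and feature choices here are purely illustrative; the point is the separation of concerns between the three stages, which mirrors how the chapter organises the systems it reviews.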



Author information

Correspondence to Vidhyasaharan Sethu.


Copyright information

© 2015 Springer Science+Business Media New York

Cite this chapter

Sethu, V., Epps, J., Ambikairajah, E. (2015). Speech Based Emotion Recognition. In: Ogunfunmi, T., Togneri, R., Narasimha, M. (eds) Speech and Audio Processing for Coding, Enhancement and Recognition. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-1456-2_7

  • DOI: https://doi.org/10.1007/978-1-4939-1456-2_7

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4939-1455-5

  • Online ISBN: 978-1-4939-1456-2
