Text-Independent Speaker Identification in Emotional Environments: A Classifier Fusion Approach

Jawarkar, N. P.; Holambe, R. S.; Basu, T. K.

doi:10.1007/978-3-642-27552-4_77

Part of the book series: Advances in Intelligent and Soft Computing (AINSC, volume 133)

Abstract

This paper presents a study of text-independent speaker identification in emotional environments. Speakers are identified from speech samples in five basic emotions: anger, happiness, sadness, disgust, and fear. The work compares the performance of four feature sets: Mel frequency cepstral coefficients (MFCC), line spectral frequencies (LSF), Teager energy based mel cepstral coefficients (TMFCC), and temporal energy of subband cepstral coefficients (TESBCC). Speaker identification performance is then studied with two feature combinations, MFCC-LSF and TESBCC-LSF. A novel classifier fusion method is proposed, and its performance is compared with that of the individual classifiers. The experiments use a database of speech utterances recorded in the five basic emotions from thirty-four speakers in Marathi, an Indian language. Gaussian mixture models are used for classification. Fusion of classifiers improves speaker identification accuracy in both emotional and neutral environments.
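To make the classification and fusion steps more concrete, the following is a minimal sketch only. The abstract does not describe the proposed fusion method, so the code substitutes a generic weighted sum-rule fusion of per-speaker scores. It should not be read as the authors' algorithm. The function names, the weight `alpha`, the speaker labels, and the toy scores are all hypothetical. The first function scores feature frames against a diagonal-covariance Gaussian mixture model; the other two convert each classifier's log-likelihoods into pseudo-posteriors and combine them.

```python
import math


def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: list of feature vectors; weights/means/variances: one entry
    per mixture component (assumed layout, not taken from the paper).
    """
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            log_p = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp_logs.append(log_p)
        # log-sum-exp over components for numerical stability
        m = max(comp_logs)
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total / len(frames)


def to_posteriors(scores):
    """Softmax over speakers: log-likelihood scores -> pseudo-posteriors."""
    m = max(scores.values())
    exps = {spk: math.exp(s - m) for spk, s in scores.items()}
    z = sum(exps.values())
    return {spk: e / z for spk, e in exps.items()}


def fuse_and_identify(scores_a, scores_b, alpha=0.5):
    """Weighted sum-rule fusion of two classifiers (illustrative stand-in)."""
    post_a = to_posteriors(scores_a)
    post_b = to_posteriors(scores_b)
    fused = {spk: alpha * post_a[spk] + (1 - alpha) * post_b[spk] for spk in post_a}
    return max(fused, key=fused.get), fused


# Toy scores (made up): one classifier per feature set, two enrolled speakers.
scores_mfcc = {"spk1": -10.0, "spk2": -10.5}
scores_lsf = {"spk1": -12.0, "spk2": -9.0}
winner, fused = fuse_and_identify(scores_mfcc, scores_lsf, alpha=0.5)
```

In this toy case the two classifiers disagree, and the fused decision follows the classifier with the larger score margin. In practice the weight `alpha` would be tuned on held-out data.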

Author information

Authors and Affiliations

B.N. College of Engineering, Pusad, MS, India
N. P. Jawarkar
SGGS College of Engineering & Technology, Nanded, MS, India
R. S. Holambe
Institute of Technology & Marine Engineering, Jhinga, WB, India
T. K. Basu

Corresponding author

Correspondence to N. P. Jawarkar.

Editor information

Editors and Affiliations

South China Normal University, Guangzhou, 510631, People's Republic of China
Sabo Sambath
South China Normal University, Guangzhou, 510631, People's Republic of China
Egui Zhu

About this chapter

Cite this chapter

Jawarkar, N.P., Holambe, R.S., Basu, T.K. (2012). Text-Independent Speaker Identification in Emotional Environments: A Classifier Fusion Approach. In: Sambath, S., Zhu, E. (eds) Frontiers in Computer Education. Advances in Intelligent and Soft Computing, vol 133. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27552-4_77

DOI: https://doi.org/10.1007/978-3-642-27552-4_77
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27551-7
Online ISBN: 978-3-642-27552-4
eBook Packages: Engineering, Engineering (R0)
