Text-Independent Speaker Identification in Emotional Environments: A Classifier Fusion Approach

  • Chapter
Frontiers in Computer Education

Part of the book series: Advances in Intelligent and Soft Computing (AINSC, volume 133)

Abstract

This paper presents a study of text-independent speaker identification in emotional environments. Speakers are identified from speech samples in five basic emotions: anger, happiness, sadness, disgust, and fear. The work compares the performance of four feature sets: Mel frequency cepstral coefficients (MFCC), line spectral frequencies (LSF), Teager energy based mel cepstral coefficients (TMFCC), and temporal energy of subband cepstral coefficients (TESBCC). Speaker identification performance is then studied with two feature combinations, MFCC-LSF and TESBCC-LSF. A novel classifier fusion method is proposed, and its performance is compared with that of the individual classifiers. Experiments use a database of speech utterances recorded in the five basic emotions from thirty-four speakers in Marathi, one of the Indian languages. Gaussian mixture models are used for classification. Classifier fusion improves speaker identification accuracy in both emotional and neutral environments.
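
The abstract does not spell out the proposed fusion rule, so the following is only a minimal sketch of the general idea of score-level fusion across GMM classifiers, not the authors' method. It assumes one diagonal-covariance GMM per speaker per feature stream with parameters that have already been trained, and it substitutes a plain weighted sum rule over softmax-normalized scores for the chapter's novel method. All function names, speaker labels, stream names, and numeric values are hypothetical.

```python
import math

def gmm_avg_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    Assumption: weights/means/variances come from a model trained elsewhere.
    """
    total = 0.0
    for frame in frames:
        component_logs = []
        for w, mu, var in zip(weights, means, variances):
            log_p = math.log(w)
            for x, m, v in zip(frame, mu, var):
                log_p -= 0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
            component_logs.append(log_p)
        # log-sum-exp over mixture components for numerical stability
        peak = max(component_logs)
        total += peak + math.log(sum(math.exp(c - peak) for c in component_logs))
    return total / len(frames)

def softmax_scores(scores):
    """Map per-speaker log-likelihood scores to values that sum to 1."""
    peak = max(scores.values())
    exps = {spk: math.exp(s - peak) for spk, s in scores.items()}
    norm = sum(exps.values())
    return {spk: e / norm for spk, e in exps.items()}

def fuse_sum_rule(score_sets, stream_weights=None):
    """Weighted sum of normalized scores (stand-in for the chapter's fusion rule)."""
    if stream_weights is None:
        stream_weights = [1.0 / len(score_sets)] * len(score_sets)
    fused = {}
    for w, scores in zip(stream_weights, score_sets):
        for spk, p in softmax_scores(scores).items():
            fused[spk] = fused.get(spk, 0.0) + w * p
    return fused

def identify(fused):
    """Return the speaker with the highest fused score."""
    return max(fused, key=fused.get)

# Toy one-dimensional example: two hypothetical feature streams, two speakers,
# one Gaussian component per model (all values are made up for illustration).
stream_a_frames = [[0.0], [0.2]]
stream_b_frames = [[2.0], [2.2]]
models_a = {"spk1": ([1.0], [[0.0]], [[1.0]]), "spk2": ([1.0], [[0.5]], [[1.0]])}
models_b = {"spk1": ([1.0], [[0.0]], [[1.0]]), "spk2": ([1.0], [[2.0]], [[1.0]])}

scores_a = {s: gmm_avg_log_likelihood(stream_a_frames, *m) for s, m in models_a.items()}
scores_b = {s: gmm_avg_log_likelihood(stream_b_frames, *m) for s, m in models_b.items()}
fused = fuse_sum_rule([scores_a, scores_b])
print(identify(scores_a), identify(scores_b), identify(fused))
```

In this toy setup the first stream alone weakly favors `spk1`, the second strongly favors `spk2`, and the fused decision follows the more confident stream. Normalizing before summing keeps streams with different likelihood scales comparable; other combination rules (product, max, majority vote) would slot into `fuse_sum_rule` in the same way.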

Corresponding author

Correspondence to N. P. Jawarkar.

Copyright information

© 2012 Springer-Verlag GmbH Berlin Heidelberg

About this chapter

Cite this chapter

Jawarkar, N.P., Holambe, R.S., Basu, T.K. (2012). Text-Independent Speaker Identification in Emotional Environments: A Classifier Fusion Approach. In: Sambath, S., Zhu, E. (eds) Frontiers in Computer Education. Advances in Intelligent and Soft Computing, vol 133. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27552-4_77

  • DOI: https://doi.org/10.1007/978-3-642-27552-4_77

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-27551-7

  • Online ISBN: 978-3-642-27552-4

  • eBook Packages: Engineering, Engineering (R0)
