Speaker discrimination based on fuzzy fusion and feature reduction techniques

Khennouf, S.; Sayoud, H.

doi:10.1007/s10772-017-9484-3

Published: 22 December 2017

Volume 21, pages 51–63 (2018)

International Journal of Speech Technology

Abstract

This paper investigates speaker discrimination using multi-classifier fusion, with a focus on the effects of feature reduction. Speaker discrimination is the automatic distinction between two speakers based on the vocal characteristics of their speech. Features are extracted as Mel Frequency Spectral Coefficients and then reduced using the Relative Speaker Characteristic (RSC) together with Principal Component Analysis (PCA). Several classification methods are implemented to perform the discrimination task. Because different classifiers are employed, two decision-level fusion algorithms, referred to as Weighted Fusion and Fuzzy Fusion, are proposed to improve classification performance. Both algorithms are based on weighting the outputs of the individual classifiers. The effects of speaker gender and of feature reduction on the discrimination task are also examined. The approaches were evaluated on a subset of the Hub-4 Broadcast News corpus. The experimental results show that the (RSC–PCA) feature reduction improves speaker discrimination accuracy by 5–15%. In addition, the proposed fusion methods yield an improvement of about 10% over the individual scores of the classifiers. Finally, speaker gender is found to have an important impact on discrimination performance.
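The two ideas named in the abstract, PCA-based feature reduction and a weighted decision-level fusion of classifier outputs, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `pca_reduce` and `weighted_fusion`, the use of held-out accuracies as weights, the convention that each classifier emits a score in [0, 1], and the 0.5 threshold are all assumptions made for illustration. The RSC step and the Fuzzy Fusion rule are not described in the abstract and are therefore not reproduced here.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project feature vectors (one per row) onto the top principal components.

    Generic PCA via SVD; the paper's RSC step is NOT included (assumption).
    """
    features = np.asarray(features, dtype=float)
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def weighted_fusion(scores, accuracies, threshold=0.5):
    """Fuse per-classifier scores with weights proportional to their accuracies.

    `scores` are hypothetical classifier outputs in [0, 1]; `accuracies` are
    hypothetical held-out accuracies used as weights (both are assumptions).
    Returns the fused score and whether it reaches the threshold.
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(accuracies, dtype=float)
    weights = weights / weights.sum()  # normalise so the weights sum to 1
    fused = float(weights @ scores)
    return fused, bool(fused >= threshold)


# Illustrative usage with synthetic data (not taken from the paper).
rng = np.random.default_rng(0)
raw_features = rng.normal(size=(20, 6))      # 20 frames, 6 coefficients
reduced = pca_reduce(raw_features, n_components=2)
print(reduced.shape)

fused_score, decision = weighted_fusion([0.9, 0.4, 0.7], [0.8, 0.6, 0.6])
print(fused_score, decision)
```

Weighting by accuracy is only one plausible choice; any scheme that gives more reliable classifiers more influence on the final decision fits the description in the abstract.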

Author information

Authors and Affiliations

USTHB University, Alger, Algeria
S. Khennouf & H. Sayoud

Corresponding author

Correspondence to H. Sayoud.

About this article

Cite this article

Khennouf, S., Sayoud, H. Speaker discrimination based on fuzzy fusion and feature reduction techniques. Int J Speech Technol 21, 51–63 (2018). https://doi.org/10.1007/s10772-017-9484-3

Received: 26 July 2017
Accepted: 13 December 2017
Published: 22 December 2017
Issue Date: March 2018
DOI: https://doi.org/10.1007/s10772-017-9484-3
