Abstract
This paper examines the significance of frequency band selection for Mel-frequency cepstral coefficients (MFCC) in text-independent speaker identification. Recent studies have focused on speaker-specific information that may extend beyond the telephone passband. The choice of frequency band is an important factor in effectively capturing the speaker-specific information present in the speech signal for speaker recognition. This paper develops a speaker identification system based on MFCC features that are modeled using vector quantization. Here, the frequency band is varied up to 7.75 kHz. Speaker identification experiments on the TIMIT database, which contains 630 speakers, show that the average recognition rate achieved is 97.37% in the frequency band 0–4.85 kHz with 20 MFCC filters.
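To make the idea of band selection concrete, the following minimal sketch builds a triangular mel filterbank restricted to a chosen band and scores a test utterance against per-speaker codebooks by average quantization distortion. It is an illustration under stated assumptions, not the authors' implementation: the 16 kHz sampling rate, the 512-point FFT, the HTK-style mel formula, the Euclidean distortion measure, and the function names `mel_filterbank`, `vq_distortion`, and `identify` are all assumed here, since the abstract does not specify them. Codebook training is omitted.

```python
import numpy as np


def hz_to_mel(f_hz):
    """Convert Hz to mel (HTK-style formula, assumed for this sketch)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate, f_low, f_high):
    """Triangular mel filters whose support lies inside [f_low, f_high] Hz."""
    if not 0.0 <= f_low < f_high <= sample_rate / 2.0:
        raise ValueError("band must satisfy 0 <= f_low < f_high <= Nyquist")
    # n_filters + 2 edge frequencies, equally spaced on the mel scale.
    hz_edges = mel_to_hz(
        np.linspace(hz_to_mel(f_low), hz_to_mel(f_high), n_filters + 2)
    )
    # Centre frequency of each one-sided FFT bin.
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_filters, fft_freqs.size))
    for i in range(n_filters):
        left, center, right = hz_edges[i:i + 3]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        # Triangle: min of the two slopes, clipped to zero outside the support.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def vq_distortion(features, codebook):
    """Mean Euclidean distance from each frame to its nearest codeword."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.min(axis=1).mean()


def identify(features, codebooks):
    """Return the speaker whose codebook gives the lowest distortion."""
    return min(codebooks, key=lambda spk: vq_distortion(features, codebooks[spk]))


# 20 filters over 0-4.85 kHz, the band reported in the abstract.
bank = mel_filterbank(n_filters=20, n_fft=512, sample_rate=16000,
                      f_low=0.0, f_high=4850.0)
```

Changing `f_high` (up to 7750.0 in the setting described above) is the only step needed to repeat the comparison over different bands; the remaining MFCC steps (log filterbank energies followed by a discrete cosine transform) are unchanged by the band choice.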
Copyright information
© 2017 Springer Science+Business Media Singapore
About this paper
Cite this paper
Dhonde, S.B., Jagade, S.M. (2017). Significance of Frequency Band Selection of MFCC for Text-Independent Speaker Identification. In: Satapathy, S., Bhateja, V., Joshi, A. (eds) Proceedings of the International Conference on Data Engineering and Communication Technology. Advances in Intelligent Systems and Computing, vol 469. Springer, Singapore. https://doi.org/10.1007/978-981-10-1678-3_21
DOI: https://doi.org/10.1007/978-981-10-1678-3_21
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-1677-6
Online ISBN: 978-981-10-1678-3
eBook Packages: Engineering, Engineering (R0)