Multi-feature Fusion for Closed Set Text Independent Speaker Identification
An intra-modal fusion, i.e., a fusion of different features of the same modality, is proposed for a speaker identification system. Two fusion methods for multiple features, one at the feature level and one at the decision level, are proposed in this study. We use multiple features derived from the MFCC and the wavelet transform of the speech signal. Wavelet-transform-based features capture frequency variation across time, while MFCC features mainly approximate the base frequency information; both are important. A final score is computed using a weighted sum rule over the matching results of the different features. We evaluate the proposed fusion strategies on the VoxForge speech dataset using a K-Nearest Neighbor classifier. The multiple-feature systems yield promising results compared with each feature used separately. Further, the multi-feature approach also performs well at different SNRs on NOIZEUS, a noisy speech corpus.
Keywords: Multi-feature fusion · Intra-modal fusion · Speaker identification · MFCC · Wavelet transform · K-Nearest Neighbor (KNN)
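The decision-level fusion described in the abstract can be sketched as follows. This is a minimal illustration of the weighted sum rule over per-feature matching scores, not the paper's implementation; the speaker names, scores, and weights are hypothetical.

```python
# Decision-level fusion by weighted sum rule (illustrative sketch).
# Each feature stream (e.g., MFCC, wavelet) produces a matching score
# per enrolled speaker; the fused score is their weighted sum.

def weighted_sum_fusion(score_sets, weights):
    """Fuse per-feature matching scores for each enrolled speaker.

    score_sets: dict mapping feature name -> {speaker: score}
    weights:    dict mapping feature name -> weight (sums to 1)
    Returns a fused {speaker: score} dict.
    """
    fused = {}
    for feat, scores in score_sets.items():
        w = weights[feat]
        for spk, s in scores.items():
            fused[spk] = fused.get(spk, 0.0) + w * s
    return fused

# Hypothetical normalized similarity scores from two feature streams.
scores = {
    "mfcc":    {"spk1": 0.70, "spk2": 0.55},
    "wavelet": {"spk1": 0.60, "spk2": 0.80},
}
weights = {"mfcc": 0.6, "wavelet": 0.4}  # assumed weights, not from the paper

fused = weighted_sum_fusion(scores, weights)
identified = max(fused, key=fused.get)  # speaker with the highest fused score
```

With these illustrative numbers, spk1 fuses to 0.6·0.70 + 0.4·0.60 = 0.66 and spk2 to 0.65, so spk1 is identified; a feature-level variant would instead concatenate the MFCC and wavelet feature vectors before classification.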