Multi-feature Fusion for Closed Set Text Independent Speaker Identification
An intra-modal fusion, i.e., a fusion of different features of the same modality, is proposed for a speaker identification system. Two fusion methods for multiple features, one at the feature level and one at the decision level, are proposed in this study. We use multiple features derived from the MFCC and the wavelet transform of the speech signal. Wavelet-transform-based features capture frequency variation across time, while MFCC features mainly approximate the base frequency information; both are important. A final score is computed using a weighted sum rule over the matching results of the different features. We evaluate the proposed fusion strategies on the VoxForge speech dataset using a K-Nearest Neighbor classifier. The multiple-feature systems yield promising results compared with each feature used separately. Further, the multi-feature approach also performs well at different SNRs on NOIZEUS, a noisy speech corpus.
Keywords: Multi-feature fusion · Intra-modal fusion · Speaker identification · MFCC · Wavelet transform · K-Nearest Neighbor (KNN)
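The decision-level fusion described in the abstract can be sketched as follows. This is a minimal illustration of the weighted sum rule over per-feature matching scores, not the paper's implementation; the speaker names, scores, and weights are hypothetical.

```python
# Decision-level fusion by weighted sum rule (illustrative sketch).
# Each feature stream (e.g., MFCC, wavelet) produces a matching score
# per enrolled speaker; the fused score is their weighted sum.

def weighted_sum_fusion(score_sets, weights):
    """Fuse per-feature matching scores for each enrolled speaker.

    score_sets: dict mapping feature name -> {speaker: score}
    weights:    dict mapping feature name -> weight (sums to 1)
    Returns a fused {speaker: score} dict.
    """
    fused = {}
    for feat, scores in score_sets.items():
        w = weights[feat]
        for spk, s in scores.items():
            fused[spk] = fused.get(spk, 0.0) + w * s
    return fused

# Hypothetical normalized similarity scores from two feature streams.
scores = {
    "mfcc":    {"spk1": 0.70, "spk2": 0.55},
    "wavelet": {"spk1": 0.60, "spk2": 0.80},
}
weights = {"mfcc": 0.6, "wavelet": 0.4}  # assumed weights, not from the paper

fused = weighted_sum_fusion(scores, weights)
identified = max(fused, key=fused.get)  # speaker with the highest fused score
```

With these illustrative numbers, spk1 fuses to 0.6·0.70 + 0.4·0.60 = 0.66 and spk2 to 0.65, so spk1 is identified; a feature-level variant would instead concatenate the MFCC and wavelet feature vectors before classification.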