Cluster-Dependent Feature Transformation for Telephone-Based Speaker Verification
This paper presents a cluster-based feature transformation technique for telephone-based speaker verification when handset-type labels are not available during the training phase. The technique combines a cluster selector with cluster-dependent feature transformations to reduce the acoustic mismatch among different handsets. Specifically, a GMM-based cluster selector is trained to identify the cluster that best represents the handset used by a claimant. Handset-distorted features are then transformed by a cluster-specific feature transformation to remove the acoustic distortion before being presented to the clean speaker models. Experimental results show that cluster-dependent feature transformation with more clusters than the actual number of handsets can achieve a performance level very close to that achievable by handset-based transformation approaches.
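The two-stage pipeline described above (select a cluster, then apply that cluster's transformation) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the form of the transformation, so a per-dimension affine map (scale and offset) is assumed here, the GMMs are assumed to have diagonal covariances, and all function names, model parameters, and feature values are hypothetical toy choices.

```python
import math


def diag_gauss_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at one feature frame."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_avg_loglik(frames, gmm):
    """Average per-frame log-likelihood under a GMM.

    gmm is a list of (weight, mean, var) tuples; log-sum-exp keeps it stable.
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + diag_gauss_logpdf(x, m, v) for w, m, v in gmm]
        peak = max(comps)
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(frames)


def select_cluster(frames, cluster_gmms):
    """Return the index of the cluster GMM that best explains the utterance."""
    scores = [gmm_avg_loglik(frames, g) for g in cluster_gmms]
    return max(range(len(scores)), key=scores.__getitem__)


def compensate(frames, cluster_gmms, transforms):
    """Pick a cluster, then apply its (assumed) affine transformation."""
    k = select_cluster(frames, cluster_gmms)
    scale, offset = transforms[k]
    cleaned = [[a * xi + b for xi, a, b in zip(x, scale, offset)] for x in frames]
    return k, cleaned


# Toy setup (assumption): two clusters of 2-D features, one Gaussian each.
cluster_gmms = [
    [(1.0, [0.0, 0.0], [1.0, 1.0])],  # cluster 0: close to "clean" speech
    [(1.0, [3.0, 3.0], [1.0, 1.0])],  # cluster 1: features shifted by a handset
]
transforms = [
    ([1.0, 1.0], [0.0, 0.0]),    # cluster 0: identity
    ([1.0, 1.0], [-3.0, -3.0]),  # cluster 1: remove the shift
]

distorted = [[2.9, 3.2], [3.1, 2.8]]
cluster, cleaned = compensate(distorted, cluster_gmms, transforms)
```

In this toy case the selector assigns the utterance to cluster 1, and the transformed frames land near the origin, where the hypothetical clean models would expect them. In practice the cluster GMMs and transformation parameters would be estimated from unlabeled training data rather than set by hand.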
Keywords: Gaussian Mixture Model, Transformation Parameter, Equal Error Rate, Feature Transformation, Speaker Model