The Speaker Recognition of Noisy Short Utterance

Chen, Ying; Tang, Zhen-Min

doi:10.1007/978-3-642-42057-3_84

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8261)

Included in the following conference series:

International Conference on Intelligent Science and Big Data Engineering

Abstract

Noisy short utterances are corrupted by noise and provide too little speech data, so the speaker recognition rate drops significantly. This paper proposes a noise separation algorithm based on constrained non-negative matrix factorization (CNMF) and uses it to separate clean speech from noisy speech. The speech frames are then classified into a high-quality class and a low-quality class using the differences detection and discrimination algorithm (DDADA), also proposed in this paper. Feature groups are combined with a GMM-UBM two-stage classification model to make full use of the limited information. Experiments show that these algorithms improve the speaker recognition rate for noisy short utterances.
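The abstract does not specify the constraints used in CNMF, so the following is only a minimal sketch of the general idea of NMF-based noise separation, not the authors' algorithm. It assumes a semi-supervised setup in which noise basis vectors are learned beforehand from noise-only magnitude spectrograms and held fixed, while speech bases are learned from the noisy utterance. The function names, the Euclidean multiplicative updates, the ranks, and the soft-mask reconstruction are all illustrative assumptions.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero in the multiplicative updates


def nmf(V, rank, n_iter=200, seed=0):
    """Plain NMF (Euclidean multiplicative updates): V ~= W @ H, all non-negative."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((rank, V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def separate_speech(V_noisy, W_noise, speech_rank=8, n_iter=200, seed=0):
    """Hypothetical semi-supervised separation (an assumption, not the paper's CNMF).

    V_noisy : (n_freq, n_frames) magnitude spectrogram of the noisy utterance
    W_noise : (n_freq, n_noise) noise bases, kept fixed during the updates
    Returns a soft-masked estimate of the speech magnitude spectrogram.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V_noisy.shape
    n_noise = W_noise.shape[1]
    W_speech = rng.random((n_freq, speech_rank)) + EPS
    H = rng.random((speech_rank + n_noise, n_frames)) + EPS
    for _ in range(n_iter):
        W = np.hstack([W_speech, W_noise])      # speech bases first, then noise
        H *= (W.T @ V_noisy) / (W.T @ W @ H + EPS)
        H_speech = H[:speech_rank]
        # Only the speech bases are updated; the noise bases stay fixed.
        W_speech *= (V_noisy @ H_speech.T) / (W @ H @ H_speech.T + EPS)
    speech = W_speech @ H[:speech_rank]
    noise = W_noise @ H[speech_rank:]
    mask = speech / (speech + noise + EPS)      # Wiener-like soft mask in [0, 1)
    return mask * V_noisy
```

In use, `W_noise` would come from something like `nmf(noise_spectrogram, rank=...)` on a noise-only recording, and the returned estimate would feed the feature extraction stage. Fixing the noise bases is one simple way to constrain the factorization; the constraints used in the paper may differ.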

Author information

Authors and Affiliations

School of Computer Science & Engineering, Nanjing University of Science and Technology, Jiangsu, Nanjing, 210094, China
Ying Chen & Zhen-Min Tang

Editor information

Editors and Affiliations

School of Automation and Electrical Engineering, University of Science and Technology, Xueyuan Road No. 30, 100083, Beijing, China
Changyin Sun
Department of Psychology, Peking University, Yiheyuan Road No. 5, 100871, Beijing, China
Fang Fang
Department of Computer Science and Technology, Nanjing University, Xianlin Avenue No. 163, 210023, Nanjing, China
Zhi-Hua Zhou
School of Automation, Southeast University, Sipailou No. 2, 210096, Nanjing, China
Wankou Yang
Institute of Automation, Chinese Academy of Sciences, No. 95 East Zhongguancun Road, 100190, Beijing, China
Zhi-Yong Liu

About this paper

Cite this paper

Chen, Y., Tang, ZM. (2013). The Speaker Recognition of Noisy Short Utterance. In: Sun, C., Fang, F., Zhou, ZH., Yang, W., Liu, ZY. (eds) Intelligence Science and Big Data Engineering. IScIDE 2013. Lecture Notes in Computer Science, vol 8261. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-42057-3_84

DOI: https://doi.org/10.1007/978-3-642-42057-3_84
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-42056-6
Online ISBN: 978-3-642-42057-3
eBook Packages: Computer Science, Computer Science (R0)
