Abstract
An approach was proposed for segregating target speech from a mixture utterance at low signal-to-noise ratio (SNR). Within the framework of computational auditory scene analysis (CASA), phase was the cue for segregation, and the short-time Fourier transform (STFT) was used to extract the phase of the signal. Binary masking was used to group the target speech units based on the phase differences among the mixture, the clean speech, and the noise. The threshold of the binary mask was not linear: it adapted to frequency and was obtained from a pretest. Experiments showed an SNR improvement of more than 20 dB for babble, m109, white, and machine-gun noise at input SNRs of -30 dB to -20 dB. The waveform of the resulting signal showed that it retained most of the detail of the original signal and had good intelligibility. Phase is a robust cue in monaural speech segregation.
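The abstract does not give the masking rule, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a time-frequency unit is kept when the mixture phase is closer to the clean-speech phase than to the noise phase and also lies within a frequency-dependent threshold. The function names, the Hann-windowed naive DFT, and the threshold curve in `hypothetical_thresholds` are illustrative assumptions; the paper obtains its thresholds from a pretest that is not described here.

```python
import cmath
import math


def stft(signal, frame_len=64, hop=32):
    """Naive Hann-windowed STFT: a list of frames, each holding bins 0..frame_len//2."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        bins = [
            sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len) for n in range(frame_len))
            for k in range(frame_len // 2 + 1)
        ]
        frames.append(bins)
    return frames


def wrapped_phase_diff(a, b):
    """Absolute phase difference between two complex values, wrapped to [0, pi]."""
    return abs(cmath.phase(a * b.conjugate()))


def hypothetical_thresholds(num_bins):
    """Placeholder nonlinear threshold curve (assumption, not the pretest values)."""
    return [(math.pi / 2) / (1 + k / 16) for k in range(num_bins)]


def phase_binary_mask(mixture, clean, noise, thresholds, frame_len=64, hop=32):
    """Assumed rule: keep a unit if the mixture phase is nearer the clean phase
    than the noise phase and the clean-phase difference is below the threshold."""
    spec_m, spec_s, spec_n = (stft(x, frame_len, hop) for x in (mixture, clean, noise))
    mask = []
    for frame_m, frame_s, frame_n in zip(spec_m, spec_s, spec_n):
        row = []
        for k, (m, s, n) in enumerate(zip(frame_m, frame_s, frame_n)):
            d_clean = wrapped_phase_diff(m, s)
            d_noise = wrapped_phase_diff(m, n)
            row.append(1 if d_clean < d_noise and d_clean < thresholds[k] else 0)
        mask.append(row)
    return mask


# Toy signals: the "speech" dominates bin 4 and the "noise" dominates bin 20.
frame_len, hop, length = 64, 32, 256
theta4 = [2 * math.pi * 4 * i / frame_len for i in range(length)]
theta20 = [2 * math.pi * 20 * i / frame_len for i in range(length)]
clean = [math.cos(a) + 0.3 * math.sin(b) for a, b in zip(theta4, theta20)]
noise = [0.3 * math.sin(a) + math.cos(b) for a, b in zip(theta4, theta20)]
mixture = [c + n for c, n in zip(clean, noise)]

thresholds = hypothetical_thresholds(frame_len // 2 + 1)
mask = phase_binary_mask(mixture, clean, noise, thresholds, frame_len, hop)
print("bin 4 kept in every frame:", all(row[4] == 1 for row in mask))
print("bin 20 kept in any frame:", any(row[20] == 1 for row in mask))
```

Note that this sketch uses the clean speech and the noise as references, as the abstract's description of the phase differences suggests. Resynthesis of the masked signal and the SNR measurement are left out.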
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhou, H., Jiang, Y., Chen, X., Zu, Y. (2011). Monaural Speech Segregation Using Signal Phase. In: Wu, Y. (eds) Advances in Computer, Communication, Control and Automation. Lecture Notes in Electrical Engineering, vol 121. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25541-0_34
DOI: https://doi.org/10.1007/978-3-642-25541-0_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25540-3
Online ISBN: 978-3-642-25541-0
eBook Packages: Engineering, Engineering (R0)