International Journal of Speech Technology, Volume 18, Issue 4, pp 547–554

Ideal binary masking for reducing convolutive noise

  • Nasir Saleem
  • Ehtasham Mustafa
  • Aamir Nawaz
  • Adnan Khan

Abstract

It is important to know how strongly convolutive noise degrades the perceptual quality and intelligibility of speech. This paper presents an ideal binary masking criterion for reducing convolutive noise (reverberation) and improving the quality and intelligibility of speech. The noise is suppressed using ideal binary time–frequency (T–F) masking based on the signal-to-reverberation ratio (SRR) of individual T–F channels: all T–F channels whose SRR exceeds a pre-selected threshold are retained, while the others are eliminated. The performance of the algorithm is evaluated using IEEE sentences corrupted by reverberation with reverberation times (RT60) ranging from 0.3 to 2.0 s. The results indicate that the intelligibility and perceptual quality of speech decrease as the reverberation time increases. Additional analyses indicate that ideal binary masking reduces the temporal envelope spreading introduced by reverberation. The algorithm is evaluated with perceptual evaluation of speech quality (PESQ), SNRLOSS, log-likelihood ratio (LLR), and frequency-weighted segmental signal-to-noise ratio (FwSNRseg).
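
The SRR-based masking rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the STFT parameters, the 0 dB threshold, the toy signals, and the names `stft` and `ideal_binary_mask` are all assumptions chosen for illustration, and the abstract does not state how the signal and reverberation components are separated.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Hann-windowed short-time Fourier transform, shape (frames, bins).
    Frame length and hop are illustrative assumptions."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def ideal_binary_mask(target_tf, reverb_tf, threshold_db=0.0, eps=1e-12):
    """Return 1.0 for T-F units whose SRR exceeds threshold_db, else 0.0.
    The default threshold is a hypothetical value, not taken from the paper."""
    srr_db = 10.0 * np.log10((np.abs(target_tf) ** 2 + eps) /
                             (np.abs(reverb_tf) ** 2 + eps))
    return (srr_db > threshold_db).astype(float)

# Toy stand-ins: a 440 Hz tone as the target and weak white noise
# in place of the reverberant component (assumed, for demonstration only).
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
target = np.sin(2 * np.pi * 440 * t)
reverb = 0.1 * rng.standard_normal(fs)
mixture = target + reverb

T, R, Y = stft(target), stft(reverb), stft(mixture)
mask = ideal_binary_mask(T, R, threshold_db=0.0)
masked = mask * Y  # retained T-F units of the mixture; the rest are zeroed
```

The mask is "ideal" in the sense that it needs the target and reverberant components separately, which are available only in a controlled experiment. Resynthesizing a waveform from `masked` would additionally require an inverse STFT with overlap-add, omitted here.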

Keywords

Ideal binary masking; Convolutive noise; PESQ; SNRLOSS; LLR; FwSNRseg

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Nasir Saleem (1)
  • Ehtasham Mustafa (1)
  • Aamir Nawaz (1)
  • Adnan Khan (1)

  1. Institute of Engineering & Technology, Gomal University, D. I. Khan, Pakistan
