International Journal of Speech Technology, Volume 18, Issue 2, pp 157–166

A wavelet based method for removal of highly non-stationary noises from single-channel Hindi speech patterns of low input SNR

  • Sachin Singh
  • Manoj Tripathy
  • R. S. Anand
Article

Abstract

This paper presents a binary mask thresholding function, applied in the Daubechies 10 wavelet domain, for enhancing single-channel Hindi speech patterns corrupted by highly non-stationary noise at low (negative) SNR. The wavelet transform uses five levels of decomposition, and the detail coefficients of all five levels are passed to the binary mask thresholding function to remove noise and enhance the speech patterns. The robustness of the proposed method is compared with that of widely used methods such as log-mmse, test-psc, Wiener, IdBM, and spectral-subtraction on the basis of the performance measures SNR, PSNR, PESQ, and cepstrum distance. The algorithms were implemented in MATLAB 7.1.
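The processing chain described in the abstract (multi-level decomposition, a keep-or-zero mask on the detail coefficients, reconstruction, and an SNR measurement) can be illustrated with a minimal sketch. This is not the authors' MATLAB implementation. To stay self-contained it substitutes the Haar wavelet for Daubechies 10. The threshold value, the rule "keep a coefficient only if its magnitude exceeds the threshold", and the function names `binary_mask`, `denoise`, and `snr_db` are all assumptions made for illustration, since the abstract does not define the mask or how its threshold is chosen.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(x):
    """One level of the orthonormal Haar DWT (stand-in for Daubechies 10)."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x


def binary_mask(coeffs, threshold):
    """Assumed mask: 1 (keep) where |c| > threshold, 0 (discard) otherwise."""
    return [c if abs(c) > threshold else 0.0 for c in coeffs]


def denoise(signal, levels=5, threshold=0.5):
    """Decompose, mask the detail coefficients of every level, reconstruct."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    approx = list(signal)
    masked_details = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        masked_details.append(binary_mask(detail, threshold))
    # Rebuild from the coarsest level back up to the original resolution.
    for detail in reversed(masked_details):
        approx = haar_idwt(approx, detail)
    return approx


def snr_db(clean, estimate):
    """SNR in dB of an estimate measured against the clean reference."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    if noise_power == 0.0:
        return math.inf
    return 10.0 * math.log10(signal_power / noise_power)
```

With a threshold of zero the mask discards nothing and `denoise` returns the input up to rounding error, which is a convenient sanity check before tuning the threshold on noisy data.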

Keywords

Speech enhancement; Hindi speech patterns; SNR; PESQ; Cepstrum distance; Wavelet transform

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Department of Electrical Engineering, Indian Institute of Technology Roorkee, Roorkee, India
