
International Journal of Speech Technology, Volume 13, Issue 2, pp 101–115

An investigation of speech enhancement using wavelet filtering method

  • Khaled Daqrouq
  • Ibrahim N. Abu-Isbeih
  • Omar Daoud
  • Emad Khalaf

Abstract

This paper investigates speech enhancement using wavelet filters applied through multistage convolution with Reverse Biorthogonal Wavelets (RBW) in the high-pass and low-pass frequency bands of a speech signal. The speech signal is decomposed into two frequency bands, high and low, and the noise is then removed from each band individually, in separate stages, using wavelet filters. This approach gives better results because it does not cut away speech information, as conventional thresholding does. The proposed method was tested on noise with several probability distribution functions. Subjective evaluation is used together with objective evaluation to assess the method thoroughly. The method is simple yet yields surprisingly high-quality results, and it outperforms both the Donoho and Johnstone thresholding method and the Birgé-Massart thresholding strategy.
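
The two-band idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the wavelet order, the number of stages, or the per-band filters, so everything here is an assumption. The sketch uses the Haar-equivalent filter pair (which coincides with the lowest-order reverse biorthogonal wavelet, rbio1.1), scaled so that the two bands sum back to the input, and it stands in for the paper's stage filters with repeated low-pass convolution. The names convolve_causal, split_bands, filter_band and enhance are hypothetical.

```python
# Illustrative two-band split followed by multistage convolution in each band.
# Assumption: Haar-equivalent (rbio1.1) filters scaled by 1/sqrt(2), so that
# low[n] + high[n] == signal[n]. The paper may use other RBW orders.
LOW_PASS = [0.5, 0.5]    # two-tap averaging (low-pass) filter
HIGH_PASS = [0.5, -0.5]  # two-tap differencing (high-pass) filter


def convolve_causal(signal, kernel):
    """Causal linear convolution, zero initial state, same length as signal."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(kernel):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def split_bands(signal):
    """Split the signal into a low band and a high band (no downsampling)."""
    return convolve_causal(signal, LOW_PASS), convolve_causal(signal, HIGH_PASS)


def filter_band(band, stages):
    """Apply the smoothing filter to one band `stages` times.

    Assumption: a plain low-pass stands in for the paper's stage filters.
    """
    for _ in range(stages):
        band = convolve_causal(band, LOW_PASS)
    return band


def enhance(signal, stages=2):
    """Filter each band separately, then recombine the two bands."""
    low, high = split_bands(signal)
    low_f = filter_band(low, stages)
    high_f = filter_band(high, stages)
    return [l + h for l, h in zip(low_f, high_f)]


if __name__ == "__main__":
    # A constant level corrupted by sample-to-sample alternating noise.
    noisy = [1.0 + 0.5 * (-1) ** n for n in range(16)]
    print(enhance(noisy, stages=2))
```

Because nothing is thresholded to zero, no coefficient is discarded outright, which is the contrast with conventional thresholding that the abstract draws. A plain low-pass in both bands also smooths the speech itself, so a real system would need band-specific filters.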

Keywords

Wavelet filters, Speech signal, Enhancement, Thresholding, Objective evaluation

References

  1. Bahoura, M., & Rouat, J. (2006). Wavelet speech enhancement based on time–scale adaptation. Speech Communication, 48, 1620–1637.
  2. Berouti, M., Schwartz, R., & Makhoul, J. (1979). Enhancement of speech corrupted by acoustic noise. In Proceeding of the IEEE conference on acoustics, speech and signal processing (pp. 208–211).
  3. Birgé, L., & Massart, P. (1997). From model selection to adaptive estimation. In Festschrift for Lucien Le Cam (pp. 55–88). New York: Springer.
  4. Boll, S. (1979). Suppression of acoustic noise in speech using spectral subtraction. IEEE Transactions on Acoustics Speech and Signal Processing, 27, 113–120.
  5. Breithaupt, C., & Martin, R. (2003). MMSE estimation of magnitude-squared DFT coefficients with super-Gaussian priors. In IEEE proceeding of international conference on acoustics, speech and signal processing (Vol. I, pp. 896–899).
  6. Cohen, A., Daubechies, I., & Feauveau, J. (1992). Biorthogonal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics, 45(5), 485–560.
  7. Daqrouq, K., & Abu-Isbeih, I.N. (2007). Arrhythmia detection using wavelet transform. In IEEE Region 8, EUROCON 2007, Warsaw, Poland.
  8. Daqrouq, K., & Abu-Sheikha, N. (2005). Heart rate variability analysis using wavelet transform. Asian Journal for Information Technology, 4(4).
  9. Dat, T., Takeda, K., & Itakura, F. (2005). Generalized gamma modeling of speech and its online estimation for speech enhancement. In Proceeding of ICASSP-2005 (pp. 181–184).
  10. Daubechies, I. (1988). Orthonormal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics, 41(11), 909–996.
  11. Daubechies, I. (1992). Ten lectures on wavelets. In CBMS-NSF conference series in applied mathematics. Philadelphia: SIAM.
  12. Deller, J., Hansen, J., & Proakis, J. (2000). Discrete-time processing of speech signals (2nd ed.). New York: IEEE Press.
  13. Diethorn, E. (2000). Subband noise reduction methods for speech enhancement. In S. L. Gay, & J. Benesty (Eds.), Acoustic signal processing for telecommunication. Dordrecht: Kluwer Academic. Chapter 9.
  14. Donoho, D. (1993). Nonlinear wavelet methods for recovering signals, images, and densities from indirect and noisy data. Proceedings of Symposia in Applied Mathematics, 47, 173–205.
  15. Donoho, D. (1995). Denoising by soft thresholding. IEEE Transactions on Information Theory, 41(3), 613–627.
  16. Donoho, D., & Johnstone, I. (1994). Ideal spatial adaptation by wavelet shrinkage. Biometrika, 81, 425–455.
  17. Donoho, D., & Johnstone, I. (1995). Adapting to unknown smoothness via wavelet shrinkage. Journal of the American Statistical Association, 90, 1200–1224.
  18. Ephraim, Y., & Malah, D. (1984). Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator. IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-32(6), 1109–1121.
  19. Ephraim, Y., & Malah, D. (1985). Speech enhancement using a minimum mean square error log-spectral amplitude estimator. IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-33, 443–445.
  20. Gabor, D. (1946). Theory of communications. Journal of the Institute of Electrical Engineering London, 93, 429–457.
  21. Ghanbari, Y., & Karami, M. (2004). Spectral subtraction in the wavelet domain for speech enhancement. Internat. J. Software Inf. Technol. (IJSIT), 1, 26–30.
  22. Ghanbari, Y., & Kerami-Mollaei, M.R. (2006). A new approach for speech enhancement based on the adaptive thresholding of the wavelet packets. Speech Communication, 48, 927–940.
  23. Hansen, J., & Pellom, B. (1998). An effective quality evaluation protocol for speech enhancement algorithms. In Proc. int. conf. spoken lang. process. (Vol. 7, pp. 2819–2822).
  24. Haykin, S. (1996). Adaptive filter theory (3rd ed.). New York: Prentice Hall.
  25. Hu, Y., & Loizou, P. (2008). Evaluation of objective quality measures for speech enhancement. IEEE Transactions on Audio, Speech, and Language Processing, 16(1), 229–238.
  26. Huang, H., & Pan, J. (2006). Uniform and warped low delay filter-banks for speech enhancement. Signal Processing, 86, 792–803.
  27. Huang, Q., Yang, J., & Shoushui, W. (2007). Variational Bayesian learning for speech modeling and enhancement. Signal Processing, 87, 2026–2035.
  28. ITU-T Rec. P. 835 (2003). Subjective test methodology for evaluating speech communication systems that include noise suppression algorithm, ITU-T, ITU-T Rec. P. 835.
  29. Johnson, M., Yuan, X., & Ren, Y. (2007). Speech signal enhancement through adaptive wavelet thresholding. Speech Communication, 49, 123–133.
  30. Johnstone, I., & Silverman, B. (1997). Wavelet threshold estimators for data with correlated noise. Journal of the Royal Statistical Society, Series B (Gen.), 59, 319–351.
  31. Kamath, S., & Loizou, P. (2002). A Multi-band spectral subtraction method for enhancing speech corrupted by colored noise. In IEEE international conference on acoustics, speech, and signal processing (Vol. 4, pp. 4160–4164).
  32. Kamrul, H. (2004). Reducing signal-bias from MAD estimated noise level for DCT speech enhancement. Signal Processing, 84, 151–162.
  33. Kitawaki, N., Nagabuchi, H., & Itoh, K. (1988). Objective quality evaluation for low bit-rate speech coding systems. IEEE Journal on Selected Areas in Communications, 6(2), 262–273.
  34. Klatt, D. (1982). Prediction of perceived phonetic distance from critical band spectra. In IEEE international conference on acoustics, speech, and signal processing (Vol. 7, pp. 1278–1281).
  35. Klein, M., & Kabal, P. (2002). Signal subspace speech enhancement with perceptual post-filtering. In IEEE international conference on acoustics, speech, and signal processing (Vol. 1, pp. 537–540).
  36. Lotter, T., & Vary, P. (2005). Speech enhancement by MAP spectral amplitude estimation using a super-Gaussian speech model. EURASIP Journal on Applied Signal Processing, 7, 1110–1126.
  37. Mallat, S. (1989a). A theory for multiresolution signal decomposition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(7), 674–693.
  38. Mallat, S. (1989b). Multifrequency channel decompositions of images and wavelet models. IEEE Transactions on Acoustics, Speech, and Signal Processing, 37(12), 2091–2110.
  39. Mallat, S., & Hwang, W. (1992). Singularity detection and processing with wavelets. IEEE Transactions on Information Theory, 38, 617–643.
  40. Martin, R. (2002). Speech enhancement using MMSE short time spectral estimation with gamma distributed speech priors. In IEEE int. conf. acoustics, speech, signal processing, Orlando, Florida.
  41. Sameti, H. (1998). HMM-based strategies for enhancement of speech signals embedded in nonstationary noise. IEEE Transactions on Acoustics, Speech, and Signal Processing, 6, 445–455.
  42. Senapati, S., Chakroborty, S., & Saha, G. (2008). Speech enhancement by joint statistical characterization in the Log Gabor Wavelet domain. Speech Communication, 50, 504–518.
  43. Seok, J., & Bae, K. (1997). Speech enhancement with reduction of noise components in the wavelet domain. In IEEE international conference on acoustics, speech, and signal processing (ICASSP’97) (Vol. 2, pp. 1323–1326).
  44. Sheikhzadeh, H., & Abutalebi, H. (2001). An improved wavelet-based speech enhancement system. In Proceeding of the 7th Eur. conference speech comm. technol. (EuroSpeech), Aalborg, Denmark.
  45. Tufekci, Z., Gowdy, J., Gurbuz, S., & Patterson, E. (2006). Applied mel-frequency discrete wavelet coefficients and parallel model compensation for noise-robust speech recognition. Speech Communication, 48, 1294–1307.
  46. Turbin, V., & Faucheur, N. (2007). Estimation of speech quality of noise reduced signals. In Proceeding online workshop meas. speech audio quality network.
  47. Veprek, P., & Scordilis, M. (2002). Analysis, enhancement and evaluation of five pitch determination techniques. Speech Communication, 37, 249–270.
  48. Vidakovic, B., & Lozoya, C. (1998). On time-dependant wavelet denoising. IEEE Transaction on Signal Processing, 46, 2549–2548.

Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Khaled Daqrouq (1)
  • Ibrahim N. Abu-Isbeih (1)
  • Omar Daoud (1)
  • Emad Khalaf (2)

  1. Communications and Electronics Department, Philadelphia University, Amman, Jordan
  2. Computer Eng. Department, Philadelphia University, Amman, Jordan
