Enhancement of speech signal using diminished empirical mean curve decomposition-based adaptive Wiener filtering

  • Anil Garg
  • O. P. Sahu
Theoretical advances


Over the last few decades, speech signal enhancement has been a widely studied research topic. Numerous algorithms have been proposed to improve the perceptibility and quality of speech signals, most of them formulated to recover the clean signal from a signal corrupted by noise. The short-time Fourier transform (STFT) and the wavelet transform are the tools most commonly used to process speech signals. This paper attempts to overcome the common drawbacks of existing speech enhancement algorithms. Because frequency-domain processing removes noise effectively, the enhancement is carried out in the short-time Fourier domain. In addition, the paper introduces a decomposition model, named diminished empirical mean curve decomposition (D-EMCD), to adaptively tune the Wiener filtering process and achieve effective speech enhancement. The performance of the proposed method is compared with that of conventional methods, and the proposed method is observed to be superior.
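For orientation, the frequency-domain Wiener filtering mentioned above can be sketched in a few lines. This is a minimal illustration of a textbook per-bin Wiener gain, G = ξ / (1 + ξ), applied frame by frame. It is not the D-EMCD-based adaptive tuning proposed in the paper, whose details the abstract does not give. The function names (`dft`, `wiener_gain`, `enhance`), the non-overlapping frames, the known white-noise power, and the gain floor are all assumptions made for the example.

```python
import cmath
import math
import random


def dft(frame):
    """Naive O(N^2) discrete Fourier transform of one frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each time sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def wiener_gain(noisy_power, noise_power, floor=1e-3):
    """Per-bin gain G = xi / (1 + xi); xi is a crude SNR estimate (assumption)."""
    xi = max(noisy_power / noise_power - 1.0, 0.0)  # power-subtraction estimate
    return max(xi / (1.0 + xi), floor)


def enhance(noisy, noise_psd, frame_len=64):
    """Filter non-overlapping frames; a practical STFT would overlap and window."""
    out = []
    for start in range(0, len(noisy) - frame_len + 1, frame_len):
        spec = dft(noisy[start:start + frame_len])
        gains = [wiener_gain(abs(x) ** 2, p) for x, p in zip(spec, noise_psd)]
        out.extend(idft([g * x for g, x in zip(gains, spec)]))
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Synthetic test signal (assumption): a pure tone in white Gaussian noise.
random.seed(0)
frame_len, sigma = 64, 0.5
clean = [math.sin(2 * math.pi * 4 * t / frame_len) for t in range(8 * frame_len)]
noisy = [s + random.gauss(0.0, sigma) for s in clean]
noise_psd = [frame_len * sigma ** 2] * frame_len  # E|N_k|^2 for white noise

enhanced = enhance(noisy, noise_psd, frame_len)
mse_before = mse(noisy, clean)
mse_after = mse(enhanced, clean)
print(mse_before, mse_after)
```

In this sketch the noise power is assumed known and fixed. An adaptive scheme such as the one the paper describes would instead adjust the filter from the signal itself.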


Speech signal · Enhancement · STFT · D-EMCD · Wiener filtering




Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. National Institute of Technology, Kurukshetra, India
