Speech Enhancement from the Fullband Output SNR Perspective
Most speech enhancement algorithms are implemented in the time-frequency domain, i.e., the short-time Fourier transform (STFT) domain. The STFT has two main advantages: the algorithms can be implemented very efficiently, and the individual frequency bins can be manipulated flexibly to reach a better compromise between noise reduction and speech distortion. It is therefore important to understand how these algorithms behave from the fullband output SNR perspective, and how the gains and filters used for noise reduction can be improved by fully exploiting all facets of this fundamental measure.
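As a rough illustration of the measure discussed above, the sketch below computes a fullband SNR from per-bin quantities in a single STFT frame. It assumes the commonly used definition (ratio of total filtered speech power to total residual noise power, with speech and noise uncorrelated and one real gain per bin). The function names, the toy variances, and the choice of a Wiener-type gain are illustrative assumptions, not the chapter's own method.

```python
def fullband_snr(speech_var, noise_var, gains=None):
    """Fullband SNR across bins: sum_k G_k^2 * phi_x(k) / sum_k G_k^2 * phi_v(k).

    With gains=None every bin is passed unchanged, which gives the input SNR.
    (Assumed definition; names are hypothetical.)
    """
    if gains is None:
        gains = [1.0] * len(speech_var)
    num = sum(g * g * s for g, s in zip(gains, speech_var))
    den = sum(g * g * v for g, v in zip(gains, noise_var))
    return num / den


def wiener_gains(speech_var, noise_var):
    """Per-bin Wiener-type gain phi_x / (phi_x + phi_v), used here only as an example."""
    return [s / (s + v) for s, v in zip(speech_var, noise_var)]


# Toy per-bin variances for two frequency bins (made-up numbers).
speech_var = [4.0, 1.0]
noise_var = [1.0, 1.0]

input_snr = fullband_snr(speech_var, noise_var)
gains = wiener_gains(speech_var, noise_var)
output_snr = fullband_snr(speech_var, noise_var, gains)

print(f"fullband input SNR:  {input_snr:.3f}")
print(f"fullband output SNR: {output_snr:.3f}")
```

In this toy case the gains attenuate the low-SNR bin more than the high-SNR bin, so the fullband output SNR comes out higher than the input SNR even though no individual bin's own SNR changes.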