Introduction and Comparison of Machine Learning Techniques to the Estimation of Binaural Speech Intelligibility
We proposed and evaluated a method for estimating the speech intelligibility of binaural signals, assuming that both the speech and the competing noise are directional sources. The method trains a mapping function from objective measures to subjective intelligibility. As objective measures, we tried the SNR calculated on a simple binaural-to-monaural mix-down, the better of the left- and right-channel SNRs (better-ear), and a better-ear selection made separately in each sub-band (band-wise better-ear). For the mapping function, we tried neural networks (NN), support vector regression (SVR), and random forests (RF). The combination of better-ear and RF gave the best results, with a root mean square error (RMSE) of about 4% and a correlation of 0.99 in a closed-set test.
Keywords: Speech Intelligibility, Binaural Speech, Objective Estimation, Machine Learning, Diagnostic Rhyme Test
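The three objective measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the speech and noise components are assumed to be available separately (as in a simulation), the sub-band signals are assumed to have been filtered beforehand, and the unweighted average across bands is an assumption, since the abstract does not say how the band-wise values are combined.

```python
import math

def power(x):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def snr_db(speech, noise):
    """SNR in dB from separately available speech and noise samples."""
    return 10.0 * math.log10(power(speech) / power(noise))

def mixdown_snr(s_l, s_r, n_l, n_r):
    """SNR after averaging the left and right channels into one signal."""
    s = [(a + b) / 2.0 for a, b in zip(s_l, s_r)]
    n = [(a + b) / 2.0 for a, b in zip(n_l, n_r)]
    return snr_db(s, n)

def better_ear_snr(s_l, s_r, n_l, n_r):
    """Full-band SNR of whichever ear has the higher SNR."""
    return max(snr_db(s_l, n_l), snr_db(s_r, n_r))

def bandwise_better_ear_snr(s_bands_l, s_bands_r, n_bands_l, n_bands_r):
    """Pick the better ear in each sub-band, then average across bands.

    Each argument is a list of pre-filtered band signals (assumption).
    The plain average is also an assumption; a weighted sum is equally
    plausible.
    """
    per_band = [
        max(snr_db(sl, nl), snr_db(sr, nr))
        for sl, sr, nl, nr in zip(s_bands_l, s_bands_r, n_bands_l, n_bands_r)
    ]
    return sum(per_band) / len(per_band)

# Toy signals: the left ear has the stronger speech and the weaker noise.
s_l = [1.0, -1.0, 1.0, -1.0]
n_l = [0.1, -0.1, 0.1, -0.1]
s_r = [0.5, -0.5, 0.5, -0.5]
n_r = [0.5, -0.5, 0.5, -0.5]

print(mixdown_snr(s_l, s_r, n_l, n_r))
print(better_ear_snr(s_l, s_r, n_l, n_r))
# Two bands in which a different ear is better each time.
print(bandwise_better_ear_snr([s_l, s_r], [s_r, s_l], [n_l, n_r], [n_r, n_l]))
```

Any of these values could then serve as an input feature to the trained mapping function (NN, SVR, or RF) that predicts subjective intelligibility.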