Abstract
We propose and evaluate a speech intelligibility estimation method for binaural signals, assuming that both the speech and the competing noise are directional sources. We trained a mapping function from objective measures to subjective intelligibility. Three objective measures were compared: the SNR of a simple binaural-to-monaural mix-down, the better of the left- and right-channel SNRs (better-ear), and a better-ear selection made separately in each sub-band (band-wise better-ear). For the mapping function, we tried neural networks (NN), support vector regression (SVR), and random forests (RF). The combination of better-ear and RF gave the best results, with a root mean square error (RMSE) of about 4% and a correlation of 0.99 in a closed-set test.
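The three objective measures named in the abstract can be illustrated with a minimal sketch. It assumes that the speech and noise components of each channel are available separately, and that per-band SNRs for the band-wise variant have already been computed by some filter bank. The function names, the plain (unweighted) SNR, and the unweighted mean over bands are illustrative assumptions, not the authors' implementation.

```python
import math


def snr_db(speech, noise):
    """Plain SNR in dB from separate speech and noise sample sequences.

    Assumes both sequences are non-empty and the noise is not all zeros.
    """
    speech_power = sum(x * x for x in speech) / len(speech)
    noise_power = sum(x * x for x in noise) / len(noise)
    return 10.0 * math.log10(speech_power / noise_power)


def mixdown_snr(s_left, s_right, n_left, n_right):
    """SNR after averaging the left and right channels into one signal."""
    speech = [(a + b) / 2 for a, b in zip(s_left, s_right)]
    noise = [(a + b) / 2 for a, b in zip(n_left, n_right)]
    return snr_db(speech, noise)


def better_ear_snr(s_left, s_right, n_left, n_right):
    """Better-ear: keep the higher of the two per-channel SNRs."""
    return max(snr_db(s_left, n_left), snr_db(s_right, n_right))


def bandwise_better_ear_snr(band_snr_left, band_snr_right):
    """Band-wise better-ear: pick the better channel in each sub-band.

    Inputs are per-band SNRs in dB (hypothetical, computed elsewhere).
    The unweighted mean over bands is an assumption of this sketch.
    """
    best = [max(l, r) for l, r in zip(band_snr_left, band_snr_right)]
    return sum(best) / len(best)
```

Any of these values could then serve as the input feature to a regressor (NN, SVR, or RF) trained against subjective intelligibility scores.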
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Kondo, K., Taira, K. (2017). Introduction and Comparison of Machine Learning Techniques to the Estimation of Binaural Speech Intelligibility. In: Pan, JS., Tsai, PW., Huang, HC. (eds) Advances in Intelligent Information Hiding and Multimedia Signal Processing. Smart Innovation, Systems and Technologies, vol 63. Springer, Cham. https://doi.org/10.1007/978-3-319-50209-0_21
DOI: https://doi.org/10.1007/978-3-319-50209-0_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50208-3
Online ISBN: 978-3-319-50209-0
eBook Packages: Engineering, Engineering (R0)