Intelligibility prediction for distorted sentences by the normalized covariance measure

Chen, Fei

doi:10.1007/s10772-011-9099-z

Published: 04 August 2011

Volume 14, pages 237–243 (2011)
International Journal of Speech Technology

Fei Chen¹

Abstract

The speech transmission index (STI) has been used extensively to predict the intelligibility of speech corrupted by reverberation and additive noise. This study further evaluated its performance in predicting the intelligibility of three types of distorted sentences: time-reversed stimuli, vocoded stimuli, and stimuli containing envelope cues recovered from the Hilbert fine structure (R-HFS). The distorted sentences were simulated, and their intelligibility was predicted by the normalized covariance measure (NCM), an STI-based index. The NCM was evaluated against the intelligibility scores available for the three types of distorted stimuli, and its performance was compared with that of the PESQ measure and the coherence-based speech intelligibility index. The NCM predicted intelligibility well in all three conditions of speech distortion: (1) the intelligibility of time-reversed speech declined continuously as the segment duration used for speech reversal increased to 200 ms; (2) the intelligibility of tone-vocoded and noise-vocoded stimuli improved as more channels were used in the vocoder, and the two types of vocoded sentences differed only slightly in intelligibility; and (3) the intelligibility of R-HFS stimuli decreased as the number of analysis bands increased from one to eight. Adding to previous findings on speech intelligibility prediction, the results of the present work support the conclusion that the intelligibility of distorted sentences can be predicted well by the NCM.
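The abstract does not give formulas, so the following is only a minimal sketch of two ideas it names: reversing speech within fixed-duration segments, and an NCM-style index computed from band envelopes. The function names `reverse_segments` and `ncm_from_envelopes` are hypothetical. The sketch assumes the band envelopes have already been extracted, and it assumes the formulation common in the STI literature (squared normalized covariance per band, mapped to an apparent SNR limited to ±15 dB, then to a transmission index in [0, 1]) with uniform band weights unless others are supplied. None of these details are confirmed by the abstract.

```python
import numpy as np


def reverse_segments(signal, fs, segment_ms):
    """Reverse a signal in time within consecutive segments of segment_ms."""
    seg_len = max(1, int(round(fs * segment_ms / 1000.0)))
    out = np.array(signal, dtype=float)
    for start in range(0, len(out), seg_len):
        # The last segment may be shorter than seg_len; it is reversed as is.
        out[start:start + seg_len] = out[start:start + seg_len][::-1].copy()
    return out


def ncm_from_envelopes(clean_env, proc_env, weights=None):
    """NCM-style index from band envelopes of shape (bands, frames).

    Assumed formulation, not taken from the article:
    r^2 per band -> apparent SNR in dB -> limit to [-15, 15] -> TI in [0, 1].
    """
    clean_env = np.asarray(clean_env, dtype=float)
    proc_env = np.asarray(proc_env, dtype=float)

    # Remove the mean of each band envelope.
    x = clean_env - clean_env.mean(axis=1, keepdims=True)
    y = proc_env - proc_env.mean(axis=1, keepdims=True)

    # Squared normalized covariance between clean and processed envelopes.
    num = (x * y).sum(axis=1) ** 2
    den = (x ** 2).sum(axis=1) * (y ** 2).sum(axis=1)
    r2 = num / np.maximum(den, 1e-20)
    r2 = np.clip(r2, 1e-12, 1.0 - 1e-12)  # keep the logarithm finite

    # Apparent SNR per band, limited to the assumed +/-15 dB range.
    snr_db = np.clip(10.0 * np.log10(r2 / (1.0 - r2)), -15.0, 15.0)
    ti = (snr_db + 15.0) / 30.0

    # Uniform band weights are an assumption when none are given.
    if weights is None:
        weights = np.ones(ti.shape[0])
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * ti) / np.sum(weights))
```

Under these assumptions, identical clean and processed envelopes give an index of 1, and envelopes with zero covariance give 0. Applying `reverse_segments` twice with the same segment duration returns the original signal.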

Author information

Authors and Affiliations

Department of Electrical Engineering, The University of Texas at Dallas, Richardson, TX, 75083, USA
Fei Chen

Corresponding author

Correspondence to Fei Chen.

About this article

Cite this article

Chen, F. Intelligibility prediction for distorted sentences by the normalized covariance measure. Int J Speech Technol 14, 237–243 (2011). https://doi.org/10.1007/s10772-011-9099-z

Received: 02 June 2011
Accepted: 15 July 2011
Published: 04 August 2011
Issue Date: September 2011
DOI: https://doi.org/10.1007/s10772-011-9099-z
