Speech Quality Assessment

Loizou, Philipos C.

doi:10.1007/978-3-642-19551-8_23

Part of the book series: Studies in Computational Intelligence (SCI, volume 346)

Abstract

This chapter provides an overview of the methods and techniques used to assess speech quality. It summarizes some of the most commonly used listening tests designed to obtain reliable ratings of the quality of processed speech from human listeners. Considerations for conducting successful subjective listening tests are given, along with cautions that need to be exercised. While listening tests are considered the gold standard for assessing speech quality, they can be costly and time-consuming. For that reason, much research effort has been devoted to devising objective measures that correlate highly with subjective rating scores. An overview of some of the most commonly used objective measures is provided, along with a discussion of how well they correlate with subjective listening tests.

Author information

Authors and Affiliations

Department of Electrical Engineering, University of Texas-Dallas, Richardson, TX, USA
Philipos C. Loizou

Editor information

Editors and Affiliations

School of Computer Engineering, Nanyang Technological University, 639798, Singapore
Weisi Lin & Dacheng Tao
Intelligent Systems Laboratory, Systems Research Institute, Polish Academy of Sciences, Poland
Janusz Kacprzyk
Department of Computing, Hong Kong Polytechnic University, Hung Hom, Hong Kong
Zhu Li
School of Electronic Engineering and Computer Science, Queen Mary, University of London, London, U.K.
Ebroul Izquierdo
TCL-Thomson Electronics, Santa Clara, California
Haohong Wang

Cite this chapter

Loizou, P.C. (2011). Speech Quality Assessment. In: Lin, W., Tao, D., Kacprzyk, J., Li, Z., Izquierdo, E., Wang, H. (eds) Multimedia Analysis, Processing and Communications. Studies in Computational Intelligence, vol 346. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19551-8_23

DOI: https://doi.org/10.1007/978-3-642-19551-8_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19550-1
Online ISBN: 978-3-642-19551-8
eBook Packages: Engineering, Engineering (R0)
