Part of the book series: Studies in Computational Intelligence ((SCI,volume 346))

Abstract

This chapter provides an overview of the methods and techniques used to assess speech quality. It summarizes some of the most commonly used listening tests, which are designed to obtain reliable ratings of the quality of processed speech from human listeners. Considerations for conducting successful subjective listening tests are given, along with cautions that need to be exercised. Although listening tests are considered the gold standard for assessing speech quality, they can be costly and time-consuming. For that reason, much research effort has been devoted to devising objective measures that correlate highly with subjective rating scores. An overview of some of the most commonly used objective measures is provided, along with a discussion of how well they correlate with subjective listening tests.
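
As a rough illustration of what "correlating with subjective rating scores" means in practice, the sketch below computes a Pearson correlation coefficient between per-condition scores from an objective measure and the mean ratings from a listening test. It is a minimal example and not the chapter's own procedure: the function name `pearson_correlation` and both score lists are hypothetical, and the numbers are invented purely for illustration.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data (NOT from the chapter): one value per test condition.
subjective_ratings = [1.8, 2.4, 3.1, 3.6, 4.2]  # mean listener ratings
objective_scores = [1.5, 2.2, 2.9, 3.9, 4.0]    # output of some objective measure

r = pearson_correlation(objective_scores, subjective_ratings)
print(f"Pearson correlation: {r:.3f}")
```

A coefficient close to 1 would suggest that the objective measure ranks and spaces the test conditions much as the listeners did. On its own it says nothing about prediction error, which would need to be reported separately.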

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Loizou, P.C. (2011). Speech Quality Assessment. In: Lin, W., Tao, D., Kacprzyk, J., Li, Z., Izquierdo, E., Wang, H. (eds) Multimedia Analysis, Processing and Communications. Studies in Computational Intelligence, vol 346. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19551-8_23

  • DOI: https://doi.org/10.1007/978-3-642-19551-8_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19550-1

  • Online ISBN: 978-3-642-19551-8

  • eBook Packages: Engineering, Engineering (R0)
