Estimation of Speech Intelligibility and Quality

  • Chapter
Handbook of Signal Processing in Acoustics

Speech communication requires a talker and a listener. Acoustical and, in some cases, electrical representations of the speech are carried from the talker to the listener by some system. This system might consist of the air in a room, or it might involve electro-acoustic transducers and sound reinforcement or telecommunications equipment. Interfering noises (including reverberation of speech) may be present, and these may impinge upon and affect the talker, the system, and the listener. A schematic representation of this basic unidirectional speech communication scenario is given in Figure 1.




Copyright information

© 2008 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Voran, S. (2008). Estimation of Speech Intelligibility and Quality. In: Havelock, D., Kuwano, S., Vorländer, M. (eds) Handbook of Signal Processing in Acoustics. Springer, New York, NY. https://doi.org/10.1007/978-0-387-30441-0_28
