Speech communication requires a talker and a listener. Acoustical and, in some cases, electrical representations of the speech are carried from the talker to the listener by some system. This system might consist of the air in a room, or it might involve electro-acoustic transducers and sound reinforcement or telecommunications equipment. Interfering noises (including reverberation of the speech itself) may be present, and these may affect the talker, the system, and the listener. A schematic representation of this basic unidirectional speech communication scenario is given in Figure 1.
© 2008 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Voran, S. (2008). Estimation of Speech Intelligibility and Quality. In: Havelock, D., Kuwano, S., Vorländer, M. (eds) Handbook of Signal Processing in Acoustics. Springer, New York, NY. https://doi.org/10.1007/978-0-387-30441-0_28
DOI: https://doi.org/10.1007/978-0-387-30441-0_28
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-77698-9
Online ISBN: 978-0-387-30441-0
eBook Packages: Physics and Astronomy (R0)