Estimation of Speech Intelligibility and Quality

Voran, Stephen

doi:10.1007/978-0-387-30441-0_28

Speech communication requires a talker and a listener. Acoustical and, in some cases, electrical representations of the speech are carried from the talker to the listener by some system. This system might consist of the air in a room, or it might involve electro-acoustic transducers and sound reinforcement or telecommunications equipment. Interfering noises (including reverberation of the speech) may be present, and these may affect the talker, the system, and the listener. A schematic representation of this basic unidirectional speech communication scenario is given in Figure 1.

Author information

Authors and Affiliations

Institute for Telecommunication Sciences, Boulder, CO, USA
Stephen Voran

Editor information

Editors and Affiliations

National Research Council Institute for Microstructural Sciences, Acoustics and Signal Processing Group, 1200 Montreal Road, Ottawa, ON K1A 0R6, Canada
David Havelock
Department of Environmental Psychology, Osaka University Graduate School of Human Sciences, 1-2 Yamadaoka, Suita, Osaka, Japan
Sonoko Kuwano
Institute of Technical Acoustics, RWTH Aachen University, Aachen, Germany
Michael Vorländer

Cite this chapter

Voran, S. (2008). Estimation of Speech Intelligibility and Quality. In: Havelock, D., Kuwano, S., Vorländer, M. (eds) Handbook of Signal Processing in Acoustics. Springer, New York, NY. https://doi.org/10.1007/978-0-387-30441-0_28

DOI: https://doi.org/10.1007/978-0-387-30441-0_28
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-77698-9
Online ISBN: 978-0-387-30441-0
eBook Packages: Physics and Astronomy (R0)
