Quality Evaluation of Speech Processing Systems

  • Herman J. M. Steeneken
Part of The Kluwer International Series in Engineering and Computer Science book series (SECS, volume 155)


This chapter gives an overview of assessment methods for speech communication systems, speech synthesis systems, and speech recognition systems. The first two are evaluated in terms of intelligibility, and several subjective and objective intelligibility measures are discussed.

Evaluating speech recognizers requires a different approach, because the recognition rate normally depends on recognizer-specific parameters and external factors. Some results obtained with assessment methods for recognition systems are discussed.

Case studies are given for each group of systems.


Keywords: Mean Opinion Score; Test Word; Speech Recognition System; Speech Intelligibility; Natural Speech
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Science+Business Media New York 1992

Authors and Affiliations

  • Herman J. M. Steeneken (1)
  1. TNO-Institute for Perception, Soesterberg, The Netherlands
