Quality Evaluation of Speech Processing Systems
This chapter gives an overview of assessment methods for speech communication systems, speech synthesis systems, and speech recognition systems. The first two types of system are evaluated in terms of intelligibility, for which several subjective and objective measures are discussed.
Evaluation of speech recognizers requires a different approach, because the recognition rate normally depends on recognizer-specific parameters and external factors. Some results obtained with assessment methods for recognition systems are discussed.
Case studies are given for each group of systems.
Keywords: Mean Opinion Score; Test Word; Speech Recognition System; Speech Intelligibility; Natural Speech