Overview of Speech Recognition in the ‘SPICOS’ System
This paper describes a recognition technique used in the ‘SPICOS’ project. It is based on an integrated approach that combines the various knowledge sources, such as the inventory of subword units, the pronunciation lexicon, and the language model, during decision making in order to improve the reliability of the acoustic recognition. The recognition decision amounts to a search through a large state space with delayed decisions. The speaker-dependent recognition tests are performed on a speech database comprising 3 sessions from each of 5 speakers. Each session consists of 200 sentences, amounting to 1391 word samples.
Keywords: Language Model, Knowledge Source, Word Sequence, Word Error Rate, Large State Space
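The abstract describes recognition as a search through a large state space with delayed decisions. The following is a minimal, generic sketch of that idea: a time-synchronous Viterbi search with beam pruning, where only backpointers are stored per frame and the state sequence is recovered by traceback at the end. It is not the SPICOS implementation. The function name `beam_search`, the `beam` parameter, and the two-state toy model with its probabilities are assumptions made purely for illustration.

```python
import math


def beam_search(observations, log_init, log_trans, log_emit, beam=10.0):
    """Time-synchronous Viterbi search with beam pruning (illustrative only).

    Decisions are delayed: each frame stores backpointers, and the best
    state sequence is read off by traceback after the last frame.
    """
    neg_inf = -math.inf
    # Active hypotheses: state -> best log score of any path ending there.
    active = {
        s: lp + log_emit[s].get(observations[0], neg_inf)
        for s, lp in log_init.items()
    }
    best = max(active.values())
    active = {s: v for s, v in active.items() if v >= best - beam}
    backpointers = []

    for obs in observations[1:]:
        new_active, bp = {}, {}
        for prev, score in active.items():
            for nxt, lt in log_trans.get(prev, {}).items():
                cand = score + lt + log_emit[nxt].get(obs, neg_inf)
                if cand > new_active.get(nxt, neg_inf):
                    new_active[nxt] = cand
                    bp[nxt] = prev
        if not new_active:
            raise ValueError("all hypotheses were pruned or are impossible")
        # Beam pruning: keep hypotheses within `beam` of the frame's best.
        best = max(new_active.values())
        active = {s: v for s, v in new_active.items() if v >= best - beam}
        backpointers.append(bp)

    # Delayed decision: trace back from the best final state.
    state = max(active, key=active.get)
    final_score = active[state]
    path = [state]
    for bp in reversed(backpointers):
        state = bp[state]
        path.append(state)
    path.reverse()
    return path, final_score


# Hypothetical two-state toy model (all numbers invented for this example).
log = math.log
log_init = {"a": log(0.6), "b": log(0.4)}
log_trans = {
    "a": {"a": log(0.7), "b": log(0.3)},
    "b": {"a": log(0.4), "b": log(0.6)},
}
log_emit = {
    "a": {"x": log(0.9), "y": log(0.1)},
    "b": {"x": log(0.2), "y": log(0.8)},
}

path, score = beam_search(["x", "x", "y", "y"], log_init, log_trans, log_emit)
print(path, score)
```

In a full recognizer the state space would be built from the knowledge sources listed in the abstract (subword units, pronunciation lexicon, language model) rather than from a hand-written table; the sketch only shows the pruning and traceback mechanics.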