Speech Recognition Using GFIKS
In this chapter, we demonstrate how a statistical speech recognition system may incorporate additional knowledge sources by using GFIKS at two levels: the HMM state level and the HMM phonetic-unit level. We also present experimental results for incorporating various knowledge sources, including environmental variability (i.e., background noise information), speaker variability (i.e., accent and gender information), and contextual variability (i.e., wide-phonetic information). These knowledge sources may be incorporated one type at a time or as a combination of different types.
We describe some common considerations in using GFIKS at the HMM state level in Section 4.1 and at the HMM phonetic-unit level in Section 4.2. These considerations include defining causal relationships between information sources, inference, training issues, and enhancing model reliability. Then, in Section 4.3, we describe an experimental evaluation in which the proposed GFIKS is applied to the task of incorporating various knowledge sources. Finally, in Section 4.4, we summarize the experiments and compare the different levels of incorporation.
Keywords: Speech Recognition, Knowledge Source, Word Error Rate, Junction Tree, Pronunciation Dictionary