A Hierarchical Lexical Representation for Pronunciation Generation
We propose a unified framework for integrating a variety of linguistic knowledge sources for representing the English word, to facilitate their concurrent utilization in language applications. Our hierarchical lexical representation encompasses information such as morphology, stress, syllabification, phonemics and graphemics. Each occupies a distinct stratum in the hierarchy, and the constraints they provide are applied in parallel during generation via a probabilistic parsing paradigm. The merits of the proposed methodology have been demonstrated on the test bed of bi-directional spelling-to-pronunciation/pronunciation-to-spelling generation. This chapter focuses on the former task. Training and testing corpora are derived from the high-frequency portion of the Brown corpus (10,000 words), augmented with markers indicating stress and word morphology. The system was evaluated on an unseen test set, and achieved a parse coverage of 94%, with a word accuracy of 71.8% and a phoneme accuracy of 92.5% using a set of 52 phonemes. We have also conducted experiments to assess empirically (a) the relative contribution of each linguistic layer towards generation accuracy, and (b) the relative merits of the overall hierarchical design. We believe that our formalism will be especially applicable to augmenting the vocabulary of existing speech recognition and synthesis systems.
Keywords: Parse Tree, Generation Accuracy, Linguistic Knowledge, Letter Sequence, Partial Theory