TreeTalk: Memory-Based Word Phonemisation
We propose a memory-based (similarity-based) approach to learning the mapping of words into phonetic representations for use in speech synthesis systems. The main advantage of memory-based data-driven techniques is their high accuracy; the main disadvantage is processing speed. We introduce a hybrid between memory-based and decision-tree-based learning (TRIBL) which optimises the trade-off between efficiency and accuracy. TRIBL was used in TreeTalk, a methodology for fast engineering of word-to-pronunciation conversion systems. We also show that, for English, a single TRIBL classifier trained on predicting phonetic transcription and word stress at the same time performs better than a ‘modular’ approach in which different classifiers corresponding to linguistically relevant representations such as morphological and syllable structure are separately trained and integrated.
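To make the idea of a hybrid between decision-tree and memory-based learning concrete, the following is a minimal sketch under stated assumptions, not the TRIBL implementation itself. It assumes a feature ordering is already given (in practice it would come from a relevance measure such as information gain), uses the first `depth` features for exact-match indexing, and applies an unweighted overlap count to the remaining features. The class name `HybridClassifier`, the helpers `windows` and `transcribe`, the toy training words, and the label symbols are all hypothetical; the labels are placeholders rather than real phonetic transcriptions. A fuller version could predict a combined phoneme-plus-stress label per letter, in the spirit of the single-classifier setup described above.

```python
from collections import Counter, defaultdict


def windows(word, width=1, pad="_"):
    """Yield one fixed-width letter window per letter of `word`."""
    padded = pad * width + word + pad * width
    for i in range(len(word)):
        yield tuple(padded[i:i + 2 * width + 1])


class HybridClassifier:
    """Toy tree-plus-memory hybrid (illustrative; not the authors' code)."""

    def __init__(self, order, depth):
        self.order = order    # feature indices, most relevant first (assumed given)
        self.depth = depth    # how many features are handled by exact-match indexing
        self.buckets = defaultdict(list)
        self.default = Counter()

    def _key(self, features):
        return tuple(features[i] for i in self.order[:self.depth])

    def fit(self, instances):
        for features, label in instances:
            self.buckets[self._key(features)].append((features, label))
            self.default[label] += 1
        return self

    def predict(self, features):
        candidates = self.buckets.get(self._key(features))
        if not candidates:
            # Assumed fallback: the most frequent label in the training data.
            return self.default.most_common(1)[0][0]
        rest = self.order[self.depth:]

        def overlap(stored):
            # Count matching values among the features not used for indexing.
            return sum(stored[i] == features[i] for i in rest)

        best = max(overlap(f) for f, _ in candidates)
        votes = Counter(lab for f, lab in candidates if overlap(f) == best)
        return votes.most_common(1)[0][0]


def transcribe(clf, word):
    """Predict one label per letter of `word`."""
    return [clf.predict(w) for w in windows(word)]


# Hypothetical training data: one placeholder label per letter.
lexicon = {"cat": ["K", "A", "T"], "city": ["S", "I", "T", "I"]}
instances = [
    (w, lab)
    for word, labels in lexicon.items()
    for w, lab in zip(windows(word), labels)
]

# Focus letter (index 1) is indexed exactly; left and right context are compared by overlap.
clf = HybridClassifier(order=[1, 0, 2], depth=1).fit(instances)
print(transcribe(clf, "cit"))
```

With `depth=0` the sketch reduces to a plain nearest-neighbour search over all stored instances, and with `depth` equal to the number of features it becomes pure exact-match lookup; intermediate values trade search cost against how much of the similarity comparison is retained.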
Keywords: Stress Marker, Stress Assignment, Generalisation Accuracy, Syllable Boundary, Phonetic Transcription