Treetalk: Memory-Based Word Phonemisation

  • Walter Daelemans
  • Antal van den Bosch
Part of the Telecommunications Technology & Applications Series book series (TTAP)


We propose a memory-based (similarity-based) approach to learning the mapping of words into phonetic representations for use in speech synthesis systems. The main advantage of memory-based data-driven techniques is their high accuracy; the main disadvantage is processing speed. We introduce a hybrid between memory-based and decision-tree-based learning (TRIBL) which optimises the trade-off between efficiency and accuracy. TRIBL was used in TreeTalk, a methodology for fast engineering of word-to-pronunciation conversion systems. We also show that, for English, a single TRIBL classifier trained on predicting phonetic transcription and word stress at the same time performs better than a ‘modular’ approach in which different classifiers corresponding to linguistically relevant representations such as morphological and syllable structure are separately trained and integrated.
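
As a rough illustration of the memory-based idea described above, the following minimal sketch stores every letter window from a tiny aligned lexicon and labels a new window with the phoneme of its most similar stored windows, using an unweighted overlap count. The window width, the padding symbol, the toy lexicon with its phoneme symbols, and all function names are assumptions made for this example, not details taken from the chapter. The sketch does not implement TRIBL or stress prediction.

```python
from collections import Counter

def windows(word, width=3):
    """Return one fixed-width letter window per letter, padded with '_'."""
    pad = "_" * width
    padded = pad + word + pad
    return [tuple(padded[i:i + 2 * width + 1]) for i in range(len(word))]

def train(lexicon, width=3):
    """Memory-based 'training': keep every (window, phoneme) instance."""
    memory = []
    for word, phonemes in lexicon:
        if len(word) != len(phonemes):
            raise ValueError("letters and phonemes must be aligned one-to-one")
        memory.extend(zip(windows(word, width), phonemes))
    return memory

def overlap(a, b):
    """Similarity: number of window positions holding the same letter."""
    return sum(x == y for x, y in zip(a, b))

def classify(memory, window):
    """Majority phoneme among the stored windows with the highest overlap."""
    best = max(overlap(window, stored) for stored, _ in memory)
    votes = Counter(ph for stored, ph in memory
                    if overlap(window, stored) == best)
    return votes.most_common(1)[0][0]

def phonemise(memory, word, width=3):
    """Classify each letter of the word in turn."""
    return [classify(memory, w) for w in windows(word, width)]

# Hypothetical toy lexicon with one phoneme symbol per letter.
TOY_LEXICON = [
    ("cat", ["k", "a", "t"]),
    ("cab", ["k", "a", "b"]),
    ("bat", ["b", "a", "t"]),
]

memory = train(TOY_LEXICON)
result = phonemise(memory, "cat")
```

Every classification here scans the whole instance memory, which is the speed cost the abstract refers to; a hybrid such as TRIBL aims to reduce that cost while keeping accuracy.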


Keywords

Stress Marker, Stress Assignment, Generalisation Accuracy, Syllable Boundary, Phonetic Transcription



Copyright information

© Springer Science+Business Media Dordrecht 2001

Authors and Affiliations

  • Walter Daelemans (1)
  • Antal van den Bosch (2)
  1. Center for Dutch Language and Speech, University of Antwerp, Belgium
  2. ILK/Computational Linguistics, Tilburg University, The Netherlands
