Abstract
From a multilingual perspective, grapheme-to-phoneme conversion is a challenging task, fraught with most of the classical NLP “vexed questions”: the bottleneck problem of data acquisition, the pervasiveness of exceptions, the difficulty of stating the range and order of rule application, the proper treatment of context-sensitive phenomena and long-distance dependencies, and so on. The hand-crafting of transcription rules by a human expert is onerous and time-consuming, and yet, for some European languages, it still stops short of a level of correctness and accuracy acceptable for practical applications. We illustrate here a self-learning multilingual system for analogy-based pronunciation, which was tested on Italian, English and French, and whose performance is assessed against the output of both statistical and rule-based transcribers. The general point is made that analogy-based self-learning techniques are no longer just psycholinguistically plausible models but competitive tools, combining the advantages of language-independent, self-learning, tractable algorithms with the welcome bonus of being more reliable for applications than traditional text-to-speech systems.
This paper is the outcome of a cooperative effort. However, for the specific concern of the Italian Academy only, S. Federici is responsible for sections 6 and 7, V. Pirrelli for sections 1, 2 and 3, and F. Yvon for sections 4 and 5.
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Federici, S., Pirrelli, V., Yvon, F. (1996). A dynamic approach to paradigm-driven analogy. In: Wermter, S., Riloff, E., Scheler, G. (eds) Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing. IJCAI 1995. Lecture Notes in Computer Science, vol 1040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60925-3_61
DOI: https://doi.org/10.1007/3-540-60925-3_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60925-4
Online ISBN: 978-3-540-49738-7
eBook Packages: Springer Book Archive