Abstract
Chapters Five and Six will highlight the need for some syntactic processing in a high-quality text-to-speech system. Being able to reduce a given sentence to something like the sequence of its parts of speech, and to describe it further as a syntax tree that reveals its internal structure, is required for at least two reasons:
- Accurate phonetic transcription can be achieved only if the part of speech category of some words is available, and may depend on knowing the dependency relationship between successive words.
- Natural prosody heavily relies on syntax. It also obviously has a lot to do with semantics and pragmatics, but since very little data is currently available on the generative aspects of this dependence, TTS systems merely concentrate on syntax. Yet, as we shall see, few of them are actually provided with full disambiguation and structuring capabilities.
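To make the first point concrete, here is a minimal sketch of how a part of speech can select between two pronunciations of the same spelling. It is not the method described in the chapter: the lexicon `HOMOGRAPHS`, the function `transcribe`, the tag names `NOUN` and `VERB`, and the transcriptions themselves are all illustrative assumptions.

```python
# Hypothetical toy lexicon: (word, part of speech) -> phonetic transcription.
# Entries and tag names are illustrative assumptions, not data from the chapter.
HOMOGRAPHS = {
    ("record", "NOUN"): "ˈrɛkɔːd",
    ("record", "VERB"): "rɪˈkɔːd",
    ("lives", "NOUN"): "laɪvz",
    ("lives", "VERB"): "lɪvz",
}


def transcribe(word, pos, default=None):
    """Return the transcription for a (word, POS) pair, or `default` if unknown."""
    return HOMOGRAPHS.get((word.lower(), pos), default)


if __name__ == "__main__":
    # The same spelling maps to different transcriptions depending on the tag.
    print(transcribe("record", "NOUN"))
    print(transcribe("record", "VERB"))
```

Without the tag, the spelling alone cannot choose between the two entries, which is why some form of part-of-speech disambiguation has to come before transcription.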
‘Twas brillig, and the slithy toves / Did gyre and gimble in the wabe: / All mimsy were the borogoves, / And the mome raths outgrabe.
Lewis Carroll, Jabberwocky
Copyright information
© 1997 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Dutoit, T. (1997). Morpho-Syntactic Analysis. In: An Introduction to Text-to-Speech Synthesis. Text, Speech and Language Technology, vol 3. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-5730-8_4
DOI: https://doi.org/10.1007/978-94-011-5730-8_4
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-0369-1
Online ISBN: 978-94-011-5730-8
eBook Packages: Springer Book Archive