Duration Study for the Bell Laboratories Mandarin Text-to-Speech System

Shih, Chilin; Ao, Benjamin

doi:10.1007/978-1-4612-1894-4_31


Abstract

We present in this chapter the methodology and results of a duration study designed for the Mandarin Chinese text-to-speech system of Bell Laboratories. A greedy algorithm is used to select text from on-line corpora to maximize the coverage of factors that are important to the study of duration. The duration model and some interesting results are discussed.
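
The abstract states only that a greedy algorithm selects text to maximize coverage of duration-relevant factors; it does not give the procedure. As a rough illustration of that general idea, here is a minimal sketch of greedy coverage selection. It assumes each candidate sentence has already been reduced to a set of factor combinations. The function name `greedy_select`, the `max_sentences` parameter, and the (segment, tone, position) tuples are hypothetical and are not taken from the chapter.

```python
from typing import Dict, Hashable, List, Optional, Set


def greedy_select(
    candidates: Dict[str, Set[Hashable]],
    max_sentences: Optional[int] = None,
) -> List[str]:
    """Repeatedly pick the sentence that covers the most still-uncovered
    factor combinations, until nothing new can be covered."""
    # Everything that any candidate sentence could cover.
    uncovered: Set[Hashable] = set().union(*candidates.values())
    remaining = dict(candidates)
    chosen: List[str] = []

    while uncovered and remaining:
        if max_sentences is not None and len(chosen) >= max_sentences:
            break
        # Ties go to the sentence that appears first in the input dict.
        best = max(remaining, key=lambda s: len(remaining[s] & uncovered))
        gain = remaining[best] & uncovered
        if not gain:
            break  # no remaining sentence adds coverage
        chosen.append(best)
        uncovered -= gain
        del remaining[best]

    return chosen


# Hypothetical toy corpus: sentence id -> factor combinations it contains.
corpus = {
    "s1": {("a", "T1", "final"), ("i", "T2", "medial")},
    "s2": {("a", "T1", "final")},
    "s3": {("u", "T4", "initial"), ("i", "T2", "medial"), ("a", "T3", "medial")},
}
selected = greedy_select(corpus)
print(selected)
```

On the toy corpus, `s3` is picked first because it covers three combinations, and `s1` then covers the one that is left. Greedy selection of this kind does not guarantee the smallest possible set of sentences, but it is simple and scales to large corpora.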




Editor information

Editors and Affiliations

Jan P. H. van Santen: Bell Laboratories Room 2D-452, 600 Mountain Avenue, Murray Hill, NJ, 07974-0636, USA
Joseph P. Olive: Bell Laboratories Room 2D-447, 600 Mountain Avenue, Murray Hill, NJ, 07974-0636, USA
Richard W. Sproat: Bell Laboratories Room 2D-451, 600 Mountain Avenue, Murray Hill, NJ, 07974-0636, USA
Julia Hirschberg: AT&T Research Room 2C-409, 600 Mountain Avenue, Murray Hill, NJ, 07974-0636, USA


About this chapter

Cite this chapter

Shih, C., Ao, B. (1997). Duration Study for the Bell Laboratories Mandarin Text-to-Speech System. In: van Santen, J.P.H., Olive, J.P., Sproat, R.W., Hirschberg, J. (eds) Progress in Speech Synthesis. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-1894-4_31


DOI: https://doi.org/10.1007/978-1-4612-1894-4_31
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4612-7328-8
Online ISBN: 978-1-4612-1894-4
eBook Packages: Springer Book Archive
