Duration Study for the Bell Laboratories Mandarin Text-to-Speech System

  • Chapter
Progress in Speech Synthesis

Abstract

We present in this chapter the methodology and results of a duration study designed for the Mandarin Chinese text-to-speech system of Bell Laboratories. A greedy algorithm is used to select text from on-line corpora to maximize the coverage of factors that are important to the study of duration. The duration model and some interesting results are discussed.
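
The abstract does not spell out the selection procedure, so the following is only a minimal sketch of greedy coverage-based text selection in general, not the authors' implementation. It assumes each candidate sentence has already been annotated with the set of factor combinations it contains. The tuples of segment, tone, and position, the sentence ids, and the name `greedy_select` are all hypothetical.

```python
from typing import Dict, Hashable, List, Set


def greedy_select(candidates: Dict[str, Set[Hashable]], max_items: int) -> List[str]:
    """Pick up to max_items sentences, each time taking the one that adds
    the most not-yet-covered factor combinations."""
    covered: Set[Hashable] = set()
    chosen: List[str] = []
    remaining = dict(candidates)
    while remaining and len(chosen) < max_items:
        # Sorting the keys makes ties resolve to the smallest sentence id.
        best = max(sorted(remaining), key=lambda k: len(remaining[k] - covered))
        gain = remaining[best] - covered
        if not gain:
            break  # no remaining sentence adds new coverage
        chosen.append(best)
        covered |= gain
        del remaining[best]
    return chosen


# Hypothetical toy data: sentence id -> factor combinations it contains.
CANDIDATES = {
    "s1": {("a", "T1", "final"), ("i", "T2", "medial")},
    "s2": {("a", "T1", "final")},
    "s3": {("u", "T4", "initial"), ("i", "T2", "medial"), ("a", "T3", "medial")},
}

selected = greedy_select(CANDIDATES, max_items=3)
print(selected)
```

On this toy input the sketch picks `s3` first because it covers three combinations, then `s1` for the one combination still missing, and stops because `s2` adds nothing new. A greedy pass of this kind does not guarantee the smallest possible selection, but it is simple and tends to reach good coverage with few sentences.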

Copyright information

© 1997 Springer Science+Business Media New York

About this chapter

Cite this chapter

Shih, C., Ao, B. (1997). Duration Study for the Bell Laboratories Mandarin Text-to-Speech System. In: van Santen, J.P.H., Olive, J.P., Sproat, R.W., Hirschberg, J. (eds) Progress in Speech Synthesis. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-1894-4_31

  • DOI: https://doi.org/10.1007/978-1-4612-1894-4_31

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4612-7328-8

  • Online ISBN: 978-1-4612-1894-4

  • eBook Packages: Springer Book Archive
