Intonation Rules for Text Reading

Datta, Asoke Kumar

doi:10.1007/978-981-10-7016-7_5

Part of the book series: Signals and Communication Technology (SCT)

Abstract

Intonation is the cognitive aspect of the ensemble of pitch variations in the course of an utterance. This perceptual impression of speech melody correlates, to a first approximation, with changes in the fundamental frequency (F0) of the signal. This chapter presents a study of intonation patterns for text reading in Standard Colloquial Bengali, aimed at developing rules and appropriate methods for using them in a text-to-speech (TTS) synthesis system. In the model presented here, the pitch movements at the syllabic level are considered to be basic. Syllabic stylization uses the closest linear fit, obtained by linear regression, and the pitch movements are expressed in semitones per second. The sentence-level intonation pattern is the sequence of the word-level patterns constituting the sentence. This chapter also presents the statistical method for implementing the obtained rules in TTS. The model is tested by synthesizing several sentences, and the perceptual results are satisfactory.
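
The syllabic stylization step described in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names, the 100 Hz reference frequency, and the input format (voiced F0 samples in Hz with their time stamps in seconds for one syllable) are all assumptions made for illustration. The sketch converts F0 to semitones and fits a straight line by ordinary least squares, so the slope comes out in semitones per second.

```python
import math


def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert a frequency in Hz to semitones relative to ref_hz.

    ref_hz = 100 Hz is an arbitrary choice for this sketch; it shifts the
    intercept of the fitted line but leaves the slope unchanged.
    """
    return 12.0 * math.log2(f0_hz / ref_hz)


def stylize_syllable(times_s, f0_hz, ref_hz=100.0):
    """Fit a least-squares line to one syllable's pitch track.

    times_s: sample times in seconds (hypothetical input format).
    f0_hz:   voiced F0 values in Hz at those times.
    Returns (slope, intercept): slope in semitones per second,
    intercept in semitones at t = 0.
    """
    if len(times_s) != len(f0_hz) or len(times_s) < 2:
        raise ValueError("need at least two matching (time, F0) samples")

    semitones = [hz_to_semitones(f, ref_hz) for f in f0_hz]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_st = sum(semitones) / n

    # Ordinary least squares for a single predictor (time).
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0.0:
        raise ValueError("all samples share the same time stamp")
    sxy = sum((t - mean_t) * (s - mean_st) for t, s in zip(times_s, semitones))

    slope = sxy / sxx
    intercept = mean_st - slope * mean_t
    return slope, intercept


if __name__ == "__main__":
    # Synthetic syllable: F0 rises one octave (12 semitones) over 0.2 s.
    times = [0.00, 0.05, 0.10, 0.15, 0.20]
    f0 = [100.0 * 2.0 ** (t / 0.2) for t in times]
    slope, intercept = stylize_syllable(times, f0)
    print(f"slope = {slope:.2f} st/s, intercept = {intercept:.2f} st")
```

For the synthetic track in the usage block, a one-octave rise over 0.2 s corresponds to a slope of about 60 semitones per second. Working in semitones rather than Hz makes equal slopes correspond to equal musical intervals per unit time regardless of the speaker's pitch range, which is presumably why the abstract states pitch movements in those units.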

Author information

Authors and Affiliations

Society for Natural Language Technology Research (SNLTR), Kolkata, West Bengal, India
Asoke Kumar Datta

About this chapter

Cite this chapter

Datta, A.K. (2018). Intonation Rules for Text Reading. In: Epoch Synchronous Overlap Add (ESOLA). Signals and Communication Technology. Springer, Singapore. https://doi.org/10.1007/978-981-10-7016-7_5

DOI: https://doi.org/10.1007/978-981-10-7016-7_5
Published: 30 December 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7015-0
Online ISBN: 978-981-10-7016-7
eBook Packages: Engineering (R0)
