Prosodic Boundary Detection

Ostendorf, Mari

doi:10.1007/978-94-015-9413-4_10

Part of the book series: Text, Speech and Language Technology (TLTB, volume 14)

Abstract

Prosodic constituent structure, or the perceived grouping of words in speech, plays a role in human speech communication in virtually every language. Speakers use prosodic phrasing to contribute meaning to and sometimes disambiguate the sequence of words that comprise an utterance by highlighting its information structure. From a speech analysis perspective, prosodic phrase structure provides the link that seems to most effectively explain continuously varying acoustic correlates (pauses, F0 patterns, duration lengthening, etc.) in terms of the word sequence of an utterance (syntactic, semantic and discourse structure). Just as both speakers and listeners use prosodic phrases in human speech communication, so computational models of prosodic phrase structure can be useful both for communicating meaning in synthesized speech and for extracting meaning in automatic speech understanding. In fact, prosodic phrase structure is probably even more important for computer speech processing than for humans, because computers have a much less detailed semantic representation and less extensive knowledge of the world than humans and thus word sequences tend to be more often ambiguous in computer language processing than for human listeners.


Author information

Authors and Affiliations

Electrical Engineering Department, University of Washington, Seattle, WA, USA
Mari Ostendorf

Editor information

Editors and Affiliations

University of Lund, Sweden
Merle Horne

About this chapter

Cite this chapter

Ostendorf, M. (2000). Prosodic Boundary Detection. In: Horne, M. (ed.) Prosody: Theory and Experiment. Text, Speech and Language Technology, vol 14. Springer, Dordrecht. https://doi.org/10.1007/978-94-015-9413-4_10

DOI: https://doi.org/10.1007/978-94-015-9413-4_10
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-5562-0
Online ISBN: 978-94-015-9413-4
eBook Packages: Springer Book Archive
