A Segment Based Approach for Prosodic Boundary Detection

Warnke, Volker; Nöth, Elmar; Niemann, Heinrich; Stemmer, Georg

doi:10.1007/3-540-48239-3_36

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1692)

Included in the following conference series:

International Workshop on Text, Speech and Dialogue

Abstract

Successful detection of the position of prosodic phrase boundaries is useful for rescoring the sentence hypotheses in a speech recognition system. In addition, knowledge about prosodic boundaries may be used in a speech understanding system for disambiguation. In this paper, a segment-oriented approach to prosodic boundary detection is presented. In contrast to word-oriented methods (e.g. [6]), it has the advantage of being independent of the spoken word chain. This makes it possible to use the knowledge about the boundary positions to reduce the search space during word recognition. We have evaluated several different boundary detectors. For the two-class problem ‘boundary vs. no-boundary’ we achieved an average recognition rate of 77% and an overall recognition rate of up to 92%. On the spoken phoneme chain, an average recognition rate of 83% (total 92%) is possible.
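
The abstract reports two different figures for the same detector, an average and an overall recognition rate. The sketch below shows one common way such figures are computed. It assumes, without confirmation from the paper, that the overall rate is the fraction of all segments classified correctly and that the average rate is the unweighted mean of the per-class rates. The function name `recognition_rates`, the labels `"B"` and `"N"`, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def recognition_rates(reference, predicted):
    """Return (overall, class_average) recognition rates as fractions.

    Assumed definitions (not confirmed by the paper):
      overall       = correctly classified items / all items
      class_average = unweighted mean of the per-class recognition rates
    """
    if not reference or len(reference) != len(predicted):
        raise ValueError("label sequences must be non-empty and equally long")

    # Number of reference items per class.
    total = Counter(reference)
    # Number of correctly classified items per class.
    correct = Counter(r for r, p in zip(reference, predicted) if r == p)

    overall = sum(correct.values()) / len(reference)
    class_average = sum(correct[c] / total[c] for c in total) / len(total)
    return overall, class_average


# Hypothetical toy data: "B" = boundary, "N" = no boundary.
# Boundaries are rare, and one of the two is missed.
reference = ["N"] * 8 + ["B"] * 2
predicted = ["N"] * 8 + ["N", "B"]

overall, class_average = recognition_rates(reference, predicted)
print(f"overall: {overall:.2f}, class average: {class_average:.2f}")
```

Under these assumed definitions, a detector that handles the frequent 'no-boundary' class well but misses some of the rare boundaries scores higher on the overall rate than on the average rate, which is one possible reading of the gap between the two figures quoted above.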

This work was funded by the German Federal Ministry of Education, Science, Research and Technology (BMBF) in the framework of the Verbmobil Project under Grant 01 IV 102 H/0 and by the DFG (German Research Foundation) under contract number 810 939-9. The responsibility for the contents lies with the authors.

References

[1] J. Alexandersson, B. Buschbeck-Wolf, T. Fujinami, M. Kipp, S. Koch, E. Maier, N. Reithinger, B. Schmitz, and M. Siegel. Dialogue Acts in VERBMOBIL-2 - Second Edition. Verbmobil Report 226, 1998.
[2] L. R. Bahl, J. R. Bellegarda, P. V. de Souza, P. S. Gopalakrishnan, D. Nahamoo, and M. A. Picheny. A New Class of Fenonic Markov Word Models for Large Vocabulary Continuous Speech Recognition. Proceedings International Conference on Automatic Speech and Signal Processing, pages 177–180, 1991.
[3] A. Batliner, R. Kompe, A. Kießling, M. Mast, H. Niemann, and E. Nöth. M = Syntax + Prosody: A syntactic-prosodic labelling scheme for large spontaneous speech databases. Speech Communication, 25(4):193–222, 1998.
[4] F. Jelinek. Self-organized Language Modeling for Speech Recognition. In A. Waibel and K.-F. Lee, editors, Readings in Speech Recognition, pages 450–506. Morgan Kaufmann Publishers Inc., San Mateo, California, 1990.
[5] A. Kießling. Extraktion und Klassifikation prosodischer Merkmale in der automatischen Sprachverarbeitung. Berichte aus der Informatik. Shaker Verlag, Aachen, 1997.
[6] R. Kompe. Prosody in Speech Understanding Systems. Lecture Notes for Artificial Intelligence. Springer-Verlag, Berlin, 1997.
[7] E. G. Schukat-Talamazzini. Automatische Spracherkennung – Grundlagen, statistische Modelle und effiziente Algorithmen. Vieweg, Braunschweig, 1995.
[8] W. Wahlster, T. Bub, and A. Waibel. Verbmobil: The Combination of Deep and Shallow Processing for Spontaneous Speech Translation. In Proc. Int. Conf. on Acoustics, Speech and Signal Processing, volume 1, pages 71–74, München, 1997.

Author information

Authors and Affiliations

Lehrstuhl für Mustererkennung (Informatik 5), Universität Erlangen-Nürnberg, Martensstr. 3, D-91058, Erlangen, Germany
Volker Warnke, Elmar Nöth, Heinrich Niemann & Georg Stemmer

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Faculty of Applied Sciences, University of West Bohemia in Plzeň, Universitní 22, 306 14, Plzeň, Czech Republic
Václav Matousek, Pavel Mautner & Jana Ocelíková
Department of Programming Systems and Communication, Faculty of Informatics, Masaryk University Brno, Botanická 68a, 602 00, Brno, Czech Republic
Petr Sojka

About this paper

Cite this paper

Warnke, V., Nöth, E., Niemann, H., Stemmer, G. (1999). A Segment Based Approach for Prosodic Boundary Detection. In: Matousek, V., Mautner, P., Ocelíková, J., Sojka, P. (eds) Text, Speech and Dialogue. TSD 1999. Lecture Notes in Computer Science, vol 1692. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48239-3_36

DOI: https://doi.org/10.1007/3-540-48239-3_36
Published: 01 October 1999
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66494-9
Online ISBN: 978-3-540-48239-0
eBook Packages: Springer Book Archive