Abstract
Detecting the positions of prosodic phrase boundaries is useful for rescoring sentence hypotheses in a speech recognition system. In addition, knowledge about prosodic boundaries can be used for disambiguation in a speech understanding system. In this paper, a segment-oriented approach to prosodic boundary detection is presented. In contrast to word-oriented methods (e.g. [6]), it has the advantage of being independent of the spoken word chain. This makes it possible to use knowledge about the boundary positions to reduce the search space during word recognition. We have evaluated several different boundary detectors. For the two-class problem ‘boundary vs. no-boundary’ we achieved an average recognition rate of 77% and an overall recognition rate of up to 92%. On the spoken phoneme chain, an average recognition rate of 83% (overall 92%) is possible.
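The abstract reports two figures per experiment, an average and an overall recognition rate, without defining them. The sketch below shows one common reading, which is an assumption here: the average rate is the unweighted mean of the per-class rates, and the overall rate is the fraction of all segments classified correctly. The function name `recognition_rates` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter

def recognition_rates(reference, predicted):
    """Return (average, overall) recognition rates for two label sequences.

    Assumed definitions (not stated in the abstract):
      average = unweighted mean of the per-class recognition rates
      overall = fraction of all items classified correctly
    """
    if not reference or len(reference) != len(predicted):
        raise ValueError("label sequences must be non-empty and of equal length")
    totals = Counter(reference)  # number of items per reference class
    hits = Counter(r for r, p in zip(reference, predicted) if r == p)
    per_class = {c: hits[c] / totals[c] for c in totals}
    average = sum(per_class.values()) / len(per_class)
    overall = sum(hits.values()) / len(reference)
    return average, overall

# Hypothetical toy data: 'B' = boundary, 'N' = no boundary.
reference = ["N"] * 8 + ["B"] * 2
predicted = ["N"] * 8 + ["B", "N"]
average, overall = recognition_rates(reference, predicted)
print(f"average: {average:.2f}, overall: {overall:.2f}")
```

Because boundaries are much rarer than non-boundaries in this toy data, a detector that misses boundaries can still score a high overall rate. Under the assumed definitions, this may be why the two figures in the abstract differ.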
This work was funded by the German Federal Ministry of Education, Science, Research and Technology (BMBF) in the framework of the Verbmobil Project under Grant 01 IV 102 H/0 and by the DFG (German Research Foundation) under contract number 810 939-9. The responsibility for the contents lies with the authors.
References
[1] J. Alexandersson, B. Buschbeck-Wolf, T. Fujinami, M. Kipp, S. Koch, E. Maier, N. Reithinger, B. Schmitz, and M. Siegel. Dialogue Acts in VERBMOBIL-2-Second Edition. Verbmobil Report 226, 1998.
[2] L. R. Bahl, J. R. Bellegarda, P. V. de Souza, P. S. Gopalakrishnan, D. Nahamoo, and M. A. Picheny. A New Class of Fenonic Markov Word Models for Large Vocabulary Continuous Speech Recognition. Proceedings International Conference on Automatic Speech and Signal Processing, pages 177–180, 1991.
[3] A. Batliner, R. Kompe, A. Kießling, M. Mast, H. Niemann, and E. Nöth. M = Syntax + Prosody: A syntactic-prosodic labelling scheme for large spontaneous speech databases. Speech Communication, 25(4):193–222, 1998.
[4] F. Jelinek. Self-organized Language Modeling for Speech Recognition. In A. Waibel and K.-F. Lee, editors, Readings in Speech Recognition, pages 450–506. Morgan Kaufmann Publishers Inc., San Mateo, California, 1990.
[5] A. Kießling. Extraktion und Klassifikation prosodischer Merkmale in der automatischen Sprachverarbeitung. Berichte aus der Informatik. Shaker Verlag, Aachen, 1997.
[6] R. Kompe. Prosody in Speech Understanding Systems. Lecture Notes in Artificial Intelligence. Springer-Verlag, Berlin, 1997.
[7] E. G. Schukat-Talamazzini. Automatische Spracherkennung – Grundlagen, statistische Modelle und effiziente Algorithmen. Vieweg, Braunschweig, 1995.
[8] W. Wahlster, T. Bub, and A. Waibel. Verbmobil: The Combination of Deep and Shallow Processing for Spontaneous Speech Translation. In Proc. Int. Conf. on Acoustics, Speech and Signal Processing, volume 1, pages 71–74, München, 1997.
© 1999 Springer-Verlag Berlin Heidelberg
Cite this paper
Warnke, V., Nöth, E., Niemann, H., Stemmer, G. (1999). A Segment Based Approach for Prosodic Boundary Detection. In: Matousek, V., Mautner, P., Ocelíková, J., Sojka, P. (eds) Text, Speech and Dialogue. TSD 1999. Lecture Notes in Computer Science, vol 1692. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48239-3_36
DOI: https://doi.org/10.1007/3-540-48239-3_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66494-9
Online ISBN: 978-3-540-48239-0
eBook Packages: Springer Book Archive