Abstract
We describe the acoustic-prosodic and syntactic-prosodic annotation and classification of boundaries, accents, and sentence mood, as integrated in the Verbmobil system for three languages: German, English, and Japanese. The acoustic-prosodic classification uses a large feature vector of normalized prosodic features. A single multilingual prosody module was developed for the three languages; it reduces memory requirements considerably compared with three monolingual modules. Classification is performed with neural networks and statistical language models.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Batliner, A., Buckow, J., Niemann, H., Nöth, E., Warnke, V. (2000). The Prosody Module. In: Wahlster, W. (ed.) Verbmobil: Foundations of Speech-to-Speech Translation. Artificial Intelligence. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-04230-4_8
DOI: https://doi.org/10.1007/978-3-662-04230-4_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-08730-1
Online ISBN: 978-3-662-04230-4
eBook Packages: Springer Book Archive