Multistream Recognition of Dialogue Acts in Meetings

Dielmann, Alfred; Renals, Steve

doi:10.1007/11965152_16

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4299)

Included in the following conference series:

International Workshop on Machine Learning for Multimodal Interaction

Abstract

We propose a joint segmentation and classification approach for the dialogue act recognition task on natural multi-party meetings (ICSI Meeting Corpus). Five broad DA categories are automatically recognised using a generative Dynamic Bayesian Network based infrastructure. Prosodic features and a switching graphical model are used to estimate DA boundaries, in conjunction with a factored language model which is used to relate words and DA categories. This easily generalizable and extensible system promotes a rational approach to the joint DA segmentation and recognition task, and is capable of good recognition performance.

Author information

Authors and Affiliations

Centre for Speech Technology Research, University of Edinburgh, Edinburgh, EH8 9LW, UK
Alfred Dielmann & Steve Renals

Editor information

Editors and Affiliations

University of Edinburgh, Edinburgh, Scotland
Steve Renals
IDIAP Research Institute, Martigny, Switzerland
Samy Bengio
National Institute of Standards and Technology, 100 Bureau Drive Stop 8940, Gaithersburg, MD, 20899
Jonathan G. Fiscus

About this paper

Cite this paper

Dielmann, A., Renals, S. (2006). Multistream Recognition of Dialogue Acts in Meetings. In: Renals, S., Bengio, S., Fiscus, J.G. (eds) Machine Learning for Multimodal Interaction. MLMI 2006. Lecture Notes in Computer Science, vol 4299. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11965152_16

DOI: https://doi.org/10.1007/11965152_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69267-6
Online ISBN: 978-3-540-69268-3
eBook Packages: Computer Science, Computer Science (R0)
