Automatic Meeting Participant Role Detection by Dialogue Patterns

Su, Jing; Kane, Bridget; Luz, Saturnino

doi:10.1007/978-3-642-12397-9_27


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5967)


Abstract

We introduce the concept of a ‘Vocalization Horizon’ for automatic speaker role detection in general meeting recordings. We show that classification accuracy reaches 38.5% when the Vocalization Horizon is combined with other features (namely vocalization duration and start time). With a second type of horizon, the Pause-Overlap Horizon, classification accuracy reaches 39.5%. Pauses and overlaps are also useful vocalization features for meeting structure analysis. In our experiments, the Bayesian Network classifier outperforms the other classifiers tested, and we propose it for similar applications.
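As an illustration only, the pause and overlap features mentioned above could be derived from vocalization events described by a start time and a duration. The sketch below is not the authors' implementation: the `Vocalization` record, the `pauses_and_overlaps` function, and the rule that measures each gap against the latest end of all earlier speech are assumptions made for this example, and the abstract does not define how the horizons themselves are built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vocalization:
    """Hypothetical record for one vocalization event (times in seconds)."""
    speaker: str
    start: float
    duration: float

    @property
    def end(self) -> float:
        return self.start + self.duration


def pauses_and_overlaps(vocalizations):
    """Return (pauses, overlaps) as lists of durations in seconds.

    Assumption: a pause is silence between the end of all earlier speech
    and the next vocalization; an overlap is the time a vocalization
    shares with speech that started earlier.
    """
    ordered = sorted(vocalizations, key=lambda v: v.start)
    pauses, overlaps = [], []
    latest_end = None  # latest end time among vocalizations seen so far
    for v in ordered:
        if latest_end is not None:
            if v.start > latest_end:
                pauses.append(v.start - latest_end)
            elif v.start < latest_end:
                overlaps.append(min(v.end, latest_end) - v.start)
        latest_end = v.end if latest_end is None else max(latest_end, v.end)
    return pauses, overlaps


if __name__ == "__main__":
    meeting = [
        Vocalization("A", 0.0, 2.0),
        Vocalization("B", 2.5, 1.0),
        Vocalization("A", 3.0, 1.0),
    ]
    print(pauses_and_overlaps(meeting))
```

Features of this kind, together with duration and start time, would then be passed to a classifier; the choice and construction of the feature vector here is an assumption rather than something the abstract specifies.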



Author information

Authors and Affiliations

Department of Computer Science, O’Reilly Institute, Trinity College Dublin, Dublin 2, Ireland
Jing Su, Bridget Kane & Saturnino Luz


Editor information

Editors and Affiliations

Second University of Naples, and IIASS, Via Pellegrino, 84019, Vietri sul Mare, SA, Italy
Anna Esposito
Centre for Language and Communication Studies, Trinity College, The University of Dublin, Dublin 2, Ireland
Nick Campbell & Carl Vogel
Department of Computing Science & Mathematics, University of Stirling, FK9 4LA, Stirling, Scotland, UK
Amir Hussain
Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, P.O. Box 217, 7500 AE, Enschede, The Netherlands
Anton Nijholt


About this chapter

Cite this chapter

Su, J., Kane, B., Luz, S. (2010). Automatic Meeting Participant Role Detection by Dialogue Patterns. In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds) Development of Multimodal Interfaces: Active Listening and Synchrony. Lecture Notes in Computer Science, vol 5967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12397-9_27


DOI: https://doi.org/10.1007/978-3-642-12397-9_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12396-2
Online ISBN: 978-3-642-12397-9
eBook Packages: Computer Science, Computer Science (R0)
