Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5967)

Abstract

We introduce the concept of a ‘Vocalization Horizon’ for automatic speaker role detection in general meeting recordings. We demonstrate that classification accuracy reaches 38.5% when the Vocalization Horizon is combined with other features (i.e. vocalization duration and start time). With another type of horizon, the Pause-Overlap Horizon, classification accuracy reaches 39.5%. Pauses and overlaps are also useful vocalization features for meeting structure analysis. In our experiments, the Bayesian Network classifier outperforms the other classifiers tested, and we propose it for similar applications.
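As an illustration of the kind of vocalization features named in the abstract, the following is a minimal sketch of how duration, pause and overlap values might be derived from timed speech segments. The abstract does not define the horizons themselves, so this sketch does not attempt to reproduce them. The `Vocalization` record, the `gaps` helper and the sample timings are assumptions made for this example and are not taken from the chapter.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Vocalization:
    """Hypothetical record of one speech segment (times in seconds)."""
    speaker: str
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def gaps(vocs: List[Vocalization]) -> List[Tuple[float, float]]:
    """Return (pause, overlap) for each pair of consecutive vocalizations.

    Assumption: a positive gap between one segment's end and the next
    segment's start counts as a pause, a negative gap as an overlap.
    """
    ordered = sorted(vocs, key=lambda v: v.start)
    result = []
    for prev, nxt in zip(ordered, ordered[1:]):
        gap = nxt.start - prev.end
        result.append((max(gap, 0.0), max(-gap, 0.0)))
    return result


# Invented sample data: B starts 0.5 s before A finishes,
# then A resumes after 1.0 s of silence.
meeting = [
    Vocalization("A", 0.0, 4.0),
    Vocalization("B", 3.5, 6.0),
    Vocalization("A", 7.0, 9.0),
]
print(gaps(meeting))  # [(0.0, 0.5), (1.0, 0.0)]
```

Values of this kind, together with each segment's duration and start time, could then be passed to a classifier as input features.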

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Su, J., Kane, B., Luz, S. (2010). Automatic Meeting Participant Role Detection by Dialogue Patterns. In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds) Development of Multimodal Interfaces: Active Listening and Synchrony. Lecture Notes in Computer Science, vol 5967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12397-9_27

  • DOI: https://doi.org/10.1007/978-3-642-12397-9_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12396-2

  • Online ISBN: 978-3-642-12397-9

  • eBook Packages: Computer Science, Computer Science (R0)
