VACE Multimodal Meeting Corpus

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3869)

Abstract

In this paper, we report on the infrastructure we have developed to support our research on multimodal cues for understanding meetings. With our focus on multimodality, we investigate the interaction among speech, gesture, posture, and gaze in meetings. For this purpose, a high-quality multimodal corpus is being produced.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, L. et al. (2006). VACE Multimodal Meeting Corpus. In: Renals, S., Bengio, S. (eds) Machine Learning for Multimodal Interaction. MLMI 2005. Lecture Notes in Computer Science, vol 3869. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11677482_4

  • DOI: https://doi.org/10.1007/11677482_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-32549-9

  • Online ISBN: 978-3-540-32550-5

  • eBook Packages: Computer Science (R0)
