The AMI Meeting Corpus: A Pre-announcement

  • Conference paper
Machine Learning for Multimodal Interaction (MLMI 2005)

Abstract

The AMI Meeting Corpus is a multi-modal data set consisting of 100 hours of meeting recordings. It is being created in the context of a project that is developing meeting browsing technology and will eventually be released publicly. Some of the meetings it contains are naturally occurring, and some are elicited, particularly using a scenario in which the participants play different roles in a design team, taking a design project from kick-off to completion over the course of a day. The corpus is being recorded using a wide range of devices including close-talking and far-field microphones, individual and room-view video cameras, projection, a whiteboard, and individual pens, all of which produce output signals that are synchronized with each other. It is also being hand-annotated for many different phenomena, including orthographic transcription, discourse properties such as named entities and dialogue acts, summaries, emotions, and some head and hand gestures. We describe the data set, including the rationale behind using elicited material, and explain how the material is being recorded, transcribed and annotated.
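The annotations described above are time-aligned with the recordings. As a purely illustrative sketch (the names `Word` and `words_in_window` are hypothetical, not the corpus's actual API or file format), each transcribed word in such a corpus can be modelled as a speaker-attributed time span, which makes cross-modal queries like "what was said during this gesture?" straightforward:

```python
from dataclasses import dataclass

@dataclass
class Word:
    """One time-aligned transcript token (illustrative structure only)."""
    speaker: str
    start: float  # seconds from meeting start
    end: float
    text: str

def words_in_window(words, t0, t1):
    """Return the words whose spans overlap the time window [t0, t1)."""
    return [w for w in words if w.start < t1 and w.end > t0]

# A tiny invented transcript fragment, not real corpus data:
transcript = [
    Word("A", 0.00, 0.35, "okay"),
    Word("A", 0.40, 0.80, "so"),
    Word("B", 0.70, 1.10, "right"),
]

# Words overlapping the first half second of the meeting:
print([w.text for w in words_in_window(transcript, 0.0, 0.5)])
```

Because all signals share one clock, the same overlap query works against any other annotation layer (dialogue acts, gestures, whiteboard events) without per-modality conversion.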

This work was supported by the European Union 6th FWP IST Integrated Project AMI (Augmented Multi-party Interaction, FP6-506811).




Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Carletta, J. et al. (2006). The AMI Meeting Corpus: A Pre-announcement. In: Renals, S., Bengio, S. (eds) Machine Learning for Multimodal Interaction. MLMI 2005. Lecture Notes in Computer Science, vol 3869. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11677482_3

  • DOI: https://doi.org/10.1007/11677482_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-32549-9

  • Online ISBN: 978-3-540-32550-5
