The AMI Meeting Corpus: A Pre-announcement
The AMI Meeting Corpus is a multi-modal data set consisting of 100 hours of meeting recordings. It is being created in the context of a project that is developing meeting browsing technology and will eventually be released publicly. Some of the meetings it contains are naturally occurring, and some are elicited, particularly using a scenario in which the participants play different roles in a design team, taking a design project from kick-off to completion over the course of a day. The corpus is being recorded using a wide range of devices including close-talking and far-field microphones, individual and room-view video cameras, projection, a whiteboard, and individual pens, all of which produce output signals that are synchronized with each other. It is also being hand-annotated for many different phenomena, including orthographic transcription, discourse properties such as named entities and dialogue acts, summaries, emotions, and some head and hand gestures. We describe the data set, including the rationale behind using elicited material, and explain how the material is being recorded, transcribed and annotated.
Keywords: Design Team · Acoustic Model · Industrial Designer · Marketing Expert · Pronunciation Variation