Linguistic Resources for Meeting Speech Recognition

Glenn, Meghan Lammie; Strassel, Stephanie

doi:10.1007/11677482_33


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3869)

Included in the following conference series:

International Workshop on Machine Learning for Multimodal Interaction


Abstract

This paper describes efforts by the University of Pennsylvania’s Linguistic Data Consortium (LDC) to create and distribute shared linguistic resources (including data, annotations, tools and infrastructure) to support the Rich Transcription 2005 Spring Meeting Recognition Evaluation (RT-05S). In addition to distributing large volumes of training data, LDC produced reference transcripts for the RT-05S conference room evaluation corpus, which represents a variety of subjects, scenarios and recording conditions. Careful verbatim reference transcripts with rich markup were created for the full two hours of data. One hour was also selected for a contrastive study using a quick transcription methodology. We review the two methodologies and discuss qualitative differences in the resulting transcripts. Finally, we describe infrastructure development, including transcription tools, that supported these efforts.



Author information

Authors and Affiliations

Linguistic Data Consortium, University of Pennsylvania, 3600 Market Street, Suite 800, Philadelphia, PA, 19104, USA
Meghan Lammie Glenn & Stephanie Strassel


Editor information

Editors and Affiliations

University of Edinburgh, Edinburgh, Scotland
Steve Renals
IDIAP Research Institute, Martigny, Switzerland
Samy Bengio


About this paper

Cite this paper

Glenn, M.L., Strassel, S. (2006). Linguistic Resources for Meeting Speech Recognition. In: Renals, S., Bengio, S. (eds) Machine Learning for Multimodal Interaction. MLMI 2005. Lecture Notes in Computer Science, vol 3869. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11677482_33


DOI: https://doi.org/10.1007/11677482_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32549-9
Online ISBN: 978-3-540-32550-5
eBook Packages: Computer Science, Computer Science (R0)
