Abstract
Recordings of audio-visual presentations are a potentially valuable component of digital libraries. These recordings can be archived to enable remote access to presentations such as lectures and seminars. Recordings of presentations often contain multiple information streams involving visual and audio data. If the full benefit of these recordings is to be realised, these media streams must be properly integrated to enable rapid navigation. This paper describes the application of information retrieval techniques within a system that automatically synchronises an audio soundtrack with the electronic slides from a presentation. A novel component of the system is the detection of sections of the presentation that are unsupported by prepared slides, such as discussion and question answering, and the automatic development of keypoint slides for these elements of the presentation.
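To make the idea of retrieval-based alignment concrete, the following is a minimal sketch and not the system described in the paper. It assumes the soundtrack has already been transcribed and split into text segments, treats each slide as a document and each segment as a query, and scores them with a BM25-style term weight. The function names, the parameters `k1` and `b`, and the `threshold` used to flag segments with no supporting slide are all illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic tokens; no stemming or stop-word removal."""
    return re.findall(r"[a-z]+", text.lower())


def bm25_scores(query_tokens, docs_tokens, k1=1.2, b=0.75):
    """Score every document against the query with a BM25-style weight."""
    n_docs = len(docs_tokens)
    avg_len = (sum(len(d) for d in docs_tokens) / n_docs) or 1.0
    # Document frequency of each term across the slide set.
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in set(query_tokens):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def align(segments, slides, threshold=0.5):
    """Map each transcript segment to its best-scoring slide index.

    A segment whose best score does not exceed the (assumed) threshold is
    mapped to None, marking it as unsupported by any prepared slide.
    """
    slide_tokens = [tokenize(s) for s in slides]
    result = []
    for seg in segments:
        scores = bm25_scores(tokenize(seg), slide_tokens)
        best = max(range(len(slides)), key=scores.__getitem__)
        result.append(best if scores[best] > threshold else None)
    return result


if __name__ == "__main__":
    # Hypothetical slide texts and transcript segments, for illustration only.
    slides = [
        "Speech recognition transcripts and word error rates",
        "Slide alignment using information retrieval term weighting",
    ]
    segments = [
        "the recogniser produces transcripts with a high word error rate",
        "we align each slide with retrieval term weighting",
        "does anyone have questions before lunch",
    ]
    print(align(segments, slides))  # [0, 1, None]
```

In this sketch the third segment shares no terms with either slide, so it is left unassigned. Segments like this correspond to the discussion and question-answering passages for which the abstract says keypoint slides are generated. How the paper itself detects such passages and builds those slides is not stated in the abstract.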
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jones, G.J.F., Edens, R.J. (2002). Automated Alignment and Annotation of Audio-Visual Presentations. In: Agosti, M., Thanos, C. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2002. Lecture Notes in Computer Science, vol 2458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45747-X_21
DOI: https://doi.org/10.1007/3-540-45747-X_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44178-6
Online ISBN: 978-3-540-45747-3
eBook Packages: Springer Book Archive