Abstract
Current research in video parsing and analysis focuses mainly on pictorial information and neglects an important supplementary source of content information: the embedded audio, or soundtrack. In this paper we instead address how audio information can be used jointly with video information for scene change detection. The proposed method works directly on MPEG-encoded sequences, so as to avoid computationally intensive decoding procedures. It is based on a multi-expert classification system made up of a hierarchical ensemble of neural networks.
Finally, we present a large audio database, designed for assessing the performance of the approach, and discuss preliminary experimental results.
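The abstract does not spell out how the hierarchical ensemble is organised, so the following is only a minimal sketch of one plausible reading: a root group of experts makes a coarse decision, and a child group refines it. Everything specific here is an assumption made for illustration and is not taken from the paper: the class labels ("silence", "sound", "speech", "music"), the two-value feature vector, the majority-vote combiner, and the threshold functions that stand in for trained neural networks.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

Features = Sequence[float]
Expert = Callable[[Features], str]  # each expert returns a class label


def majority_vote(experts: List[Expert], x: Features) -> str:
    """Combine expert decisions by plurality vote.

    Ties are broken in favour of the label voted first in expert order.
    The combining rule is an assumption; the abstract does not name one.
    """
    votes = [expert(x) for expert in experts]
    counts = Counter(votes)
    best = max(counts.values())
    return next(v for v in votes if counts[v] == best)


class HierarchicalEnsemble:
    """Two-level hierarchy: a root ensemble picks a coarse class,
    then an optional child ensemble refines that class."""

    def __init__(self, root: List[Expert], children: Dict[str, List[Expert]]):
        self.root = root
        self.children = children

    def classify(self, x: Features) -> str:
        coarse = majority_vote(self.root, x)
        refine = self.children.get(coarse)
        # Coarse classes without a child ensemble are returned unchanged
        return majority_vote(refine, x) if refine else coarse


# Hypothetical stand-in experts: simple thresholds on a made-up
# (energy, zero_crossing_rate) pair instead of trained networks.
root_experts: List[Expert] = [
    lambda x: "silence" if x[0] < 0.1 else "sound",
    lambda x: "silence" if x[0] < 0.2 else "sound",
    lambda x: "silence" if x[0] + x[1] < 0.3 else "sound",
]
sound_experts: List[Expert] = [
    lambda x: "speech" if x[1] > 0.5 else "music",
    lambda x: "speech" if x[1] > 0.4 else "music",
    lambda x: "speech" if x[1] > 0.6 else "music",
]

system = HierarchicalEnsemble(root_experts, {"sound": sound_experts})
```

Calling `system.classify((0.8, 0.55))` sends the sample through the root experts first and then through the experts registered under the coarse label they agree on. In a real system each lambda would be replaced by a trained classifier operating on features extracted from the MPEG audio stream.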
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
De Santo, M., Percannella, G., Sansone, C., Vento, M. (2001). A Neural Multi-Expert Classification System for MPEG Audio Segmentation. In: Singh, S., Murshed, N., Kropatsch, W. (eds) Advances in Pattern Recognition — ICAPR 2001. ICAPR 2001. Lecture Notes in Computer Science, vol 2013. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44732-6_6
DOI: https://doi.org/10.1007/3-540-44732-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41767-5
Online ISBN: 978-3-540-44732-0
eBook Packages: Springer Book Archive