A Neural Multi-Expert Classification System for MPEG Audio Segmentation

  • Conference paper
Advances in Pattern Recognition — ICAPR 2001 (ICAPR 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2013)

Abstract

Current research in video parsing and analysis focuses mainly on pictorial information and neglects an important supplementary source of content information: the embedded audio, or soundtrack. In contrast, in this paper we address the issue of exploiting audio information that can be used jointly with video information for scene change detection. The proposed method works directly on MPEG-encoded sequences so as to avoid computationally intensive decoding procedures. It is based on a multi-expert classification system made up of a hierarchical ensemble of neural networks.

Finally, we present a large audio database designed to assess the performance of the approach, and we discuss preliminary experimental results.

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

De Santo, M., Percannella, G., Sansone, C., Vento, M. (2001). A Neural Multi-Expert Classification System for MPEG Audio Segmentation. In: Singh, S., Murshed, N., Kropatsch, W. (eds) Advances in Pattern Recognition — ICAPR 2001. ICAPR 2001. Lecture Notes in Computer Science, vol 2013. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44732-6_6

  • DOI: https://doi.org/10.1007/3-540-44732-6_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41767-5

  • Online ISBN: 978-3-540-44732-0

  • eBook Packages: Springer Book Archive
