Content-Based Audio Classification with Generalized Ellipsoid Distance

  • Conference paper
Advances in Multimedia Information Processing — PCM 2002 (PCM 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2532)

Abstract

As multimedia databases grow, the demand for efficient search of multimedia data becomes increasingly urgent. Most recent work on audio classification and retrieval adopts Euclidean distance as the distance measure. However, Euclidean distance is not a perceptual distance measure for some audio features. The purpose of this work is to derive two new distance measures for content-based audio classification, both based on reweighting and de-correlating the features. The weighted Euclidean distance uses a diagonal matrix that reweights the importance of each feature, and the generalized ellipsoid distance additionally takes into account the correlation between any two features. An audio database of 85 sound clips is used as the training set. The experimental results show that the generalized ellipsoid distance yields the best result in terms of overall correct classification rate.
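
The relationship between the three distance measures named in the abstract can be illustrated with a small sketch. It assumes the common quadratic-form formulation, D(x, q) = (x - q)^T M (x - q), in which M is the identity for Euclidean distance, a diagonal matrix for weighted Euclidean distance, and a full symmetric matrix for generalized ellipsoid distance. The abstract does not give the paper's exact formulas or how the matrices are estimated from the training set, so the function names, weights, and matrix entries below are hypothetical and chosen only for illustration.

```python
from typing import Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def quadratic_form_distance(x: Vector, q: Vector, m: Matrix) -> float:
    """Return the squared distance (x - q)^T M (x - q)."""
    d = [xi - qi for xi, qi in zip(x, q)]
    n = len(d)
    return sum(d[i] * m[i][j] * d[j] for i in range(n) for j in range(n))


def euclidean_sq(x: Vector, q: Vector) -> float:
    """Squared Euclidean distance: M is the identity matrix."""
    n = len(x)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return quadratic_form_distance(x, q, identity)


def weighted_euclidean_sq(x: Vector, q: Vector, weights: Vector) -> float:
    """Squared weighted Euclidean distance: M is diagonal (per-feature weights)."""
    n = len(x)
    diagonal = [[weights[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    return quadratic_form_distance(x, q, diagonal)


def generalized_ellipsoid_sq(x: Vector, q: Vector, m: Matrix) -> float:
    """Generalized ellipsoid distance: M is a full symmetric matrix, so
    off-diagonal entries account for correlation between pairs of features."""
    return quadratic_form_distance(x, q, m)


# Hypothetical two-feature example (values are made up, not from the paper).
x = [1.0, 2.0]
q = [0.0, 0.0]
weights = [2.0, 0.5]
full_matrix = [[2.0, -0.5],
               [-0.5, 0.5]]  # symmetric and positive definite

print(euclidean_sq(x, q))                           # identity matrix
print(weighted_euclidean_sq(x, q, weights))         # diagonal matrix
print(generalized_ellipsoid_sq(x, q, full_matrix))  # full matrix
```

With a diagonal matrix the equal-distance contours are ellipsoids aligned with the feature axes. Nonzero off-diagonal entries rotate those ellipsoids, which is how correlation between features enters the distance.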

This work was supported by MOE Program for Promoting Academic Excellence of Universities under the grant number MOE 89-E-FA04-1-4.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cheng, CC., Hsu, CT. (2002). Content-Based Audio Classification with Generalized Ellipsoid Distance. In: Chen, YC., Chang, LW., Hsu, CT. (eds) Advances in Multimedia Information Processing — PCM 2002. PCM 2002. Lecture Notes in Computer Science, vol 2532. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36228-2_41

  • DOI: https://doi.org/10.1007/3-540-36228-2_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00262-8

  • Online ISBN: 978-3-540-36228-9

  • eBook Packages: Springer Book Archive
