Abstract
As multimedia databases grow, the demand for efficient search of multimedia data becomes increasingly urgent. Most recent work on audio classification and retrieval adopts the Euclidean distance as its distance measure. However, the Euclidean distance is not a perceptual distance measure for some audio features. The purpose of this work is to derive two new distance measures for content-based audio classification, based on reweighting and de-correlating the features. The weighted Euclidean distance uses a diagonal matrix that reweights the importance of each feature, and the generalized ellipsoid distance additionally accounts for the correlation between any two features. An audio database of 85 sound clips is used as our training set. The experimental results show that the generalized ellipsoid distance yields the best result in terms of the overall correct classification rate.
This work was supported by the MOE Program for Promoting Academic Excellence of Universities under grant number MOE 89-E-FA04-1-4.
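The two distance measures named in the abstract can be illustrated with a minimal sketch. It assumes the usual quadratic-form definitions: a weighted Euclidean distance with one weight per feature, and a generalized ellipsoid distance with a full symmetric positive-definite matrix. The function names, feature vectors, weights, and matrices below are hypothetical and chosen only for illustration; how the paper derives its matrices from the training set is not shown here.

```python
from math import sqrt


def weighted_euclidean(x, q, w):
    """Diagonal-matrix distance: sqrt(sum_i w_i * (x_i - q_i)^2)."""
    return sqrt(sum(wi * (xi - qi) ** 2 for xi, qi, wi in zip(x, q, w)))


def generalized_ellipsoid(x, q, M):
    """Full-matrix distance: sqrt((x - q)^T M (x - q)), M symmetric positive definite."""
    d = [xi - qi for xi, qi in zip(x, q)]
    n = len(d)
    quad = sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))
    return sqrt(quad)


# Hypothetical 2-D feature vectors (assumption: not taken from the paper).
x, q = [3.0, 4.0], [0.0, 0.0]

# Identity matrix: the ellipsoid distance reduces to plain Euclidean distance.
identity = [[1.0, 0.0], [0.0, 1.0]]
d_euclid = generalized_ellipsoid(x, q, identity)

# Diagonal weights: the first feature counts four times as much as the second.
weights = [4.0, 1.0]
d_weighted = weighted_euclidean(x, q, weights)

# Off-diagonal terms model correlation between the two features.
correlated = [[1.0, -0.5], [-0.5, 1.0]]
d_ellipsoid = generalized_ellipsoid(x, q, correlated)
```

With a diagonal matrix the ellipsoid form coincides with the weighted Euclidean distance, so the second measure can be read as a generalization of the first.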
© 2002 Springer-Verlag Berlin Heidelberg
Cheng, CC., Hsu, CT. (2002). Content-Based Audio Classification with Generalized Ellipsoid Distance. In: Chen, YC., Chang, LW., Hsu, CT. (eds) Advances in Multimedia Information Processing — PCM 2002. PCM 2002. Lecture Notes in Computer Science, vol 2532. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36228-2_41
DOI: https://doi.org/10.1007/3-540-36228-2_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00262-8
Online ISBN: 978-3-540-36228-9