Audio Content Description in Sound Databases

Wieczorkowska, Alicja A.; Raś, Zbigniew W.

doi:10.1007/3-540-45490-X_20

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2198)

Included in the following conference series:

Asia-Pacific Conference on Web Intelligence

Abstract

Indexing a sound database requires metadata that represent the audio content of the data. If the database creator has not attached such metadata, content information has to be extracted directly from the sounds, using descriptors based on sound analysis. In this paper, the authors present a number of sound descriptors based on various forms of signal analysis. Telescope Vector trees (TV-trees) and Frame Segment trees (FS-trees) are applied to represent audio content on the basis of the extracted sound descriptors and any metadata provided by the database creator. This representation of the database's audio content is used to speed up the search for audio material in multimedia databases.
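To make the idea of a descriptor extracted directly from the sound concrete, here is a minimal Python sketch. It is an illustration only: the spectral centroid is a commonly used descriptor chosen here as an assumption, not necessarily one of the descriptors in the paper, and the sorted list with binary search is a plain stand-in for an index, not a TV-tree or FS-tree. The names `spectral_centroid`, `sine`, `sounds`, and `query` are hypothetical.

```python
import bisect
import cmath
import math


def spectral_centroid(samples, sample_rate):
    """Return the magnitude-weighted mean frequency (Hz) of one signal frame."""
    n = len(samples)
    weighted_sum = 0.0
    magnitude_sum = 0.0
    # Naive DFT over the non-negative frequency bins (adequate for short frames).
    for k in range(n // 2 + 1):
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitude = abs(coeff)
        weighted_sum += (k * sample_rate / n) * magnitude
        magnitude_sum += magnitude
    return weighted_sum / magnitude_sum if magnitude_sum > 0 else 0.0


def sine(freq_hz, sample_rate=8000, n=256):
    """Generate a synthetic test tone (stand-in for a recorded sound)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


# Hypothetical toy database: sound name -> samples.
SAMPLE_RATE = 8000
sounds = {
    "low_tone": sine(250),
    "mid_tone": sine(1000),
    "high_tone": sine(3000),
}

# Assumed stand-in index: (descriptor value, name) pairs sorted by descriptor.
index = sorted(
    (spectral_centroid(samples, SAMPLE_RATE), name)
    for name, samples in sounds.items()
)


def query(index, lo_hz, hi_hz):
    """Return names of sounds whose centroid lies in [lo_hz, hi_hz]."""
    keys = [centroid for centroid, _ in index]
    start = bisect.bisect_left(keys, lo_hz)
    end = bisect.bisect_right(keys, hi_hz)
    return [name for _, name in index[start:end]]
```

A call such as `query(index, 500, 2000)` then searches the precomputed descriptor values rather than the raw audio, which is the general reason a descriptor-based representation can speed up search.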

Author information

Authors and Affiliations

Polish-Japanese Institute of Information Technology, ul. Koszykowa 86, 02-008, Warsaw, Poland
Alicja A. Wieczorkowska
Computer Science Dept., University of North Carolina, Charlotte, NC, 28223, USA
Zbigniew W. Raś
Inst. of Comp. Science, Polish Academy of Sciences, ul. Ordona 2, 01-237, Warsaw, Poland
Zbigniew W. Raś

Editor information

Editors and Affiliations

Department of Systems and Information Engineering, Maebashi Institute of Technology, 460-1 Kamisadori-Cho, Maebashi-City, 371-0816, Japan
Ning Zhong
Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada, S4S 0A2
Yiyu Yao
Department of Computer Science, Hong Kong Baptist University, 224 Waterloo Road, Kowloon, Hong Kong, China
Jiming Liu
Department of Information and Computer Science, Waseda University, 3-4-1 Okubo Shinjuku-Ku, Tokyo, 169, Japan
Setsuo Ohsuga

About this paper

Cite this paper

Wieczorkowska, A.A., Raś, Z.W. (2001). Audio Content Description in Sound Databases. In: Zhong, N., Yao, Y., Liu, J., Ohsuga, S. (eds) Web Intelligence: Research and Development. WI 2001. Lecture Notes in Computer Science, vol 2198. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45490-X_20

DOI: https://doi.org/10.1007/3-540-45490-X_20
Published: 19 October 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42730-8
Online ISBN: 978-3-540-45490-8
eBook Packages: Springer Book Archive
