Abstract
Search engines often employ techniques for determining the syntactic similarity of Web pages. Such techniques allow them to avoid returning multiple copies of essentially the same page in response to a user's query. Here we describe our experience extending these techniques to MIDI music files. The musical setting requires modifications to cope with problems it introduces, such as polyphony. Our experience suggests that, when used properly, these techniques also prove useful for detecting duplicates and clustering databases in the musical setting.
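As a rough illustration of the kind of syntactic similarity measure the abstract refers to, the sketch below computes a shingle-based resemblance (the Jaccard similarity of sets of contiguous note windows) between two note sequences. It is a minimal sketch under stated assumptions, not the paper's method: the function names, the window size `w`, and the representation of a melody as a list of MIDI pitch numbers are all hypothetical, and it handles only monophonic input, so the modifications for polyphony mentioned above are not shown.

```python
from typing import Sequence, Set, Tuple


def shingles(notes: Sequence[int], w: int = 4) -> Set[Tuple[int, ...]]:
    """Return the set of contiguous w-note windows (shingles) in a sequence."""
    if w <= 0:
        raise ValueError("w must be positive")
    return {tuple(notes[i:i + w]) for i in range(len(notes) - w + 1)}


def resemblance(a: Sequence[int], b: Sequence[int], w: int = 4) -> float:
    """Jaccard similarity of the two shingle sets, a value in [0, 1]."""
    sa, sb = shingles(a, w), shingles(b, w)
    if not sa and not sb:
        # Assumption: two sequences too short to shingle are treated as identical.
        return 1.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical monophonic melodies as MIDI pitch numbers (60 = middle C).
melody = [60, 62, 64, 65, 67, 69, 71, 72]
variant = [60, 62, 64, 65, 67, 69, 71, 74]  # only the last note differs

score = resemblance(melody, variant)
```

With these inputs the two melodies share four of six distinct shingles, so a single changed note lowers the score without driving it to zero. A larger `w` makes the measure stricter, and a smaller one makes it more forgiving.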
Supported in part by an Alfred P. Sloan Research Fellowship, NSF CAREER Grant CCR-9983832, and an equipment grant from Compaq Computer Corporation.
Supported in part by a grant from the Harvard Committee for Faculty Research Support.
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mitzenmacher, M., Owen, S. (2001). Estimating Resemblance of MIDI Documents. In: Buchsbaum, A.L., Snoeyink, J. (eds) Algorithm Engineering and Experimentation. ALENEX 2001. Lecture Notes in Computer Science, vol 2153. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44808-X_6
DOI: https://doi.org/10.1007/3-540-44808-X_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42560-1
Online ISBN: 978-3-540-44808-2
eBook Packages: Springer Book Archive