Abstract
Musical scores are traditionally retrieved by title, composer or subject classification. Just as multimedia computer systems increase the range of opportunities available for presenting musical information, so they also offer new ways of posing musically oriented queries. This paper shows how scores can be retrieved from a database on the basis of a few notes sung or hummed into a microphone. The design of such a facility raises several interesting issues pertaining to music retrieval. We first describe an interface that transcribes acoustic input into standard music notation. We then analyze string matching requirements for ranked retrieval of music and present the results of an experiment that tests how accurately people sing well-known melodies. The performance of several string matching criteria is analyzed using two folk song databases. Finally, we describe a prototype system that has been developed for retrieval of tunes from acoustic input and evaluate its performance.
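As a rough illustration of ranked retrieval by string matching, the sketch below reduces each melody to an up/down/repeat contour string and ranks tunes by unit-cost edit distance to the query. It is not the system described in the paper: the function names, the three-tune database, the use of MIDI note numbers, and the choice to compare only against the opening notes of each tune are all assumptions made for this example.

```python
from typing import Dict, List, Tuple


def contour(pitches: List[int]) -> str:
    """Reduce a pitch sequence (MIDI note numbers, an assumption here)
    to a contour string: U = up, D = down, R = repeat."""
    steps = []
    for prev, cur in zip(pitches, pitches[1:]):
        steps.append("U" if cur > prev else "D" if cur < prev else "R")
    return "".join(steps)


def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance (insert, delete, substitute) computed
    row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def rank(query: List[int], tunes: Dict[str, List[int]]) -> List[Tuple[int, str]]:
    """Return (distance, title) pairs, best match first. Each tune is
    truncated to the query length, so only tune openings are matched
    (a simplifying assumption of this sketch)."""
    q = contour(query)
    scored = [(edit_distance(q, contour(notes[:len(query)])), title)
              for title, notes in tunes.items()]
    return sorted(scored)


# Hypothetical toy database of tune openings.
TUNES = {
    "Twinkle, Twinkle": [60, 60, 67, 67, 69, 69, 67],
    "Frere Jacques": [60, 62, 64, 60, 60, 62, 64, 60],
    "Ode to Joy": [64, 64, 65, 67, 67, 65, 64, 62],
}

# A sung query: transposed up a tone, with one wrong note near the end.
QUERY = [62, 62, 69, 69, 71, 70, 69]
RANKING = rank(QUERY, TUNES)
```

Because the contour ignores absolute pitch, the transposed query still matches, and the single wrong note costs only one edit. A contour of this kind discards interval sizes and rhythm, so it trades precision for tolerance of singing errors.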
Cite this article
McNab, R.J., Smith, L.A., Witten, I.H. et al. Tune Retrieval in the Multimedia Library. Multimedia Tools and Applications 10, 113–132 (2000). https://doi.org/10.1023/A:1009606600500