Automatic Score Extraction with Optical Music Recognition (OMR)

  • Chapter
Springer Handbook of Systematic Musicology

Part of the book series: Springer Handbooks (SHB)

Abstract

Optical music recognition (OMR) describes the process of automatically transcribing music notation from a digital image. Although similar to optical character recognition (OCR), the process and procedures of OMR diverge due to the fundamental differences between text and music notation, such as the two-dimensional nature of the notation system and the overlay of music symbols on top of staff lines. The OMR process can be described as a sequence of steps, with techniques adapted from disciplines including image processing, machine learning, grammars, and notation encoding. The sequence and specific techniques used can differ depending on the condition of the image, the type of notation, and the desired output.

Several commercial and open-source OMR software systems have been available since the mid-1990s. Most of them are designed to be used by individuals and recognize common (post-18th-century) Western music notation, though there have been some efforts to recognize other types of music notation, such as lute tablature and earlier Western notation.

Even though traditional applications of OMR have focused on small-scale recognition tasks, typically as an automated method of musical entry for score editing, new applications of large-scale OMR are under development, where automated recognition is the central technology for building full-music search systems, similar to the large-scale full-text recognition efforts.

Abbreviations

k-NN: k-nearest-neighbor
ASCII: American standard code for information interchange
BD: book-dependent
BI: book-independent
CWMN: common Western music notation
DARMS: digital alternative representation of music scores
EsAC: Essen associative code
GPL: general public license
HMM: hidden Markov model
HTML: hyper-text markup language
MAP: maximum a posteriori
MIDI: musical instrument digital interface
NIFF: notation interchange file format
NN: neural network
OCR: optical character recognition
ODD: one document does it all
OMR: optical music recognition

Author information

Correspondence to Ichiro Fujinaga.

Copyright information

© 2018 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Fujinaga, I., Hankinson, A., Pugin, L. (2018). Automatic Score Extraction with Optical Music Recognition (OMR). In: Bader, R. (ed.) Springer Handbook of Systematic Musicology. Springer Handbooks. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-55004-5_16

  • DOI: https://doi.org/10.1007/978-3-662-55004-5_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-55002-1

  • Online ISBN: 978-3-662-55004-5

  • eBook Packages: Engineering (R0)
