
Music Learning: Automatic Music Composition and Singing Voice Assessment

  • Chapter
Springer Handbook of Systematic Musicology

Part of the book series: Springer Handbooks ((SHB))

Abstract

Traditionally, singing skills are learned and improved through the supervised rehearsal of a set of selected exercises: a music teacher evaluates the student's performance and recommends new exercises according to the student's progress.

The goal of this chapter is to describe a virtual environment that partially reproduces the traditional music learning process and the music teacher's role, allowing for a fully interactive self-learning process.

An overview of the complete chain of an interactive singing-learning system, including tools and concrete techniques, will be presented. In brief, the system should first provide a set of training exercises, then assess the user's performance, and finally provide the user with new exercises selected or created according to the results of that evaluation.
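The chapter itself contains no code; as a rough illustration only, the generate-assess-adapt chain just described might be sketched as follows. All function names, the exercise representation (a list of MIDI note numbers), and the scoring and level-update rules are hypothetical placeholders, not the chapter's actual methods:

```python
# Hypothetical sketch of the generate -> assess -> adapt loop described
# in the text. Every rule below is a toy stand-in.

def generate_exercise(level: int) -> dict:
    """Stub: an exercise is a short MIDI-note sequence; higher level, more notes."""
    return {"difficulty": level, "notes": list(range(60, 62 + level))}

def assess_performance(exercise: dict, sung: list) -> float:
    """Stub score in [0, 1]: fraction of target notes sung exactly."""
    target = exercise["notes"]
    hits = sum(1 for t, s in zip(target, sung) if t == s)
    return hits / len(target)

def update_level(level: int, score: float) -> int:
    """Move up after a good performance, down (but never below 1) otherwise."""
    return level + 1 if score >= 0.8 else max(1, level - 1)

def training_round(level: int, sung: list) -> tuple:
    """One iteration of the loop: new exercise, assessment, adapted level."""
    exercise = generate_exercise(level)
    score = assess_performance(exercise, sung)
    return score, update_level(level, score)
```

A real system would, of course, capture and transcribe audio rather than take note lists, but the control flow of the self-learning loop is the same.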

Following this scheme, methods for the creation of user-adapted exercises and the automatic evaluation of singing skills will be presented. A technique for the dynamic generation of musically meaningful singing exercises, adapted to the user's level, will be shown; it is based on the proper repetition of musical structures while ensuring the correctness of harmony and rhythm. Additionally, a module for the assessment of the user's singing performance, in terms of intonation and rhythm, will be described.
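The intonation side of such an assessment can be illustrated with a toy example: align the sung pitch sequence to the reference using dynamic time warping (DTW, listed among the chapter's abbreviations) and average the absolute pitch deviation along the alignment path. This pure-Python sketch is not the chapter's algorithm; `dtw_path` and `intonation_error` are illustrative names, and real systems operate frame by frame on an estimated fundamental frequency rather than on idealized note lists:

```python
# Toy illustration: DTW alignment of two pitch sequences (in semitones)
# and the mean absolute deviation along the resulting path.

def dtw_path(a, b):
    """Classic DTW: fill the cumulative-cost matrix, then backtrack a path."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    path, i, j = [], n, m
    while i > 1 or j > 1:
        path.append((i - 1, j - 1))
        # Step to the cheapest predecessor (diagonal preferred on cost ties).
        _, i, j = min((D[i - 1][j - 1], i - 1, j - 1),
                      (D[i - 1][j], i - 1, j),
                      (D[i][j - 1], i, j - 1))
    path.append((0, 0))
    return path[::-1]

def intonation_error(reference, sung):
    """Mean absolute pitch deviation (semitones) along the DTW alignment."""
    path = dtw_path(reference, sung)
    return sum(abs(reference[i] - sung[j]) for i, j in path) / len(path)
```

A perfectly sung contour yields an error of 0; a contour sung uniformly one semitone sharp yields an error of 1, regardless of small timing deviations, which is exactly why alignment precedes the intonation measure.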


Abbreviations

DTW:

dynamic time warping

EMI:

experiments in musical intelligence

IOI:

interonset interval

MIDI:

musical instrument digital interface

RMS:

root mean square

RSSM:

rhythm self-similarity matrix

SMO:

sequential minimal optimization

TIE:

total intonation error


Acknowledgements

This work was funded by the Ministerio de Economía y Competitividad of the Spanish Government under Project No. TIN2016-75866-C3-2-R, and was carried out at Universidad de Málaga, Campus de Excelencia Internacional Andalucía Tech.

Author information


Correspondence to Lorenzo J. Tardón.


Copyright information

© 2018 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Tardón, L.J., Barbancho, I., Roig, C., Molina, E., Barbancho, A.M. (2018). Music Learning: Automatic Music Composition and Singing Voice Assessment. In: Bader, R. (eds) Springer Handbook of Systematic Musicology. Springer Handbooks. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-55004-5_42


  • DOI: https://doi.org/10.1007/978-3-662-55004-5_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-55002-1

  • Online ISBN: 978-3-662-55004-5

  • eBook Packages: Engineering (R0)
