Music Generation Using Deep Learning

  • Aishwarya Bhave
  • Mayank Sharma
  • Rekh Ram Janghel
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 898)


Deep learning has recently been applied to many art-related tasks, such as the automatic generation of music and pictures. This paper addresses music generation from raw audio files represented in the frequency domain, using Restricted Boltzmann Machine (RBM) and Long Short-Term Memory (LSTM) architectures. The work does not use any information about musical structure to aid learning; instead, it learns from a previous permutation of notes and generates an optimal and pleasant permutation. It also serves as a comparative study of music generation with LSTM and RBM.
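The abstract does not spell out how raw audio is moved into the frequency domain, so the following is only a minimal sketch under stated assumptions: the signal is cut into fixed-size, non-overlapping blocks, each block is passed through a discrete Fourier transform, and the real and imaginary parts are concatenated into one real-valued vector that a sequence model such as an LSTM or an RBM could consume. The function names, the block size, and the toy sine signal are hypothetical, and the naive O(n²) transform is used only to keep the example free of external dependencies.

```python
import cmath
import math


def dft(block):
    """Naive O(n^2) discrete Fourier transform of a list of real samples."""
    n = len(block)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(block))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(c * cmath.exp(2j * math.pi * k * t / n)
             for k, c in enumerate(spectrum)) / n).real
        for t in range(n)
    ]


def to_feature_vectors(samples, block_size):
    """Split samples into non-overlapping blocks (assumed framing) and map
    each block to [real parts..., imaginary parts...]."""
    vectors = []
    for start in range(0, len(samples) - block_size + 1, block_size):
        spectrum = dft(samples[start:start + block_size])
        vectors.append([c.real for c in spectrum] + [c.imag for c in spectrum])
    return vectors


def from_feature_vectors(vectors):
    """Rebuild a time-domain signal from vectors made by to_feature_vectors()."""
    samples = []
    for vec in vectors:
        half = len(vec) // 2
        spectrum = [complex(re, im) for re, im in zip(vec[:half], vec[half:])]
        samples.extend(idft(spectrum))
    return samples


# Toy input (hypothetical): a 440 Hz sine sampled at 8 kHz.
SAMPLE_RATE = 8000
signal = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(256)]

features = to_feature_vectors(signal, block_size=64)
restored = from_feature_vectors(features)
```

In this sketch a generative model would be trained on the sequence in `features`, and any vectors it produces would be turned back into audio samples with `from_feature_vectors`. A practical pipeline would use a fast Fourier transform routine instead of the naive loop.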


Deep learning; Long short-term memory network; Music generation; Restricted Boltzmann machine



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Aishwarya Bhave (1), Email author
  • Mayank Sharma (1)
  • Rekh Ram Janghel (1)
  1. National Institute of Technology, Raipur, India
