Convolutional Neural Networks and Transfer Learning Applied to Automatic Composition of Descriptive Music

Martín-Gómez, Lucía; Pérez-Marcos, Javier; Navarro-Cáceres, María; Rodríguez-González, Sara

doi:10.1007/978-3-319-99608-0_31

Conference paper
First Online: 09 January 2019

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 801)

Abstract

The visual and musical arts have been strongly interconnected throughout history. The aim of this work is to compose music on the basis of the visual characteristics of a video. For this purpose, descriptive music is used as a link between image and sound, and a video fragment of the film Fantasia is analyzed in depth. Specifically, convolutional neural networks in combination with transfer learning are applied to extract image descriptors. To establish a relationship between the visual and musical information, Naive Bayes, Support Vector Machine and Random Forest classifiers are applied. The resulting model is then used to compose descriptive music from a new video. The results of this proposal are compared with those of a previous work in order to evaluate the performance of the classifiers and the quality of the descriptive musical composition.
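
The classification step named in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: it assumes that each video frame has already been reduced to a numeric descriptor vector (here replaced by synthetic Gaussian data rather than real CNN features), that the target is a single musical label per frame (the note names "C", "E" and "G" are hypothetical), and it shows only a Gaussian Naive Bayes classifier, one of the three classifier families the abstract mentions. All function names are illustrative.

```python
import math
import random
from collections import defaultdict


def fit_gaussian_nb(features, labels):
    """Estimate a log-prior and per-dimension mean/variance for each class."""
    grouped = defaultdict(list)
    for vector, label in zip(features, labels):
        grouped[label].append(vector)
    total = len(features)
    model = {}
    for label, rows in grouped.items():
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((x - m) ** 2 for x in col) / len(col) + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(len(rows) / total), means, variances)
    return model


def predict(model, vector):
    """Return the label with the highest Gaussian log-posterior."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(vector, means, variances)
        )
    return max(model, key=lambda label: log_posterior(model[label]))


def synthetic_descriptors(rng, centre, count, dim=8, spread=0.5):
    """Hypothetical stand-in for CNN descriptors of video frames."""
    return [[rng.gauss(centre, spread) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
# Hypothetical musical labels; the label set used in the paper is not given here.
centres = {"C": -2.0, "E": 0.0, "G": 2.0}
train_x, train_y, test_x, test_y = [], [], [], []
for note, centre in centres.items():
    train_x += synthetic_descriptors(rng, centre, 40)
    train_y += [note] * 40
    test_x += synthetic_descriptors(rng, centre, 10)
    test_y += [note] * 10

model = fit_gaussian_nb(train_x, train_y)
predicted = [predict(model, v) for v in test_x]
accuracy = sum(p == t for p, t in zip(predicted, test_y)) / len(test_y)
print(f"accuracy on synthetic frames: {accuracy:.2f}")
```

In a real pipeline, the synthetic vectors would presumably be replaced by activations taken from a pretrained convolutional network, and the predicted labels for the frames of a new video would be the material from which the music is composed.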

Acknowledgments

This work was supported by the Spanish Ministerio de Economía y Competitividad and FEDER funds under project SURF: Intelligent System for integrated and sustainable management of urban fleets (TIN2015-65515-C4-3-R).

Author information

Authors and Affiliations

BISITE Digital Innovation Hub, University of Salamanca, Edificio Multiusos I+D+i, 37007, Salamanca, Spain
Lucía Martín-Gómez, Javier Pérez-Marcos, María Navarro-Cáceres & Sara Rodríguez-González

Corresponding author

Correspondence to Lucía Martín-Gómez.

Editor information

Editors and Affiliations

BISITE Digital Innovation Hub, University of Salamanca, Salamanca, Spain
Sara Rodríguez
BISITE Digital Innovation Hub, University of Salamanca, Salamanca, Spain
Javier Prieto
GECAD - Instituto Superior de Engenharia, Porto, Portugal
Pedro Faria
Department of Computer Science and Production Management, University of Zielona Góra, Zielona Góra, Poland
Sławomir Kłos
Computing Science and Artificial Intelligence, Rey Juan Carlos University, Móstoles, Madrid, Spain
Alberto Fernández
Basque Center for Applied Mathematics, Bilbao, Spain
Santiago Mazuelas
Basque Center for Applied Mathematics, Universidad de Alcalá, Alcalá de Henares, Spain
M. Dolores Jiménez-López
Departamento de Informática y Automática, University of Salamanca, Salamanca, Spain
María N. Moreno
Departamento de Sistemas Informáticos, University of Castilla-La Mancha, Albacete, Spain
Elena M. Navarro

About this paper

Cite this paper

Martín-Gómez, L., Pérez-Marcos, J., Navarro-Cáceres, M., Rodríguez-González, S. (2019). Convolutional Neural Networks and Transfer Learning Applied to Automatic Composition of Descriptive Music. In: Rodríguez, S., et al. Distributed Computing and Artificial Intelligence, Special Sessions, 15th International Conference. DCAI 2018. Advances in Intelligent Systems and Computing, vol 801. Springer, Cham. https://doi.org/10.1007/978-3-319-99608-0_31

DOI: https://doi.org/10.1007/978-3-319-99608-0_31
Published: 09 January 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99607-3
Online ISBN: 978-3-319-99608-0
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
