Abstract
Visual and musical arts have been strongly interconnected throughout history. The aim of this work is to compose music on the basis of the visual characteristics of a video. For this purpose, descriptive music is used as a link between image and sound, and a video fragment of the film Fantasia is analyzed in depth. Specifically, convolutional neural networks in combination with transfer learning are applied to extract image descriptors. To establish a relationship between the visual and musical information, Naive Bayes, Support Vector Machine and Random Forest classifiers are applied. The resulting model is subsequently employed to compose descriptive music from a new video. The results of this proposal are compared with those of a previous work in order to evaluate the performance of the classifiers and the quality of the descriptive musical composition.
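The classification step described in the abstract can be illustrated with a minimal sketch. It assumes that each video frame has already been reduced to a numeric descriptor vector and that each training frame is paired with a musical label; the three-dimensional vectors and the note names below are hypothetical stand-ins, not data from the paper. Only a Gaussian Naive Bayes classifier is shown, written in plain Python, and it is not presented as the authors' implementation.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate a log-prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)

    model = {}
    n_total = len(X)
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # var_floor keeps the variance strictly positive for constant features.
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / n_total), means, variances)
    return model


def predict(model, features):
    """Return the label with the highest Gaussian log-posterior."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(features, means, variances)
        )
    return max(model, key=lambda label: log_posterior(model[label]))


# Hypothetical descriptors: real CNN features would be far longer vectors.
frames = [
    [0.90, 0.10, 0.20], [0.80, 0.20, 0.10], [0.85, 0.15, 0.25],
    [0.10, 0.90, 0.70], [0.20, 0.80, 0.80], [0.15, 0.85, 0.75],
]
# Hypothetical musical labels, one per training frame.
notes = ["C4", "C4", "C4", "G4", "G4", "G4"]

model = fit_gaussian_nb(frames, notes)

# Descriptors of frames from a new, unseen video (also hypothetical).
new_video_frames = [[0.82, 0.18, 0.20], [0.12, 0.88, 0.72]]
melody = [predict(model, f) for f in new_video_frames]
print(melody)
```

In a fuller pipeline the toy vectors would be replaced by descriptors taken from a pretrained convolutional network, and the Support Vector Machine and Random Forest classifiers named in the abstract could be trained on the same pairs for comparison.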
Acknowledgments
This work was supported by the Spanish Ministerio de Economía y Competitividad and FEDER funds through the project SURF: Intelligent System for integrated and sustainable management of urban fleets (TIN2015-65515-C4-3-R).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Martín-Gómez, L., Pérez-Marcos, J., Navarro-Cáceres, M., Rodríguez-González, S. (2019). Convolutional Neural Networks and Transfer Learning Applied to Automatic Composition of Descriptive Music. In: Rodríguez, S., et al. Distributed Computing and Artificial Intelligence, Special Sessions, 15th International Conference. DCAI 2018. Advances in Intelligent Systems and Computing, vol 801. Springer, Cham. https://doi.org/10.1007/978-3-319-99608-0_31
DOI: https://doi.org/10.1007/978-3-319-99608-0_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99607-3
Online ISBN: 978-3-319-99608-0
eBook Packages: Intelligent Technologies and Robotics (R0)