Transfer learning features for predicting aesthetics through a novel hybrid machine learning method

  • Adrian Carballal
  • Carlos Fernandez-Lozano
  • Jonathan Heras
  • Juan Romero
Original Article

Abstract

Automatically assessing the aesthetic value of an image is a task with many applications, but it is complex and challenging because human aesthetic judgement has a subjective component. Computational systems that carry out this task usually combine a set of ad hoc metrics proposed by the researchers with a machine learning system. We propose a new approach that fully automates the creation, filtering and adjustment of the metrics, so that it does not depend on the authors’ aesthetic intuitions. The proposal integrates two machine learning algorithms: a convolutional neural network (CNN), which works as a feature extractor, and Correlation by Genetic Search (CGS), a novel regression method that works as the supervised learner. CGS builds an adjusted linear regression model through an evolutionary process that uses Pearson’s correlation as the measure of performance. Experiments were conducted on a well-known aesthetics database, “Photo.net”, which has more than a million images from over 400,000 users. Compared with other approaches on the same dataset, the fusion of CNN transfer learning features with this machine learning method achieved robust and significantly better results than other state-of-the-art methods and hybrid approaches in terms of AUROC (0.93), accuracy (0.93) and Pearson’s correlation (0.94).
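
The idea of evolving a linear model with Pearson's correlation as the fitness measure can be illustrated with a minimal sketch. This is not the authors' CGS implementation: the population size, the elitist selection, the uniform crossover, the Gaussian mutation and the final least-squares "adjustment" step are all assumptions made for illustration, and the random matrix X merely stands in for features extracted by a CNN.

```python
import numpy as np


def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def evolve_weights(X, y, pop_size=40, generations=60, sigma=0.1, seed=0):
    """Evolve linear weights w so that X @ w correlates with y.

    Fitness is Pearson's r. All hyperparameters here are hypothetical.
    """
    rng = np.random.default_rng(seed)
    pop = rng.normal(size=(pop_size, X.shape[1]))
    for _ in range(generations):
        fitness = np.array([pearson(X @ w, y) for w in pop])
        order = np.argsort(fitness)[::-1]
        elite = pop[order[: pop_size // 4]]  # keep the best quarter unchanged
        n_children = pop_size - len(elite)
        # Uniform crossover between random elite parents, then Gaussian mutation.
        pa = elite[rng.integers(len(elite), size=n_children)]
        pb = elite[rng.integers(len(elite), size=n_children)]
        mask = rng.random(pa.shape) < 0.5
        children = np.where(mask, pa, pb) + rng.normal(scale=sigma, size=pa.shape)
        pop = np.vstack([elite, children])
    fitness = np.array([pearson(X @ w, y) for w in pop])
    return pop[int(np.argmax(fitness))], float(fitness.max())


def adjust(scores, y):
    """Least-squares slope and intercept mapping raw scores onto the target scale.

    Pearson's r ignores scale and offset, so this step (an assumption about
    what "adjusted" could mean) rescales the evolved scores afterwards.
    """
    slope, intercept = np.polyfit(scores, y, 1)
    return slope, intercept


# Synthetic demonstration: X is a stand-in for CNN features, y for ratings.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8))
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=200)

w, r = evolve_weights(X, y)
slope, intercept = adjust(X @ w, y)
pred = slope * (X @ w) + intercept
print(f"Pearson r of evolved model: {r:.3f}")
```

Because the correlation is unchanged by a positive rescaling, the adjusted predictions keep the same Pearson value as the evolved scores; the adjustment only places them on the scale of the targets.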

Keywords

Convolutional neural networks, Feature extraction, Machine learning, Prediction, Classification, Aesthetics assessment, Hybrid model, Transfer learning

Notes

Acknowledgements

This work is supported by the General Directorate of Culture, Education and University Management of Xunta de Galicia (Ref. GRC2014/049) and the European Fund for Regional Development (FEDER) allocated by the European Union; the Portuguese Foundation for Science and Technology, for the development of project SBIRC (Ref. PTDC/EIA-EIA/115667/2009); Xunta de Galicia (Ref. XUGA-PGIDIT-10TIC105008-PR); the Spanish Ministry for Science and Technology (TIN2008-06562/TIN and MTM2017-88804-P); the Juan de la Cierva fellowship program of the Spanish Ministry of Economy and Competitiveness (Carlos Fernandez-Lozano, Ref. FJCI-2015-26071); and a grant from the Ministry of Education, Culture and Sport for mobility stays of professors and researchers in foreign higher education and research centers (PRX18/00117).

We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.

Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. Department of Computer Science, Faculty of Computer Science, University of A Coruña, A Coruña, Spain
  2. Instituto de Investigacion Biomedica de A Coruña (INIBIC), Complexo Hospitalario Universitario de A Coruña (CHUAC), A Coruña, Spain
  3. Department of Mathematics and Computer Science, University of La Rioja, Logroño, Spain