Photo Aesthetic Scoring Through Spatial Aggregation Perception DCNN on a New IDEA Dataset

  • Xin Jin
  • Le Wu
  • Geng Zhao
  • Xinghui Zhou
  • Xiaokun Zhang
  • Xiaodong Li (Email author)
Part of the Studies in Computational Intelligence book series (SCI, volume 810)


Assessing the aesthetic quality of images is a challenging task in computer vision. Recent work has used deep convolutional neural networks to evaluate image aesthetic quality. However, the scores in existing image data sets follow a strongly normal distribution, which makes neural networks prone to over-fitting during training. In addition, traditional deep learning methods usually pre-process images, which destroys the original aesthetic features of the picture, so the network can learn only superficial aesthetic features. This paper presents a new data set, Images Distributed Evenly for Aesthetics (IDEA), in which images are distributed evenly across aesthetic scores. This data set has less pronounced statistical characteristics, which helps the neural network learn deeper features. We also propose a new spatial aggregation perception neural network architecture that controls channel weights automatically. Experiments on different data sets demonstrate the advantages and effectiveness of our method.
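As a rough illustration of what "controlling channel weights automatically" can mean, the sketch below applies squeeze-and-excitation-style gating to a feature map in NumPy. It is not the architecture proposed in this paper, which the abstract does not specify; the function name `channel_gate`, the reduction ratio `r`, and the random weights are all assumptions made for the example. Because the pooling step averages over every spatial position, the same code works for any input height and width, so an image would not need to be resized first.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, maps real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_gate(features, w1, w2):
    """Reweight channels of a (C, H, W) feature map (hypothetical helper).

    w1 has shape (C // r, C) and w2 has shape (C, C // r), where r is an
    assumed reduction ratio. Returns the gated features and the weights.
    """
    # Squeeze: aggregate each channel over all spatial positions.
    squeezed = features.mean(axis=(1, 2))            # shape (C,)
    # Excitation: small bottleneck network producing one weight per channel.
    hidden = np.maximum(w1 @ squeezed, 0.0)          # ReLU, shape (C // r,)
    weights = sigmoid(w2 @ hidden)                   # shape (C,)
    # Scale: multiply every channel by its learned weight.
    return features * weights[:, None, None], weights


# Toy input: 8 channels on a non-square 5 x 7 grid, random stand-in weights.
rng = np.random.default_rng(0)
C, H, W, r = 8, 5, 7, 4
features = rng.normal(size=(C, H, W))
w1 = rng.normal(size=(C // r, C))
w2 = rng.normal(size=(C, C // r))

gated, weights = channel_gate(features, w1, w2)
print(gated.shape, weights.shape)
```

In a trained network `w1` and `w2` would be learned together with the convolutional filters, so the per-channel weights adapt to each input image.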


Aesthetic assessment, Neural network, Computer vision



We thank all the reviewers and ACs. This work is partially supported by the National Natural Science Foundation of China (Grant Nos. 61772047, 61772513), the Science and Technology Project of the State Archives Administrator (Grant No. 2015-B-10), the open funding project of State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (Grant No. BUAA-VR-16KF-09), the Fundamental Research Funds for the Central Universities (Grant No. 3122014C017), the China Postdoctoral Science Foundation (Grant No. 2015M581841), and the Postdoctoral Science Foundation of Jiangsu Province (Grant No. 1501019A).



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Xin Jin (1, 2)
  • Le Wu (1)
  • Geng Zhao (1)
  • Xinghui Zhou (1)
  • Xiaokun Zhang (1)
  • Xiaodong Li (1, Email author)
  1. Department of Cyber Security, Beijing Electronic Science and Technology Institute, Beijing, China
  2. CETC Big Data Research Institute Co., Ltd., Guiyang, China
