Unified Image Aesthetic Prediction via Scanpath-Guided Feature Aggregation Network

  • Conference paper
Human Brain and Artificial Intelligence (HBAI 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1072)


Abstract

Automatic aesthetic prediction has improved significantly with the use of deep convolutional neural networks (CNNs). However, existing CNN methods achieve only limited success because (1) most methods take a single fixed-size patch as the training example, which loses both fine-grained details and holistic layout information, and (2) most methods ignore biological cues, such as the gaze-shifting sequence, in image aesthetic assessment. To address these challenges, we propose a scanpath-guided feature aggregation model for aesthetic prediction. In our model, the human fixation map and the viewing scanpath are predicted by a multi-scale network. A sequence of regions is then adaptively selected according to the scanpath. These attended regions are progressively fed into a CNN and LSTM network to accumulate information, yielding a compact image-level representation. Extensive experiments on the large-scale aesthetic assessment benchmarks AVA and Photo.net thoroughly demonstrate the efficacy of our approach on unified aesthetic prediction tasks: (i) aesthetic quality classification; (ii) aesthetic score regression; and (iii) aesthetic score distribution prediction.
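The pipeline outlined in the abstract (predict a scanpath, crop a sequence of attended regions, encode each region with a CNN, and accumulate the sequence with an LSTM before the prediction heads) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the ResNet-18 backbone, the fixed crop size, the 10-bin score distribution, and all module and variable names below are illustrative assumptions made only to show the aggregation idea.

```python
# Hypothetical sketch of scanpath-guided feature aggregation (not the paper's code).
# A CNN encodes each attended region cropped along a given scanpath; an LSTM
# accumulates the region features; a linear head outputs a 10-bin score distribution.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models
from torchvision.transforms.functional import resized_crop


class ScanpathAggregationNet(nn.Module):
    def __init__(self, num_bins: int = 10, hidden: int = 512):
        super().__init__()
        backbone = models.resnet18(weights=None)                     # CNN encoder (assumed)
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])    # global-pooled 512-d features
        self.lstm = nn.LSTM(input_size=512, hidden_size=hidden, batch_first=True)
        self.dist_head = nn.Linear(hidden, num_bins)                 # score-distribution head

    def forward(self, image: torch.Tensor, scanpath: torch.Tensor) -> torch.Tensor:
        # image:    (3, H, W) tensor for a single photo
        # scanpath: (T, 2) fixation centres as (y, x) fractions of the image size, in viewing order
        _, H, W = image.shape
        crop_h, crop_w = H // 3, W // 3                              # fixed-size attended regions (assumed)
        feats = []
        for cy, cx in scanpath.tolist():
            top = int(max(0, min(H - crop_h, cy * H - crop_h / 2)))
            left = int(max(0, min(W - crop_w, cx * W - crop_w / 2)))
            region = resized_crop(image, top, left, crop_h, crop_w, [224, 224])
            feats.append(self.cnn(region.unsqueeze(0)).flatten(1))   # (1, 512) per region
        seq = torch.stack(feats, dim=1)                              # (1, T, 512) region sequence
        _, (h_n, _) = self.lstm(seq)                                 # final state summarises the scanpath
        return F.softmax(self.dist_head(h_n[-1]), dim=-1)            # (1, num_bins) score distribution


if __name__ == "__main__":
    img = torch.rand(3, 480, 640)
    path = torch.tensor([[0.5, 0.5], [0.3, 0.7], [0.6, 0.2]])        # toy 3-fixation scanpath
    dist = ScanpathAggregationNet()(img, path)
    bins = torch.arange(1, 11, dtype=dist.dtype)
    print("predicted mean score:", (dist * bins).sum().item())
```

In a sketch like this, the three unified tasks can plausibly share one output: the predicted distribution itself serves task (iii), its mean gives a regression score for task (ii), and thresholding that mean yields a binary high/low quality label for task (i). The actual heads and training losses used in the paper may differ.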

Author information

Corresponding author

Correspondence to Xiaodan Zhang.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhang, X., Gao, X., Lu, W., Yu, Y., He, L. (2019). Unified Image Aesthetic Prediction via Scanpath-Guided Feature Aggregation Network. In: Zeng, A., Pan, D., Hao, T., Zhang, D., Shi, Y., Song, X. (eds) Human Brain and Artificial Intelligence. HBAI 2019. Communications in Computer and Information Science, vol 1072. Springer, Singapore. https://doi.org/10.1007/978-981-15-1398-5_19

  • DOI: https://doi.org/10.1007/978-981-15-1398-5_19

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1397-8

  • Online ISBN: 978-981-15-1398-5

  • eBook Packages: Computer Science, Computer Science (R0)
