Abstract
The performance of automatic aesthetic prediction has improved significantly with the use of deep convolutional neural networks (CNNs). However, existing CNN methods achieve only limited success because (1) most of them take a single fixed-size patch as the training example, which loses both fine-grained details and holistic layout information, and (2) most of them ignore biological cues, such as the gaze-shifting sequence, in image aesthetic assessment. To address these challenges, we propose a scanpath-guided feature aggregation model for aesthetic prediction. In our model, a multi-scale network predicts the human fixation map and the viewing scanpath. A sequence of regions is then adaptively selected according to the scanpath. These attended regions are fed progressively into a CNN and LSTM network to accumulate information, yielding a compact image-level representation. Extensive experiments on the large-scale aesthetics assessment benchmark AVA and the Photo.net data set demonstrate the efficacy of our approach on three unified aesthetic prediction tasks: (i) aesthetic quality classification; (ii) aesthetic score regression; and (iii) aesthetic score distribution prediction.
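The region-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract states that the scanpath is predicted by a multi-scale network, whereas the code below swaps in a simple greedy winner-take-all rule with inhibition of return, applied to a fixation map that is assumed to be given. The function names `greedy_scanpath` and `crop_boxes`, and the parameters `num_fixations`, `inhibit_radius`, and `patch_size`, are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Grid = List[List[float]]


def greedy_scanpath(fixation_map: Grid, num_fixations: int,
                    inhibit_radius: int) -> List[Tuple[int, int]]:
    """Return (row, col) fixations in visiting order.

    Hypothetical stand-in for a learned scanpath predictor: repeatedly pick
    the most salient cell, then suppress its neighbourhood (inhibition of
    return) so that the next fixation lands somewhere else.
    """
    saliency = [row[:] for row in fixation_map]  # work on a copy
    height, width = len(saliency), len(saliency[0])
    suppressed = float("-inf")
    path: List[Tuple[int, int]] = []
    for _ in range(num_fixations):
        cells = ((i, j) for i in range(height) for j in range(width))
        r, c = max(cells, key=lambda rc: saliency[rc[0]][rc[1]])
        if saliency[r][c] == suppressed:
            break  # every cell has already been visited or suppressed
        path.append((r, c))
        for i in range(max(0, r - inhibit_radius),
                       min(height, r + inhibit_radius + 1)):
            for j in range(max(0, c - inhibit_radius),
                           min(width, c + inhibit_radius + 1)):
                saliency[i][j] = suppressed
    return path


def crop_boxes(path: List[Tuple[int, int]], patch_size: int,
               height: int, width: int) -> List[Tuple[int, int, int, int]]:
    """Map each fixation to a (top, left, bottom, right) box inside the image."""
    if patch_size > height or patch_size > width:
        raise ValueError("patch_size must fit inside the image")
    half = patch_size // 2
    boxes = []
    for r, c in path:
        # Centre the patch on the fixation, then clamp it to the image bounds.
        top = min(max(r - half, 0), height - patch_size)
        left = min(max(c - half, 0), width - patch_size)
        boxes.append((top, left, top + patch_size, left + patch_size))
    return boxes


# Toy 4x4 fixation map; the values are made up for illustration.
toy_map = [
    [0.1, 0.2, 0.0, 0.0],
    [0.0, 0.9, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.5],
    [0.3, 0.0, 0.0, 0.0],
]
toy_path = greedy_scanpath(toy_map, num_fixations=3, inhibit_radius=1)
toy_boxes = crop_boxes(toy_path, patch_size=2, height=4, width=4)
```

In a pipeline of the kind the abstract describes, the boxes would be cropped from the image in scanpath order and passed one at a time through the CNN and LSTM, so that the final recurrent state serves as the image-level representation. The ordering matters here, which is why the sketch returns a list rather than a set.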
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhang, X., Gao, X., Lu, W., Yu, Y., He, L. (2019). Unified Image Aesthetic Prediction via Scanpath-Guided Feature Aggregation Network. In: Zeng, A., Pan, D., Hao, T., Zhang, D., Shi, Y., Song, X. (eds) Human Brain and Artificial Intelligence. HBAI 2019. Communications in Computer and Information Science, vol 1072. Springer, Singapore. https://doi.org/10.1007/978-981-15-1398-5_19
DOI: https://doi.org/10.1007/978-981-15-1398-5_19
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1397-8
Online ISBN: 978-981-15-1398-5
eBook Packages: Computer Science, Computer Science (R0)