Unified Image Aesthetic Prediction via Scanpath-Guided Feature Aggregation Network

  • Conference paper
Human Brain and Artificial Intelligence (HBAI 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1072)


Abstract

Automatic aesthetic prediction has improved significantly with the use of deep convolutional neural networks (CNNs). However, existing CNN methods achieve only limited success because (1) most methods take a single fixed-size patch as the training example, which loses both fine-grained details and holistic layout information, and (2) most methods ignore biological cues, such as the gaze-shifting sequence, in image aesthetic assessment. To address these challenges, we propose a scanpath-guided feature aggregation model for aesthetic prediction. In our model, the human fixation map and the viewing scanpath are predicted by a multi-scale network. A sequence of regions is then adaptively selected according to the scanpath. These attended regions are progressively fed into a CNN and LSTM network to accumulate information, yielding a compact image-level representation. Extensive experiments on the large-scale aesthetic assessment benchmarks AVA and Photo.net thoroughly demonstrate the efficacy of our approach on unified aesthetic prediction tasks: (i) aesthetic quality classification; (ii) aesthetic score regression; and (iii) aesthetic score distribution prediction.
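The pipeline outlined in the abstract (predict a scanpath, crop a sequence of attended regions, encode each region with a CNN, and accumulate the sequence with an LSTM before the prediction heads) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the ResNet-18 backbone, the fixed crop size, the 10-bin score distribution, and all module and variable names below are illustrative assumptions made only to show the aggregation idea.

```python
# Hypothetical sketch of scanpath-guided feature aggregation (not the paper's code).
# A CNN encodes each attended region cropped along a given scanpath; an LSTM
# accumulates the region features; a linear head outputs a 10-bin score distribution.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models
from torchvision.transforms.functional import resized_crop


class ScanpathAggregationNet(nn.Module):
    def __init__(self, num_bins: int = 10, hidden: int = 512):
        super().__init__()
        backbone = models.resnet18(weights=None)                     # CNN encoder (assumed)
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])    # global-pooled 512-d features
        self.lstm = nn.LSTM(input_size=512, hidden_size=hidden, batch_first=True)
        self.dist_head = nn.Linear(hidden, num_bins)                 # score-distribution head

    def forward(self, image: torch.Tensor, scanpath: torch.Tensor) -> torch.Tensor:
        # image:    (3, H, W) tensor for a single photo
        # scanpath: (T, 2) fixation centres as (y, x) fractions of the image size, in viewing order
        _, H, W = image.shape
        crop_h, crop_w = H // 3, W // 3                              # fixed-size attended regions (assumed)
        feats = []
        for cy, cx in scanpath.tolist():
            top = int(max(0, min(H - crop_h, cy * H - crop_h / 2)))
            left = int(max(0, min(W - crop_w, cx * W - crop_w / 2)))
            region = resized_crop(image, top, left, crop_h, crop_w, [224, 224])
            feats.append(self.cnn(region.unsqueeze(0)).flatten(1))   # (1, 512) per region
        seq = torch.stack(feats, dim=1)                              # (1, T, 512) region sequence
        _, (h_n, _) = self.lstm(seq)                                 # final state summarises the scanpath
        return F.softmax(self.dist_head(h_n[-1]), dim=-1)            # (1, num_bins) score distribution


if __name__ == "__main__":
    img = torch.rand(3, 480, 640)
    path = torch.tensor([[0.5, 0.5], [0.3, 0.7], [0.6, 0.2]])        # toy 3-fixation scanpath
    dist = ScanpathAggregationNet()(img, path)
    bins = torch.arange(1, 11, dtype=dist.dtype)
    print("predicted mean score:", (dist * bins).sum().item())
```

In a sketch like this, the three unified tasks can plausibly share one output: the predicted distribution itself serves task (iii), its mean gives a regression score for task (ii), and thresholding that mean yields a binary high/low quality label for task (i). The actual heads and training losses used in the paper may differ.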

Author information

Corresponding author

Correspondence to Xiaodan Zhang.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhang, X., Gao, X., Lu, W., Yu, Y., He, L. (2019). Unified Image Aesthetic Prediction via Scanpath-Guided Feature Aggregation Network. In: Zeng, A., Pan, D., Hao, T., Zhang, D., Shi, Y., Song, X. (eds) Human Brain and Artificial Intelligence. HBAI 2019. Communications in Computer and Information Science, vol 1072. Springer, Singapore. https://doi.org/10.1007/978-981-15-1398-5_19

  • DOI: https://doi.org/10.1007/978-981-15-1398-5_19

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-1397-8

  • Online ISBN: 978-981-15-1398-5

  • eBook Packages: Computer Science, Computer Science (R0)
