Can Deep Neural Networks Learn Broad Semantic Concepts of Images?

Cai, Longzheng; Lim, Shuyun; Wang, Xuan; Tang, Longmei

doi:10.1007/978-981-15-3308-2_26

Conference paper
First Online: 13 March 2020

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1107)

Abstract

Much research uses deep neural networks (DNNs) to learn high-level semantic concepts of images, such as categories, from low-level visual properties. Images carry more semantic concepts than categories, such as whether two images complement each other, serve the same purpose, or occur in the same place or situation. In this work, we conduct an experimental study to evaluate whether DNNs can learn these broad semantic concepts of images. We perform experiments with the POPORO image dataset. Our results show that, overall, DNNs have limited capability to learn the above-mentioned broad semantic concepts from visual features of images. Among the DNN models we tested, Inception models and their variants learn broad semantic concepts of images better than VGG, ResNet, and DenseNet models. We think one of the main reasons for the weak performance in our experiments is that the POPORO dataset used in this work is too small for DNN models. Large image datasets with rich and broad semantic labels and measures are the key to successful research in this area.

References

Kovalenko, L.Y., Chaumon, M., Busch, N.A.: A pool of pairs of related objects (POPORO) for investigating visual semantic integration: behavioral and electrophysiological validation. Brain Topogr. 25(3), 272–284 (2012)
Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: The SSIM index for image quality assessment (2019). https://ece.uwaterloo.ca/~z70wang/research/ssim/. Accessed 25 July 2019
Peak signal-to-noise ratio as an image quality metric. http://www.ni.com/zh-cn/innovations/white-papers/11/peak-signal-to-noise-ratio-as-an-image-quality-metric.html. Accessed 25 July 2019
Lee, H.S., Jung, H., Agarwal, A.A., Kim, J.: Can deep neural networks match the related objects?: a survey on ImageNet-trained classification models (2017). https://arxiv.org/abs/1709.03806v1. Accessed 25 July 2019
Gupta, V.: Keras tutorial: using pre-trained Imagenet models (2019). https://www.learnopencv.com/keras-tutorial-using-pre-trained-imagenet-models/. Accessed 25 July 2019
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Li, F.F.: ImageNet: a large-scale hierarchical image database. In: CVPR 2009, pp. 248–255 (2009)
Brownlee, J.: How to grid search hyperparameters for deep learning models in Python with Keras (2016). https://machinelearningmastery.com/grid-search-hyperparameters-deep-learning-models-python-keras/. Accessed 25 July 2019
Deselaers, T., Ferrari, V.: Visual and semantic similarity in ImageNet. In: CVPR 2011, pp. 1777–1784 (2011)
Fellbaum, C.: WordNet: An Electronic Lexical Database. MIT Press, Cambridge (1998)
Deng, J., Berg, A.C., Li, F.F.: Hierarchical semantic indexing for large scale image retrieval. In: CVPR 2011 (2011)
McAuley, J., Leskovec, J.: Image labeling on a network: using social-network metadata for image classification. In: European Conference on Computer Vision, pp. 828–841 (2012)
Wang, Q., Zhou, X.W., Daniilidis, K.: Multi-image semantic matching by mining consistent features. In: CVPR 2017, pp. 685–694 (2017)
Huang, Y., Wu, Q., Song, C.F., Wang, L.: Learning semantic concepts and order for image and sentence matching. In: CVPR 2018, pp. 6163–6171 (2018)

Author information

Authors and Affiliations

College of Information Science and Engineering, Fujian University of Technology, Fuzhou, 350118, China
Longzheng Cai, Xuan Wang & Longmei Tang
Faculty of Business and Technology, Unitar International University, 47301, Petaling Jaya, Selangor, Malaysia
Shuyun Lim

Corresponding author

Correspondence to Longzheng Cai.

Editor information

Editors and Affiliations

College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, China
Jeng-Shyang Pan
Department of Computer Science, Electrical Engineering and Mathematical Sciences, Western Norway University of Applied Sciences, Bergen, Norway
Jerry Chun-Wei Lin
College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, China
Yongquan Liang
School of Computer Science, Engineering and Mathematics, Flinders University, Bedford Park, Australia
Shu-Chuan Chu

Ethics declarations

This research does not involve human participants and/or animals.

About this paper

Cite this paper

Cai, L., Lim, S., Wang, X., Tang, L. (2020). Can Deep Neural Networks Learn Broad Semantic Concepts of Images?. In: Pan, JS., Lin, JW., Liang, Y., Chu, SC. (eds) Genetic and Evolutionary Computing. ICGEC 2019. Advances in Intelligent Systems and Computing, vol 1107. Springer, Singapore. https://doi.org/10.1007/978-981-15-3308-2_26

DOI: https://doi.org/10.1007/978-981-15-3308-2_26
Published: 13 March 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-3307-5
Online ISBN: 978-981-15-3308-2
eBook Packages: Intelligent Technologies and Robotics (R0)
