Can Deep Neural Networks Learn Broad Semantic Concepts of Images?

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1107)

Abstract

Much research uses deep neural networks (DNNs) to learn high-level semantic concepts of images, such as categories, from low-level visual properties. Images carry more semantic concepts than categories alone, such as whether two images complement each other, serve the same purpose, or occur in the same place or situation. In this work, we experimentally evaluate whether DNNs can learn these broad semantic concepts of images. We perform experiments with the POPORO image dataset. Our results show that, overall, DNNs have limited capability to learn these broad semantic concepts from the visual features of images. Among the DNN models we tested, Inception models and their variants learn broad semantic concepts of images better than VGG, ResNet, and DenseNet models. We think one of the main reasons for the weak performance in our experiments is that the POPORO dataset used in this work is too small for DNN models. Large image datasets with rich and broad semantic labels and measures are the key to successful research in this area.

Author information

Corresponding author

Correspondence to Longzheng Cai.

Ethics declarations

This research does not involve human participants and/or animals.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Cai, L., Lim, S., Wang, X., Tang, L. (2020). Can Deep Neural Networks Learn Broad Semantic Concepts of Images? In: Pan, JS., Lin, JW., Liang, Y., Chu, SC. (eds) Genetic and Evolutionary Computing. ICGEC 2019. Advances in Intelligent Systems and Computing, vol 1107. Springer, Singapore. https://doi.org/10.1007/978-981-15-3308-2_26
