Part of the book series: Intelligent Systems Reference Library ((ISRL,volume 172))

Abstract

A convolutional neural network (CNN) is a special type of multilayer neural network, or deep learning architecture, inspired by the visual system of living beings. CNNs are well suited to many tasks in computer vision and natural language processing. This chapter focuses on a detailed discussion of the basic components of a CNN. It also gives an overview of the foundations of CNNs, recent advancements, and some major application areas.

Notes

  1. Notable thing: a CNN uses a set of multiple filters in each convolutional layer so that each filter can extract a different type of feature.

  2. “The pooling operation used in convolutional neural networks is a big mistake and the fact that it works so well is a disaster.” –Geoffrey Hinton.
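
Note 1 can be illustrated with a minimal sketch in plain Python. It is not taken from the chapter: the helper `conv2d_valid`, the toy image, and the two hand-picked filters are all assumptions made for illustration, whereas in a trained CNN the filter weights would be learned. The sketch applies two filters to the same input and shows that each produces its own feature map.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    As in most deep learning libraries, this is cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 4x4 image: dark left half, bright right half (one vertical edge).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Two hypothetical, hand-picked 2x2 filters (not learned weights).
filters = {
    "vertical_edge": [[-1, 1], [-1, 1]],
    "horizontal_edge": [[-1, -1], [1, 1]],
}

# One feature map per filter, all computed from the same input.
feature_maps = {name: conv2d_valid(image, k) for name, k in filters.items()}

for name, fmap in feature_maps.items():
    print(name, fmap)
```

On this input the vertical-edge filter responds only in the middle column, where the brightness changes, while the horizontal-edge filter gives zeros everywhere because the image has no horizontal edge.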

Author information

Corresponding author

Correspondence to Abu Sufian.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Ghosh, A., Sufian, A., Sultana, F., Chakrabarti, A., De, D. (2020). Fundamental Concepts of Convolutional Neural Network. In: Balas, V., Kumar, R., Srivastava, R. (eds) Recent Trends and Advances in Artificial Intelligence and Internet of Things. Intelligent Systems Reference Library, vol 172. Springer, Cham. https://doi.org/10.1007/978-3-030-32644-9_36
