Eigen-CAM: Visual Explanations for Deep Convolutional Neural Networks

Abstract

The adoption of deep convolutional neural networks (CNN) is growing exponentially across a wide variety of applications due to performance that equals or exceeds that of classical machine learning and, in some cases, humans. However, such models are difficult to interpret, prone to overfitting, and hard to diagnose when they fail. An increasing body of literature, such as class activation maps (CAM), focuses on understanding what representations or features a model learns from the data. This paper presents Eigen-CAM, a novel method that enhances explanations of CNN predictions by visualizing the principal components of the learned representations from the convolutional layers. Eigen-CAM is intuitive, easy to use, computationally efficient, and does not require correct classification by the model. It works with all CNN models without the need to modify layers or retrain. For the task of generating visual explanations of CNN predictions, compared with state-of-the-art methods, Eigen-CAM is more consistent, class discriminative, and robust against classification errors made by the dense layers. Empirical analyses and comparisons with the best state-of-the-art methods show up to 12% improvement in weakly-supervised object localization, an average of 13% improvement in weakly-supervised segmentation, and at least 15% improvement in generic object proposal.

Introduction

Convolutional neural networks (CNN) are ubiquitous. They are designed to learn representations using deep architectures consisting of multiple building blocks, such as convolution layers, pooling layers, and fully connected decision layers. They have shown performance equal to or surpassing humans in solving visual tasks such as image recognition [1,2,3], object localization [4,5,6,7], image captioning [8,9,10,11], semantic segmentation [12,13,14,15], 3D action recognition [16, 17], and visual question answering [18, 19].

Arguably, the deeper the network architecture, the better the accuracy and generalization. A deeper model requires an exponentially growing number of parameters (i.e., hundreds of layers and millions of parameters) to learn complex visual tasks. For example, a one-layer CNN is capable of learning simple features such as edges, a two-layer CNN can learn texture features, a three-layer CNN can learn shape features, and so on. AlexNet, one of the first deep CNNs to achieve breakthrough performance on ImageNet, consists of eight layers and 62 M parameters [2]. Nowadays, we have models that exceed 150 layers, such as ResNet-152 [1], and models with more than 144 million trainable parameters, as is the case with VGG-19 [20]. Besides depth, non-linear elements and techniques such as activation functions, dropout, MaxPooling, and regularization enable CNN models to learn complex representations.

The ability to learn complex representations translates into higher accuracy and better generalization; on the other hand, it makes it harder to decode model failures and to make sense of the learned representations. The inability to interpret model predictions and diagnose model failures remains a challenge for both designers and end-users.

There is an increasing demand for tools to interpret deep learning (DL) models in general and CNN-based models in particular. Desirable properties of explanations include precise, class-discriminative visual explanations consistent with the ground truth, consistency in the presence of multiple objects and complex backgrounds, and class-independent explanations. Figure 1 shows examples illustrating the ability of Eigen-CAM to generate visual explanations for multiple objects in an image.

Fig. 1

Eigen-CAM visualizations computed for three sample images. a, d, g Original images [32]. b, c, e, f, h, i Class activations computed using Eigen-CAM; middle-column images b, e, h show explanations for CNN predictions obtained using the first principal component, and right-column images c, f, i show explanations obtained using the second principal component

This paper extends a preliminary version of this work presented in [21]. We introduce the Eigen-Salience map, combine it with Eigen-CAM to obtain better and more consistent explanations, expand the performance evaluation to different tasks, and present a framework for decoding prediction failures. The main contributions are:

  • We present a simple, intuitive Eigen-CAM to obtain detailed CAMs and salience features based on the outputs of lower and higher convolutional layers.

  • We demonstrate the consistency and robustness of the Eigen-CAM over state-of-the-art methods using the following tasks:

    • Weakly supervised localization

    • Weakly supervised salient object detection

    • Generic object proposal

  • We present an extensive error analysis that can troubleshoot or trace the source of errors in the CNN architecture and verify the annotation process.

The rest of the paper is structured as follows. In “Related Work”, we review the relevant literature to provide research context. Following this, we present the specifics of the proposed Eigen-CAM in “Proposed Approach”. Subsequently, in “CNN Prediction Explanations”, we evaluate CNN prediction explanations generated using Eigen-CAM against Grad-CAM and CNN-Fixations. We present the outcomes of the empirical evaluation of Eigen-CAM and compare its performance against state-of-the-art methods across different localization applications in “Localization Applications”. Results of the error analysis to trace the source of errors in the CNN architecture and to verify the annotation process are presented in “Analysis of CNN Prediction Errors”. Finally, “Conclusions” concludes the paper with a few remarks on lessons learned and future directions.

Related Work

In general, all methods that provide visual explanations for CNN predictions weigh each pixel in the input image to reflect its relative importance to a specific class label.

Class non-discriminative methods calculate the gradient of the SoftMax layer with respect to each pixel in the input image and use these gradients to represent the salience map. In contrast, class-discriminative methods identify each pixel's relative importance in the input space and weigh each pixel based on a specific class prediction made at the CNN output.

In the actual process of calculating weights, both class-discriminative and class non-discriminative visualizations implement a two-step procedure. In the first step, all tools use CNN forward propagation (a forward pass) to propagate input data from the input to the model's output and compute the output at a particular layer or at the SoftMax layer. In the second step, all methods use backpropagation to calculate the weights, starting from the first step's output, using different mechanisms. The mechanism used in the second step is what distinguishes class-discriminative from class non-discriminative visualization, and also what distinguishes the different methods in the literature.

Among class non-discriminative tools, several methods rely on backpropagating gradients to locate salient features. One of the first efforts under this class of visualization is the saliency map [22].

The saliency map weights the image pixels at the input by backpropagated gradients, computed from the SoftMax layer's input with respect to the input pixels. Deconvnet [23], on the other hand, performs the same function with one difference in the way it handles the nonlinearity of the activation function: Deconvnet suppresses negative gradients to enhance visualization. A more recent effort, Guided Backpropagation [24], adds further constraints on which gradients are allowed in the backpropagation. These additional constraints enable Guided Backpropagation to outperform Deconvnet and the saliency map.

The second and more beneficial class of visualization, class-discriminative tools, provides a more intuitive visualization that explains CNN predictions and can provide localization and segmentation functionality.

Tools in this class started with class activation maps (CAM) [25]. The CAM method computes the dot product of the weights extracted from the SoftMax layer and the feature maps to produce the class activation map. To implement CAM, the user must modify the model by replacing the last MaxPooling layer with a global average pooling (GAP) layer and eliminating the dense layers.

CAM is a simple yet intuitive idea that inspired methods like Grad-CAM [26] and Grad-CAM++ [27]. Grad-CAM uses the gradients of the CNN output to weight the features extracted at the last convolutional layer. Using gradients enables Grad-CAM to work with any CNN model without modifying the architecture. Grad-CAM++ improves the gradient weighting mechanism to account for different feature sizes and, by that modification, improves visualization for multiple object occurrences.

Besides the gradient-based approach, other class-discriminative methods adopt a relevance-score-based approach to weight features. For example, DeepLIFT [28] and layer-wise relevance propagation [29] backpropagate the probabilities at the output of the SoftMax layer to the last convolutional layer or to the input layer to determine the class-discriminative features.

A more recent method named CNN-Fixations was reported in [30]; it is inspired by biological vision and loosely analogous to human eye fixations. CNN-Fixations computes binary relevance scores in layer i for each activation in the higher layer j and later combines them to get a real-valued map at layer i. The method creates a model-specific memorization map, designed to keep track of maximum activations using the Hadamard product to discard irrelevant information. CNN-Fixations creates visual explanations by backtracking activations from the decision layer to the input image pixel space to locate the discriminative features observed during the forward pass, rather than resorting to operations such as the gradient computation in saliency maps, CAM, and Grad-CAM. In summary, CNN-Fixations keeps the positive feature correlations and discards the negative ones to create a real-valued relevance of each neuron activation to the predicted inference.

In CNN-Fixations, the higher the probability of the predicted class, the better the fixations; as the predicted class probability drops toward that of the second prediction, the fixations explaining the top two objects become increasingly confused or overlapping.

In general, all class-discriminative methods are class-dependent because they rely on the class score to backpropagate their weighting mechanism. For such methods to work effectively, one must implicitly assume a correct decision at the CNN output. A wrong decision leads to an erroneous or distorted CAM, as shown in Figs. 5 and 6. Furthermore, backpropagating gradients requires computational and memory resources.

We present Eigen-CAM, which is intuitive and compatible with all CNN models without any model modifications, to address the shortcomings mentioned above.

Proposed Approach

CNN-based deep neural networks outperform all other methods in a range of computer vision tasks [31]. A CNN model's basic structure consists of a convolutional network, represented by layers of filters of varying sizes that learn the representations, and a classification network, exemplified by dense layers, that differentiates the learned representations.

The first convolutional layer's role is to learn lower-level spatial representations (features) such as edges and corners. The hierarchical structure allows subsequent convolutional layers to learn higher levels of abstraction and possibly features that carry semantic meaning at a categorical level, supporting tasks such as classification and annotation of objects.

At the top of the hierarchy (the last convolutional layer), learned features proceed to the classification network (the dense layers of the CNN model). The classification network's role, in turn, is to learn the decision boundary that categorizes objects or attaches semantic meaning to the data. To understand the need for and intuition behind Eigen-CAM, consider the following observations.

Observation 1 Previously reported visualization methods mentioned in the related work section depend on backpropagation, tracing information such as gradients [26, 27], relevance scores [29], or maximum activation locations [30] from the output of the CNN to the desired space to generate visual explanations. These methods implicitly assume “correct decisions” at the model's output layer, which is not always true. The failure of this assumption can lead to the incorrect or distorted explanations shown in Fig. 5.

Observation 2 The CAM method outlines the notion of a linear combination to generate visual explanations of CNN predictions. However, the CAM method requires model modification to work [25]. To eliminate the need to modify the CNN architecture, Grad-CAM [26] and Grad-CAM++ [27] use the extracted features as variables and the backpropagated positive gradients of the class score with respect to the last convolutional layer as coefficients. In other words, the linear combinations used to generate visual explanations include only positive terms, with the ReLU function suppressing negative terms.

Observation 3 Grad-CAM and Grad-CAM++ have better explanation capability than CAM. However, the computed gradients are noisy in nature and depend on the order of approximation, which is dictated by the number of dense layers in the CNN model. Tuning the approximation to reduce noise requires changing the model architecture by increasing or decreasing the number of dense layers.

Observation 4 The learning process in any CNN classifier resembles a mapping function. The mapping is done by a transformation matrix (the model) that captures salient features from images using convolutional layers. The optimizers play a crucial role in this learning process: they adjust the weights of the filters in the convolutional layers to learn salient features and the weights of the fully connected layers to learn the non-linear decision boundary. Based on this observation, the hierarchical representation mapped onto the last convolutional layer can by itself provide visual explanations for CNN predictions.

  1. A.

    Eigen class activation maps

    The unparalleled CNN performance achieved on various computer vision tasks could not have been achieved through complete memorization. We can therefore assume that the feature extraction network in the CNN architecture (convolutional and MaxPooling layers) selects and preserves relevant features and smooths out irrelevant or redundant ones.

    The relevant question is which features survive all the local linear transformations and remain relevant, that is, stay aligned with the direction of maximum variation. In other words, which features lie along the principal components of the learned representation?

    Let \(I \in {\mathbb{R}}^{i \times j}\) represent the input image of size (i × j), and let \({W}_{L=k}\) represent the combined weight matrix of the first k layers, of size (m × n).

The class-activated output is the image I projected onto the last convolutional layer L = k and is given by

$$O_{L = k} = W_{L = k}^{T} I$$
(1)

Factorizing \({O}_{L=k}\) using singular value decomposition to compute the principal components of \({O}_{L=k}\) gives

$$O_{L = k} = U\Sigma V^{T} ,$$
(2)

where U is an \(M\times M\) orthogonal encoding matrix whose columns are the left singular vectors, \(\Sigma\) is an \(M \times N\) diagonal matrix with the singular values along the diagonal, and V is an \(N\times N\) orthogonal matrix whose columns are the right singular vectors.

The class activation map, \(L_{\text{Eigen-CAM}}\), is given by the projection of \({O}_{L=k}\) onto the first eigenvector

$$L_{{\text{Eigen - CAM}}} = O_{L = k} V_{1} ,$$
(3)

where \(V_{1}\) is the first column of the V matrix, i.e., its first eigenvector (the first right singular vector).
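To make Eqs. (1)–(3) concrete, the following is a minimal PyTorch sketch; the choice of VGG-16, the preprocessing, the input file name, and the final normalization are illustrative assumptions rather than part of the method's definition.

import torch
from torchvision import models, transforms
from PIL import Image

# Minimal Eigen-CAM sketch following Eqs. (1)-(3): project the output of the
# convolutional part of the network onto its first right singular vector.
model = models.vgg16(pretrained=True).eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
x = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)  # hypothetical input image

with torch.no_grad():
    A = model.features(x).squeeze(0)            # O_{L=k}: (C, H, W), here (512, 7, 7)

O = A.permute(1, 2, 0).reshape(-1, A.shape[0])  # flatten spatial locations: (H*W, C)
U, S, Vh = torch.linalg.svd(O, full_matrices=False)   # Eq. (2): O = U S V^T
cam = (O @ Vh[0]).reshape(A.shape[1], A.shape[2])     # Eq. (3): project onto V_1

if cam.sum() < 0:                               # the sign of a singular vector is arbitrary
    cam = -cam
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1] for display
# 'cam' can now be upsampled to the input size and overlaid on the image.

Because only a forward pass and one singular value decomposition of a small matrix are involved, the cost is negligible compared with methods that require backpropagating gradients.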

  2. B.

    Eigen-Salience maps

    Salience maps are a general method used to visualize each pixel’s unique quality, independent of the predicted class. The method functions as a transformation in which all data in the input space are re-represented using a lower number of dimensions. This transformation removes redundant and irrelevant features and segregates class-relevant features from the background. With this logic, principal component analysis (PCA) is the natural choice to achieve the goal. The dimension reduction, however, comes at the price of losing a higher level of detail, which can be adjusted using the explained variance ratio.

    In theory, we can calculate Eigen-Salience maps at every structural level of the CNN model (different layers). At higher levels, the per-feature empirical mean produces a coarse-level visualization of all essential features. In contrast, at a closer distance from the input image (lower-level convolutional layers), the per-feature empirical mean produces a higher-resolution visualization, as shown in Fig. 2b, f. A minimal sketch of one possible implementation is given at the end of this subsection.
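The sketch below is one plausible reading of the description above, not the paper's exact construction: it keeps only the leading principal components of the channel activations at a lower convolutional layer and takes a per-feature mean; the layer index, number of components, and function name are illustrative assumptions.

import torch
from torchvision import models

# Hedged sketch of an Eigen-Salience map computed at a lower convolutional layer.
model = models.vgg16(pretrained=True).eval()

def eigen_salience(x, layer_idx=8, n_components=3):
    """x: preprocessed image tensor of shape (1, 3, H, W); returns an (h, w) map."""
    with torch.no_grad():
        feats = x
        for i, layer in enumerate(model.features):
            feats = layer(feats)
            if i == layer_idx:                       # stop at a lower-level layer
                break
    A = feats.squeeze(0)                             # (C, h, w)
    O = A.permute(1, 2, 0).reshape(-1, A.shape[0])   # each spatial location is a sample: (h*w, C)
    O = O - O.mean(dim=0, keepdim=True)              # center the features (PCA convention)
    U, S, Vh = torch.linalg.svd(O, full_matrices=False)
    # Keep only the leading components; n_components controls the retained variance
    O_reduced = O @ Vh[:n_components].T @ Vh[:n_components]
    # Per-feature empirical mean over channels gives a spatial salience map
    sal = O_reduced.abs().mean(dim=1).reshape(A.shape[1], A.shape[2])
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-8)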

  3. C.

    Combined Eigen visualization

Class activation maps produce a coarse visualization with a single advantage: locating class-discriminative features. Salience maps, on the other hand, are capable of producing high-resolution visualizations [26, 27]. By combining the two methods, we can obtain a better visualization.

Fusing Eigen-CAM with Eigen-Salience maps can produce a higher-resolution class-discriminative visualization. To fuse Eigen-CAM with Eigen-Salience maps, we use a simple pointwise multiplication (Fig. 2d, h).
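A minimal sketch of this fusion step, reusing the hypothetical cam and sal tensors from the two sketches above:

import torch.nn.functional as F

# Fuse the coarse Eigen-CAM map with the higher-resolution Eigen-Salience map by
# pointwise (Hadamard) multiplication; 'cam' and 'sal' come from the previous
# sketches and are assumed to be normalized to [0, 1].
cam_up = F.interpolate(cam[None, None], size=sal.shape, mode="bilinear",
                       align_corners=False)[0, 0]   # match the salience map's resolution
fused = cam_up * sal                                # pointwise multiplication
fused = (fused - fused.min()) / (fused.max() - fused.min() + 1e-8)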

Fig. 2

CNN visualizations computed for misclassified sample images from the ILSVRC validation set. a, e, i Original images. Second column from the left (b, f, j) shows class activation maps computed using Grad-CAM. Second column from the right (c, g, k) shows class activation maps produced using CNN-Fixations. First column from the right (d, h, l) shows class activation maps computed using Eigen-CAM. The green box represents the ground truth and the orange box represents the top-1 classification result using VGG-16

CNN Prediction Explanations

We performed empirical analyses using the ILSVRC 2014 benchmark dataset (ImageNet) [32] to demonstrate the efficacy of Eigen-CAM in generating visual explanations for correctly classified objects. We show five different examples in Fig. 3, illustrating the precision of the discriminative regions in different scenarios, and compare the results with CNN-Fixations and Grad-CAM. In particular, we consider scenarios such as single and multiple object detection, detecting objects in the foreground or the background, and detecting objects in images with a crowded or plain background.

Fig. 3

CNN visualizations computed for three example images from the ILSVRC validation set. a, e, i Original images. Second column from the left (b, f, j) shows class activation maps computed using Grad-CAM. Second column from the right (c, g, k) shows class activation maps produced using CNN-Fixations. First column from the right (d, h, l) shows class activation maps computed using Eigen-CAM. The green box represents the ground truth and the orange box represents the top-1 classification result using VGG-16

It is easy to see from Fig. 3 that Eigen-CAM shows a near-perfect match with the ground-truth shape for the “Unicycle” and “Hay” examples compared to the other methods. Similarly, the “Strawberry” example shows better localization by Eigen-CAM in the presence of background. Similar observations hold for localizing multiple objects within a single image (Power Drill).

The Bighorn example shows three regions detected by Eigen-CAM that are consistent with the ground truth, compared to the four regions detected by CNN-Fixations and none detected by Grad-CAM.

Localization Applications

In this section, we evaluate Eigen-CAM against state-of-the-art methods on the tasks of weakly-supervised object localization, weakly-supervised segmentation, and generic object proposal.

  1. A.

    Weakly-supervised localization

    In weakly-supervised localization, the reported literature uses different techniques to localize objects without training on bounding boxes. Instead, they use CNN models trained only for classification tasks to localize objects.

    Figures 1 and 2 show explanations for CNN predictions and evidence of accurate localization. In this subsection, we evaluate the localization capability of Eigen-CAM on the ILSVRC 2014 validation dataset in the context of “image classification”.

    To localize different objects using Eigen-CAM, we use the forward pass of a single image at a time to obtain an explanation for the CNN prediction. Eigen-CAM does not require more than the forward pass, unlike all other methods, which use backpropagation starting from the class label.

    In Eigen-CAM, to localize an object in an image using a particular CNN model, we feed the image to the model. In a forward pass, we take the last convolutional layer's output and, using the procedure described in the proposed approach, generate an explanation in the form of a heat map. We scale the heat map values to the 0–255 range, reshape the map to the original image size, and binarize it based on thresholds of 5–15% of the maximum value of the heat map (see the sketch at the end of this subsection). Adaptive thresholds are used to account for variances in the feature sizes produced by different models. Binarizing facilitates producing connected segments. To generate a bounding box, we use the largest of the resulting segments.

    To evaluate weakly-supervised object localization using Eigen-CAM against state-of-the-art methods, we implemented the experiment from [26, 30]. We used five popular models, namely VGG-16 [20], AlexNet [2], ResNet-101 [1], Inception-V1 (aka GoogLeNet) [33], and DenseNet-121 [34], all pre-trained on the ILSVRC dataset. We used Eigen-CAM to localize the top-1 object for each image in the validation set of the ILSVRC dataset, a total of 50,000 objects.

    We report the results in Table 1 in the form of the error rate (100 minus accuracy) of the Intersection over Union (IOU) metric for the top-1 recognition prediction. The metric requires a minimum of 0.5 IOU between the ground-truth bounding box and the predicted bounding box. Eigen-CAM, being class-independent, does not require a correct prediction by the CNN model, as other methods do.

    The results presented in Table 1 clearly show that Eigen-CAM outperforms all other methods on the task of weakly-supervised localization, except with the ResNet-101 model. One possible explanation is the input preprocessing used to train ResNet-101 on the ILSVRC dataset. The ResNet-101 model requires all images to be center-cropped as a resizing step before they are fed to the model for prediction, while the object annotations (bounding boxes) are created on the full-size images. Eigen-CAM works fine whenever the object sits at the center of the image; when the object sits toward the image boundary, the principal components of the extracted features are affected, and hence so is Eigen-CAM's localization capability.
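The box-extraction step described above can be sketched as follows; it assumes the normalized Eigen-CAM heat map cam from the earlier sketch, NumPy/SciPy for connected-component labeling, and an illustrative image size and 15% threshold (within the 5–15% range mentioned above).

import numpy as np
import torch.nn.functional as F
from scipy import ndimage

# Hedged sketch: turn an Eigen-CAM heat map into a single bounding box.
H_img, W_img = 375, 500                              # original image size (example values)
heat = F.interpolate(cam[None, None], size=(H_img, W_img),
                     mode="bilinear", align_corners=False)[0, 0].numpy()
heat = np.uint8(255 * heat)                          # scale to the 0-255 range

binary = heat >= 0.15 * heat.max()                   # binarize at 15% of the maximum value
labels, n = ndimage.label(binary)                    # connected segments
sizes = ndimage.sum(binary, labels, range(1, n + 1))
largest = (labels == (np.argmax(sizes) + 1))         # keep the largest segment

ys, xs = np.where(largest)
box = (xs.min(), ys.min(), xs.max(), ys.max())       # predicted bounding box (x_min, y_min, x_max, y_max)
# The localization counts as correct when this box overlaps the ground-truth box with IOU >= 0.5.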

  2. B.

    Weakly-supervised segmentation

    Besides generating visual explanations for CNN prediction and localizing objects, we can use Eigen-CAM explanations for salient object segmentation. In this subsection, we demonstrate the effectiveness of Eigen-CAM in performing weakly-supervised segmentation. In particular, we used the VGG-16 model to test our ideas and compare them with other methods.

    Following [35] and [30], we fine-tuned the VGG-16 model to predict the three classes present in the Graz-2 dataset [36], namely bike, person, and car. Each class has 150 example images for training and the same number for testing, a total of 900 images. In fine-tuning the VGG-16 model (trained on the ILSVRC dataset), we freeze the learning process in the convolutional layers, allow learning in the dense layers only, and modify the SoftMax layer output to three nodes to match the number of classes in the Graz-2 dataset (a minimal sketch of this setup is given at the end of this subsection). We use this experiment to compare against the results presented in [30].

    For a fair comparison with CAM and Grad-CAM (which rely on the dense layers to generate explanations) and CNN-Fixations (which requires all layers) in segmenting salient objects, we experiment with two scenarios for weakly-supervised segmentation. First, we test using the VGG-16 model trained on the ILSVRC dataset (no fine-tuning) to segment the salient object. In the second scenario, we use the VGG-16 model fine-tuned, with optimized hyperparameters, on the Graz-2 dataset and use this new model to segment the salient objects.

    In both scenarios, we use Eigen-CAM to generate explanations in the form of a heat map. We binarize the heat maps based on a threshold determined empirically on the training dataset, and we use the same threshold values to generate heat maps for the test dataset. We then compute pixel-wise precision at the equal error rate (EER) against the ground truth given by the segmented objects in the Graz-2 dataset.

    Table 2 summarizes the results obtained for the weakly-supervised segmentation task. Rows 1–5 in Table 2 are taken from [30]. To establish the testing protocol and verify the results presented in [30], we reproduced the Grad-CAM results. We then used the established protocol to evaluate Eigen-CAM performance and compared the results with state-of-the-art methods.

    Table 2 shows that Eigen-CAM outperforms all previously reported methods. We observed an improvement of 7 points in mean pixel-wise precision at EER with the original VGG-16 model trained and tested on the ILSVRC dataset. We see a much larger improvement (14 points in mean pixel-level precision at EER) when the VGG-16 model is trained and tested on the Graz-2 dataset.
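For reference, a minimal sketch of the fine-tuning setup described above, assuming PyTorch/torchvision; the optimizer and learning rate are illustrative choices, and data loading and the training loop are omitted.

import torch
import torch.nn as nn
from torchvision import models

# Hedged sketch: freeze the convolutional layers of an ILSVRC-pretrained VGG-16
# and retrain only the dense layers, with a 3-way output for the Graz-2 classes
# (bike, person, car).
model = models.vgg16(pretrained=True)

for param in model.features.parameters():   # freeze the convolutional layers
    param.requires_grad = False

model.classifier[6] = nn.Linear(4096, 3)    # replace the final layer: 3 output nodes

optimizer = torch.optim.SGD(
    filter(lambda p: p.requires_grad, model.parameters()),  # train dense layers only
    lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()
# A standard training loop over the 450 Graz-2 training images would follow here.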

  3. C.

    Generic object proposal

Table 1 Top-1 recognition prediction error rates based on 0.5 IOU for the weakly-supervised localization task of different visualization methods on the ILSVRC validation set
Table 2 Performance of salient object segmentation for different visualization methods

In weakly-supervised localization, we demonstrated the utility of Eigen-CAM in generating explanations for localizing and segmenting multiple objects. Generic-object-proposal deep neural networks reported in [37,38,39] are capable of localizing hundreds of class-agnostic proposals in a single image. However, most images contain at most a few essential objects. In this subsection, we use Eigen-CAM to produce a single proposal representing the dominant object and compare its performance with the state of the art.

To demonstrate the ability of CNNs to serve as generic object proposal generators, we use Eigen-CAM to localize the best proposal extracted by the convolutional layers of the CNN. We adopted the GoogLeNet model trained on the ILSVRC dataset to generate proposals for the PASCAL VOC-2007 dataset. Note that the GoogLeNet model was trained for a classification task only, and the target categories in the ILSVRC and PASCAL VOC-2007 [40] datasets are disjoint sets.

We evaluated the performance of Eigen-CAM as a generic object proposal generator in terms of mean average precision and mean average recall, similar to the methods reported in [28, 34, 36]. We used each image in the PASCAL VOC-2007 test dataset to compare the results. For each proposal, if the IOU between the ground truth and the proposal bounding box is above 50%, the proposal is counted as a true positive; it is a false positive if the IOU is below 50%, and a false negative if there is no intersection between the generated proposal and the ground truth. Since each image has at least one object, there are no true negatives. Mean average recall and precision are computed as in the PASCAL VOC-2007 benchmark [40]. This counting rule is summarized in the sketch below.
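The sketch below restates the counting rule; the box format (x_min, y_min, x_max, y_max) and the helper names are illustrative assumptions.

def iou(a, b):
    """IOU of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def score_proposal(proposal, gt_boxes):
    """Classify one proposal against the ground-truth boxes of an image."""
    best = max(iou(proposal, gt) for gt in gt_boxes)
    if best >= 0.5:
        return "TP"   # true positive: IOU of at least 0.5
    elif best > 0.0:
        return "FP"   # false positive: some overlap, but IOU below 0.5
    else:
        return "FN"   # false negative: no intersection with any ground truth

Precision and recall are then aggregated from these per-image counts following the PASCAL VOC-2007 benchmark protocol.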

Table 3 shows the results obtained using Eigen-CAM and a comparison with state-of-the-art methods for generating a single object proposal. The results show that our method outperforms all previously reported methods, with a 15% improvement in mean average recall and a 43% improvement in mean average precision. The class-independent nature of Eigen-CAM accounts for the substantial improvement achieved in the task of generic object proposal generation.

Table 3 Generic object proposal performance. Models were trained on the ILSVRC 2015 dataset for the classification task and evaluated on the PASCAL VOC-2007 validation set; results represent mean average recall and precision based on 0.5 IOU; boldface numbers represent the best results among the different methods

Analysis of CNN Prediction Errors

This section analyzes different scenarios that lead to recognition failure, whether the failure is caused by noise, as in the case of adversarial examples, by the CNN model itself, or by annotation errors (mislabeled images).

  1. A.

    Adversarial examples

    In this subsection, we show how adversarial examples affect CNN classifiers. In particular, we investigate which part of a CNN model adversarial examples affect the most. We also investigate the potential effect of adversarial examples in the case of weakly-supervised localization.

    Adversarial examples represent a major vulnerability of CNNs and can affect safety-critical applications such as autonomous vehicles.

    In this study, we generate adversarial examples using small, calculated perturbations that fool the model into making a wrong classification. The subtle changes in adversarial examples are indistinguishable to human eyes. To achieve our objectives, we perturbed two examples from the ILSVRC validation dataset using the DeepFool method [41]. The original images and their perturbed copies were classified using VGG-16. Eigen-CAM, Grad-CAM, and CNN-Fixations were then used to generate explanations for the original images and their perturbed copies, as shown in Fig. 4.

    Figure 4 shows that the explanations generated using Eigen-CAM are almost identical, to human eyes, for the original and perturbed images (Fig. 4d vs. h; Fig. 4i vs. p). On the contrary, Grad-CAM (Fig. 4b vs. f; Fig. 4j vs. n) and CNN-Fixations (Fig. 4c vs. g; Fig. 4k vs. o) produce different activation maps.

    We can expect nearly identical explanations from Eigen-CAM because it is a global method that is robust against small local changes. Also, Eigen-CAM does not rely on the class label, so classification errors do not propagate into the generated explanations. In contrast, the dependence on a correct CNN prediction (class label), as in Grad-CAM and CNN-Fixations, distorts the generated explanations.

    These results on adversarial examples indicate that adversarial noise mainly affects the classification part of the CNN (the dense layers), since Eigen-CAM visualizations are independent of the dense layers. The unchanged explanations produced by Eigen-CAM demonstrate its robustness against the local changes induced by adversarial examples, unlike the other methods.

  2. B.

    Tracing CNN prediction failure

    Fig. 4

    CNN visualizations computed for five sample images from the ILSVRC validation set. a, e, i, m, q Original images. Second column from the left (b, f, j, n, r) shows class activations computed using Grad-CAM. Second column from the right (c, g, k, o, s) shows activations produced using CNN-Fixations. First column from the right (d, h, l, p, t) shows class activations computed using Eigen-CAM. The green box represents the ground truth and the top-1 classification result using VGG-16

Methods such as Grad-CAM and CNN-Fixations use visual explanations of CNN predictions to trace the prediction failures of CNN-based models. However, the results are inconsistent and often pick up background objects. In contrast, Eigen-CAM produces explanations consistent with the ground truth.

Eigen-CAM is class-independent in generating explanations; hence, it can trace errors occurring before the dense layers in the CNN model. This added traceability can help build robust CNN-based models using a two-step process consisting of design and visualization.

To gain insight into what causes prediction failures in a CNN, we need to distinguish the different error sources (i.e., errors resulting from misclassification and from mis-annotation).

To explore this traceability, we use the VGG-16 model and compare explanations generated using Grad-CAM, CNN-Fixations, and Eigen-CAM on the ILSVRC validation dataset.

In this study, we analyze errors seen in classification with and without annotation errors. In studying classification errors, we assume perfect annotation and use the convolutional layers' weights as input to all visual explanation methods. Eigen-CAM and the other methods reported in the related work section can help identify classification errors by simple visual verification of the CNN predictions against the generated explanations. Figure 5 illustrates the verification process with examples from the ILSVRC validation dataset.

Fig. 5

Eigen visualizations computed for a dog and cat image [26]. a, e Original images. Second column from the left (b, f) shows the non-discriminative visualization computed using Eigen-Salience maps. Second column from the right (c, g) shows CNN prediction explanations produced using Eigen-CAM. First column from the right (d, h) shows class-discriminative activations computed by fusing Eigen-Salience maps and Eigen-CAM. Images in the first row correspond to the first Eigen component and those in the second row to the second Eigen component

Figure 5 shows that Eigen-CAM produces visual explanations matching the ground truth, and the heat map falls on the right object despite the presence of other objects.

The ground truth for the image shown in Fig. 5a is Cradle, but VGG-16 predicted it as Grand piano. Despite the classification error, the methods still provide visual explanations; however, from Fig. 5b–d, it can be observed that the heat map produced by Eigen-CAM is right on the target, which is not true for Grad-CAM and CNN-Fixations (their heat maps fall on the image background).

In the second example, the ground truth for the image shown in Fig. 5e is Ant. VGG-16 predicted it as a Tick, with a large coin in the background that is more prominent than the object. In this example, Eigen-CAM produced a perfect visual explanation (Fig. 5h) despite the classification error, while the mismatched explanations produced by CNN-Fixations and Grad-CAM detect two objects (Fig. 5f, g). In the third example, the ground truth for the image shown in Fig. 5i is Artichoke, which VGG-16 predicted as Cup. Results similar to those in Fig. 5f–h were observed (Fig. 5j–l).

In studying annotation errors, we rely on visual verification among the generated explanations, the classification results, and the ground truth for images with single or multiple, similar or different objects. Figure 6 illustrates the verification process with examples from the ILSVRC validation dataset classified using VGG-16.

Fig. 6

CNN visualizations computed for two sample images from the ILSVRC validation set with adversarial noise. First column from the left shows the original images and their perturbed copies: original images are (a, i), perturbed copies are (e, m). Second column from the left (b, f, j, n) shows class activation maps computed using Grad-CAM. Second column from the right (c, g, k, o) shows activation maps produced using CNN-Fixations. First column from the right (d, h, l, p) shows class activation maps produced using Eigen-CAM. The green box represents the top-1 classification result using VGG-16 for the original example and the red box represents the top-1 classification result for the perturbed example

The ground truth for the image shown in Fig. 6a is “Chihuahua”, but VGG-16 predicted it as “Norwich terrier”. Visual inspection shows that the classification result (“Norwich terrier”) is correct and the annotation is wrong. Since the classification result is correct, all methods provided a reasonable visual explanation matching it. However, from Fig. 6b–d, it can be observed that the heat map produced by Eigen-CAM detected the presence of three “Norwich terriers”, and the heat map is right on the targets, unlike Grad-CAM and CNN-Fixations.

In the second example, the ground truth for the image shown in Fig. 6e is “Toy terrier”, whereas the VGG-16 top-1 prediction is “Ibizan hound”. The image contains two objects (an “Ibizan hound” and a stuffed toy without a head). Human visual inspection shows that the classification result (Ibizan hound) is correct. Results similar to those in Fig. 6b–d were observed (Fig. 6f–h).

In the third example, the ground truth for the image shown in Fig. 6i is “Ibizan hound”, while the VGG-16 top-1 prediction is “Dogsled”. The image in Fig. 6i shows neither an “Ibizan hound” nor a “Dogsled”, so both the classification result and the ground truth are wrong. The visual explanations generated using Eigen-CAM highlight the collie, CNN-Fixations highlights both dogs, and Grad-CAM highlights the background.

Conclusions

This paper presents an intuitive and user-friendly tool capable of providing insight into the inner workings of CNNs through visual explanations of the learned representations and accurate visual explanations of model predictions. In addition, Eigen-CAM is effective in weakly-supervised localization, weakly-supervised segmentation, generic object proposal, and the analysis of CNN prediction errors.

Empirical analyses on different tasks show that Eigen-CAM provides an enhanced and accurate visual explanation of CNN predictions and is robust against classification errors made by the CNN models, annotation errors, and the presence of adversarial noise. It can generate prediction explanations from any CNN-based model without the need for model modification.

Comparison with state-of-the-art methods shows significant improvements: up to 12% in weakly-supervised object localization over several models trained for image classification on the ILSVRC dataset, an average of 13% in weakly-supervised segmentation for a VGG-16 model trained to segment three objects in the Graz-2 dataset, and at least 15% in the task of generic object proposal using a GoogLeNet model trained on the ILSVRC dataset to localize objects from PASCAL VOC-2007.

Finally, we believe that Eigen-CAM and similar methods could be used to identify models that overfit (memorize the data) as well as models that learn genuine patterns, by simple visualization of predictions from different models trained on the same data. This capability is thought-provoking and merits further investigation and implementation.

References

  1. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: IEEE conference on computer vision and pattern recognition (CVPR). 2016. p. 770–8.
  2. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Commun ACM. 2017. https://doi.org/10.1145/3065386.
  3. Wang Q, Li Q, Li X. Hyperspectral image super-resolution using spectrum and feature context. IEEE Trans Industr Electron. 2020. https://doi.org/10.1109/TIE.2020.3038096.
  4. Girshick R. Fast R-CNN. In: Proceedings of the IEEE international conference on computer vision. 2015. p. 1440–8.
  5. Ren S, He K, Girshick R, Sun J. Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in neural information processing systems. 2015. p. 91–9.
  6. Liu W, Anguelov D, Erhan D, et al. SSD: single shot multibox detector. arXiv. 2016. https://doi.org/10.1007/978-3-319-46448-0_2.
  7. Wang Q, Gao J, Lin W, Li X. NWPU-crowd: a large-scale benchmark for crowd counting and localization. IEEE Trans Pattern Anal Mach Intell. 2020. https://doi.org/10.1109/TPAMI.2020.3013269.
  8. Aneja J, Deshpande A, Schwing AG. Convolutional image captioning. In: IEEE/CVF conference on computer vision and pattern recognition. 2018. p. 5561–70.
  9. Fang H, Gupta S, Iandola F, et al. From captions to visual concepts and back. In: IEEE conference on computer vision and pattern recognition (CVPR). 2015. p. 1473–82.
  10. Johnson J, Karpathy A, Fei-Fei L. DenseCap: fully convolutional localization networks for dense captioning. In: IEEE conference on computer vision and pattern recognition. 2016. p. 4565–74.
  11. Vinyals O, Toshev A, Bengio S, Erhan D. Show and tell: lessons learned from the 2015 MSCOCO image captioning challenge. IEEE Trans Pattern Anal Mach Intell. 2017. https://doi.org/10.1109/TPAMI.2016.2587640.
  12. Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation. In: Navab N, Hornegger J, Wells WM, Frangi AF, editors. Medical image computing and computer-assisted intervention – MICCAI 2015. Cham: Springer; 2015. p. 234–41.
  13. Han C, Duan Y, Tao X, Lu J. Dense convolutional networks for semantic segmentation. IEEE Access. 2019. https://doi.org/10.1109/ACCESS.2019.2908685.
  14. Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2015. p. 3431–40.
  15. Abdulla W. Mask_RCNN: mask R-CNN for object detection and instance segmentation on Keras and TensorFlow. Matterport. 2017. https://github.com/matterport/Mask_RCNN. Accessed 18 Dec 2020.
  16. Wang J, Liu Z, Chorowski J, et al. Robust 3D action recognition with random occupancy patterns. In: Fitzgibbon A, Lazebnik S, Perona P, et al., editors. Computer vision—ECCV 2012. Berlin: Springer; 2012. p. 872–85.
  17. Xia L, Chen C-C, Aggarwal JK. View invariant human action recognition using histograms of 3D joints. In: IEEE computer society conference on computer vision and pattern recognition workshops. 2012. p. 20–7.
  18. Antol S, Agrawal A, Lu J, et al. VQA: visual question answering. In: IEEE international conference on computer vision (ICCV). 2015. p. 2425–33.
  19. Anderson P, He X, Buehler C, et al. Bottom-up and top-down attention for image captioning and visual question answering. In: IEEE/CVF conference on computer vision and pattern recognition. 2018. p. 6077–86.
  20. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. 2014.
  21. Muhammad MB, Yeasin M. Eigen-CAM: class activation map using principal components. In: International joint conference on neural networks (IJCNN). 2020. p. 1–7.
  22. Simonyan K, Vedaldi A, Zisserman A. Deep inside convolutional networks: visualising image classification models and saliency maps. In: International conference on learning representations. 2014. p. 1–8.
  23. Zeiler MD, Taylor GW, Fergus R. Adaptive deconvolutional networks for mid and high level feature learning. In: IEEE international conference on computer vision. 2011. p. 2018–25.
  24. Springenberg JT, Dosovitskiy A, Brox T, Riedmiller M. Striving for simplicity: the all convolutional net. arXiv preprint arXiv:1412.6806. 2014.
  25. Zhou B, Khosla A, Lapedriza A, et al. Learning deep features for discriminative localization. In: IEEE conference on computer vision and pattern recognition (CVPR). 2016. p. 2921–9.
  26. Selvaraju RR, Cogswell M, Das A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE international conference on computer vision. 2017. p. 618–26.
  27. Chattopadhyay A, Sarkar A, Howlader P, Balasubramanian VN. Grad-CAM++: improved visual explanations for deep convolutional networks. In: IEEE winter conference on applications of computer vision (WACV). 2018. https://doi.org/10.1109/WACV.2018.00097.
  28. Shrikumar A, Greenside P, Kundaje A. Learning important features through propagating activation differences. In: Proceedings of the 34th international conference on machine learning. 2017. p. 3145–53.
  29. Bach S, Binder A, Montavon G, et al. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE. 2015;10:e0130140. https://doi.org/10.1371/journal.pone.0130140.
  30. Mopuri KR, Garg U, Venkatesh BR. CNN fixations: an unraveling approach to visualize the discriminative image regions. IEEE Trans Image Process. 2019. https://doi.org/10.1109/TIP.2018.2881920.
  31. Voulodimos A, Doulamis N, Doulamis A, Protopapadakis E. Deep learning for computer vision: a brief review. Comput Intell Neurosci. 2018. https://doi.org/10.1155/2018/7068349.
  32. Russakovsky O, Deng J, Su H, et al. ImageNet large scale visual recognition challenge. Int J Comput Vis. 2015. https://doi.org/10.1007/s11263-015-0816-y.
  33. Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions. In: IEEE conference on computer vision and pattern recognition (CVPR). 2015. p. 1–9.
  34. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. Densely connected convolutional networks. In: IEEE conference on computer vision and pattern recognition (CVPR). 2017. p. 2261–9.
  35. Cholakkal H, Johnson J, Rajan D. Backtracking ScSPM image classifier for weakly supervised top-down saliency. In: IEEE conference on computer vision and pattern recognition (CVPR). 2016. p. 5278–87.
  36. Marszałek M, Schmid C. Accurate object recognition with shape masks. Int J Comput Vis. 2012. https://doi.org/10.1007/s11263-011-0479-2.
  37. Bazzani L, Bergamo A, Anguelov D, Torresani L. Self-taught object localization with deep networks. In: IEEE winter conference on applications of computer vision (WACV). 2016. p. 1–9.
  38. Pinheiro PO, Collobert R, Dollár P. Learning to segment object candidates. In: Advances in neural information processing systems. 2015. p. 1990–8.
  39. Zhang J, Bargal SA, Lin Z, et al. Top-down neural attention by excitation backprop. Int J Comput Vis. 2018;126:1084–102.
  40. Everingham M, Gool L, Williams CK, et al. The Pascal visual object classes (VOC) challenge. Int J Comput Vis. 2010. https://doi.org/10.1007/s11263-009-0275-4.
  41. Moosavi-Dezfooli S-M, Fawzi A, Frossard P. DeepFool: a simple and accurate method to fool deep neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2016. p. 2574–82.


Acknowledgements

The authors thank Felix Havugimana for having helpful discussions while conducting the performance evaluation for this research.

Funding

The authors acknowledge the funding and research support provided by the Dept. of EECE at the Herff College of Engineering, University of Memphis.

Author information


Corresponding author

Correspondence to Mohammed Bany Muhammad.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Bany Muhammad, M., Yeasin, M. Eigen-CAM: Visual Explanations for Deep Convolutional Neural Networks. SN COMPUT. SCI. 2, 47 (2021). https://doi.org/10.1007/s42979-021-00449-3


Keywords

  • Class activation maps
  • Explainable AI
  • Salient features
  • Visual explanation of CNN
  • Weakly supervised localization