BatchNorm Decomposition for Deep Neural Network Interpretation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11507)

Abstract

Layer-wise relevance propagation (LRP) has shown potential for explaining neural network classifier decisions. In this paper, we investigate how LRP should be applied to deep neural networks that use batch normalization (BatchNorm), and show that, despite the functional simplicity of BatchNorm, several intuitive choices of published LRP rules perform poorly for a number of frequently used state-of-the-art networks. We also show that by using the \(\varepsilon\)-rule for BatchNorm layers we are able to detect training artifacts in MobileNet and layer design artifacts in ResNet. We analyze the causes of these failures in detail. We observe that some assumptions underlying the LRP decomposition rules are broken for specific networks, and propose a novel LRP rule tailored to BatchNorm layers. Our quantitative evaluation shows the advantage of the novel rule for BatchNorm layers and its wide applicability to common deep neural network architectures. As an aside, we demonstrate that one observation made through LRP analysis can be used to modify a ResNet for faster initial training convergence.
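To make the \(\varepsilon\)-rule mentioned in the abstract concrete, the following is a minimal NumPy sketch of how the generic \(\varepsilon\)-rule could be applied to a BatchNorm layer at inference time. It assumes the layer is treated as a per-feature affine map \(y = s \cdot x + t\) with \(s = \gamma / \sqrt{\sigma^2 + \epsilon_{BN}}\) and \(t = \beta - \mu s\), and that inputs have shape (batch, features). The function names, parameter names, and default constants are hypothetical and chosen for illustration; this is not the novel rule proposed in the paper, which the abstract does not specify.

```python
import numpy as np


def batchnorm_affine(gamma, beta, running_mean, running_var, bn_eps=1e-5):
    """Fold inference-mode BatchNorm into a per-feature scale and shift.

    Hypothetical helper: y = scale * x + shift reproduces
    gamma * (x - mean) / sqrt(var + bn_eps) + beta.
    """
    scale = gamma / np.sqrt(running_var + bn_eps)
    shift = beta - running_mean * scale
    return scale, shift


def lrp_epsilon_batchnorm(x, relevance_out, scale, shift, lrp_eps=1e-6):
    """Redistribute relevance through a BatchNorm layer with the epsilon-rule.

    Each output has exactly one input, so the rule reduces to an
    element-wise ratio: the input's contribution (scale * x) divided by
    the stabilized output (scale * x + shift). The share attributed to
    the shift term is absorbed rather than passed to the input.
    """
    z = scale * x                      # contribution of the input alone
    y = z + shift                      # BatchNorm output
    # Stabilizer pushes the denominator away from zero, keeping its sign.
    stabilizer = lrp_eps * np.where(y >= 0, 1.0, -1.0)
    return z / (y + stabilizer) * relevance_out


if __name__ == "__main__":
    scale, shift = batchnorm_affine(
        gamma=np.array([1.0, 2.0]),
        beta=np.array([0.0, 3.0]),
        running_mean=np.array([0.0, 1.0]),
        running_var=np.array([1.0, 4.0]),
        bn_eps=0.0,
    )
    x = np.array([[2.0, 2.0]])
    relevance_in = lrp_epsilon_batchnorm(x, np.ones_like(x), scale, shift)
    print(relevance_in)
```

In this sketch, a feature with zero shift passes its relevance through almost unchanged, while a feature with a large shift relative to its scaled input passes on only a fraction. How the shift term is handled is one of the design choices that differs between LRP rules for BatchNorm.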

Notes

  1. https://github.com/tonylins/pytorch-mobilenet-v2

  2. https://pytorch.org/docs/stable/torchvision/models.html#id3

  3. https://pytorch.org/docs/stable/torchvision/models.html#id5

  4. https://github.com/Cadene/pretrained-models.pytorch/blob/master/pretrainedmodels/models/inceptionresnetv2.py

Author information

Corresponding author

Correspondence to Lucas Y. W. Hui.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Hui, L.Y.W., Binder, A. (2019). BatchNorm Decomposition for Deep Neural Network Interpretation. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2019. Lecture Notes in Computer Science, vol 11507. Springer, Cham. https://doi.org/10.1007/978-3-030-20518-8_24

  • DOI: https://doi.org/10.1007/978-3-030-20518-8_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20517-1

  • Online ISBN: 978-3-030-20518-8

  • eBook Packages: Computer Science, Computer Science (R0)
