Interpreting Layered Neural Networks via Hierarchical Modular Representation

Watanabe, Chihiro

doi:10.1007/978-3-030-36802-9_40

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1143)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Interpreting the prediction mechanism of complex models is currently one of the most important tasks in machine learning, especially for layered neural networks, which have achieved high predictive performance on various practical data sets. To reveal the global structure of a trained neural network in an interpretable way, a series of clustering methods have been proposed that decompose the units into clusters according to the similarity of their inference roles. These studies left two main problems: (1) there is no prior knowledge about the optimal resolution of the decomposition, that is, the appropriate number of clusters, and (2) there was no method for determining whether the outputs of each cluster correlate positively or negatively with the input and output unit values. To solve these problems, we propose a method for obtaining a hierarchical modular representation of a layered neural network. Applying a hierarchical clustering method to a trained network reveals a tree-structured relationship among hidden layer units, based on feature vectors defined by their correlations with the input and output unit values.
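As a rough illustration of the idea in the abstract, the following minimal sketch builds a correlation-based feature vector for each hidden unit and then clusters the units agglomeratively. It is not the paper's implementation: the use of Pearson correlation, Euclidean distance, and average linkage are assumptions made here for simplicity, and the function names (`pearson`, `feature_vectors`, `agglomerate`) and the toy data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def feature_vectors(hidden, inputs, outputs):
    """One vector per hidden unit: its correlation with every input and output unit.

    Each argument is a list of per-unit activation series (one value per sample).
    """
    return [[pearson(h, u) for u in inputs + outputs] for h in hidden]


def agglomerate(vectors):
    """Average-linkage agglomerative clustering on Euclidean distance (assumed choice).

    Returns the merge history as (members_a, members_b, distance) tuples,
    which together describe a binary tree over the hidden units.
    """
    clusters = [(i,) for i in range(len(vectors))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(math.dist(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Toy data (hypothetical): two inputs, four hidden units, one output.
rng = random.Random(0)
n_samples = 200
x1 = [rng.uniform(-1, 1) for _ in range(n_samples)]
x2 = [rng.uniform(-1, 1) for _ in range(n_samples)]


def noise():
    return rng.gauss(0, 0.1)


hidden = [
    [v + noise() for v in x1],   # h0 follows x1
    [v + noise() for v in x1],   # h1 follows x1
    [-v + noise() for v in x2],  # h2 follows -x2
    [-v + noise() for v in x2],  # h3 follows -x2
]
y = [a - b for a, b in zip(x1, x2)]

features = feature_vectors(hidden, [x1, x2], [y])
merges = agglomerate(features)
for a, b, d in merges:
    print(f"merge {a} + {b} at distance {d:.3f}")
```

In this sketch, the sign of each feature-vector entry indicates whether a hidden unit correlates positively or negatively with a given input or output unit, and cutting the merge tree at different depths yields decompositions at different resolutions, so the number of clusters need not be fixed in advance.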

Author information

Authors and Affiliations

NTT Communication Science Laboratories, 3-1, Morinosato Wakamiya, Atsugi-shi, Kanagawa Prefecture, Japan
Chihiro Watanabe

Corresponding author

Correspondence to Chihiro Watanabe.

Editor information

Editors and Affiliations

Australian National University, Canberra, ACT, Australia
Tom Gedeon
Murdoch University, Murdoch, WA, Australia
Kok Wai Wong
Kyungpook National University, Daegu, Korea (Republic of)
Minho Lee

About this paper

Cite this paper

Watanabe, C. (2019). Interpreting Layered Neural Networks via Hierarchical Modular Representation. In: Gedeon, T., Wong, K., Lee, M. (eds) Neural Information Processing. ICONIP 2019. Communications in Computer and Information Science, vol 1143. Springer, Cham. https://doi.org/10.1007/978-3-030-36802-9_40

DOI: https://doi.org/10.1007/978-3-030-36802-9_40
Published: 05 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36801-2
Online ISBN: 978-3-030-36802-9
eBook Packages: Computer Science, Computer Science (R0)
