
Human-Understandable Decision Making for Visual Recognition

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12714)

Abstract

Deep neural networks have achieved substantial success in many tasks. However, a large gap remains between the operating mechanisms of deep learning models and human-understandable decision making, so humans cannot fully trust the predictions these models make. To date, little work has been done on aligning the behavior of deep learning models with human perception in order to train a human-understandable model. To fill this gap, we propose a new framework that trains a deep neural network by incorporating a prior on human perception into the learning process. Our model mimics the process of perceiving conceptual parts in an image and assessing their relative contributions to the final recognition. We evaluate the effectiveness of our model on two classical visual recognition tasks. The experimental results and analysis confirm that our model not only provides interpretable explanations for its predictions but also maintains competitive recognition accuracy.
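To make the abstract's part-based idea concrete, here is a minimal, hypothetical sketch of how a contribution-weighted, part-based classifier head could be structured: attention maps carve backbone features into conceptual parts, each part is classified on its own, and a learned gate scores how much each part contributes to the final prediction. This is illustrative only, not the authors' implementation; every module name, dimension, and design choice below is an assumption.

```python
# Illustrative sketch (not the paper's actual code): a head that extracts
# K "conceptual part" features from a CNN feature map, classifies each part
# separately, and combines the part-level logits with learned contribution
# weights, so every prediction can be traced back to its parts.
import torch
import torch.nn as nn

class PartBasedClassifier(nn.Module):
    def __init__(self, backbone_channels=512, num_parts=4, num_classes=200):
        super().__init__()
        # One 1x1-conv attention map per conceptual part.
        self.part_attention = nn.Conv2d(backbone_channels, num_parts, kernel_size=1)
        # A separate classifier over each part's pooled features.
        self.part_classifiers = nn.ModuleList(
            [nn.Linear(backbone_channels, num_classes) for _ in range(num_parts)]
        )
        # A gate that scores each part's relative contribution.
        self.contribution_gate = nn.Linear(backbone_channels * num_parts, num_parts)

    def forward(self, features):  # features: (B, C, H, W) from any CNN backbone
        # Spatial softmax turns each part map into a distribution over locations.
        attn = torch.softmax(self.part_attention(features).flatten(2), dim=-1)  # (B, K, HW)
        flat = features.flatten(2)                                              # (B, C, HW)
        # Attention-weighted pooling: one C-dim feature vector per part.
        part_feats = torch.einsum('bkn,bcn->bkc', attn, flat)                   # (B, K, C)
        weights = torch.softmax(
            self.contribution_gate(part_feats.flatten(1)), dim=-1)              # (B, K)
        part_logits = torch.stack(
            [clf(part_feats[:, k]) for k, clf in enumerate(self.part_classifiers)],
            dim=1)                                                              # (B, K, classes)
        # Final logits are a contribution-weighted sum over parts.
        return (weights.unsqueeze(-1) * part_logits).sum(dim=1), weights
```

The returned `weights` vector is what makes the decision human-readable: it reports how strongly each conceptual part influenced the prediction and can be inspected or visualized alongside the per-part attention maps.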

Acknowledgments

This work is partially supported by ARC under Grants DP180100106 and DP200101328. Xiaowei Zhou is supported by a Data61 Student Scholarship from CSIRO.

Author information

Corresponding author

Correspondence to Xiaowei Zhou.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhou, X., Yin, J., Tsang, I., Wang, C. (2021). Human-Understandable Decision Making for Visual Recognition. In: Karlapalem, K., et al. Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science (LNAI), vol. 12714. Springer, Cham. https://doi.org/10.1007/978-3-030-75768-7_14

  • DOI: https://doi.org/10.1007/978-3-030-75768-7_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-75767-0

  • Online ISBN: 978-3-030-75768-7

  • eBook Packages: Computer Science, Computer Science (R0)
