
Deep Feature Factorization for Concept Discovery

  • Edo Collins
  • Radhakrishna Achanta
  • Sabine Süsstrunk
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11218)

Abstract

We propose Deep Feature Factorization (DFF), a method capable of localizing similar semantic concepts within an image or a set of images. We use DFF to gain insight into a deep convolutional neural network’s learned features, where we detect hierarchical cluster structures in feature space. This is visualized as heat maps, which highlight semantically matching regions across a set of images, revealing what the network ‘perceives’ as similar. DFF can also be used to perform co-segmentation and co-localization, and we report state-of-the-art results on these tasks.

Keywords

Neural network interpretability; Part co-segmentation; Co-segmentation; Co-localization; Non-negative matrix factorization
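
The abstract does not spell out the factorization itself, so the following is only a minimal sketch of one way the idea could be realized, not the authors' implementation. It assumes that the activations of a convolutional layer are non-negative (for example, taken after a ReLU), that they are flattened so each spatial position of each image becomes one row, and that non-negative matrix factorization is computed with the standard multiplicative updates for the Frobenius objective. The function names `nmf` and `factor_heatmaps`, the toy random tensor standing in for real network features, and all parameter values are hypothetical.

```python
import numpy as np


def nmf(A, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate non-negative A (m x c) as W @ H with W (m x k), H (k x c).

    Uses multiplicative updates for the squared Frobenius error.
    All entries of W and H stay non-negative by construction.
    """
    rng = np.random.default_rng(seed)
    m, c = A.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, c)) + eps
    for _ in range(n_iter):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H


def factor_heatmaps(features, k):
    """Factor activations of shape (n, h, w, c) into k spatial heat maps.

    Assumption: features are non-negative. Every spatial position of
    every image is one row of the matrix being factorized, so the k
    factors are shared across the whole image set.
    """
    n, h, w, c = features.shape
    A = features.reshape(n * h * w, c)
    W, H = nmf(A, k)
    # Column j of W, reshaped, is the heat map of factor j on each image.
    return W.reshape(n, h, w, k), H


# Toy stand-in for real CNN activations: 2 images, 4x4 positions, 8 channels.
rng = np.random.default_rng(1)
features = rng.random((2, 4, 4, 8))
heatmaps, factors = factor_heatmaps(features, k=3)
print(heatmaps.shape, factors.shape)
```

Because the factors are shared across all images in this sketch, a given heat-map index refers to the same factor in every image, which is what would let matching regions be compared across a set. Upsampling the maps to image resolution and choosing `k` are left out here.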

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Edo Collins (1)
  • Radhakrishna Achanta (2)
  • Sabine Süsstrunk (1)
  1. School of Computer and Communication Sciences, EPFL, Lausanne, Switzerland
  2. Swiss Data Science Center, EPFL and ETHZ, Zurich, Switzerland
