Inference, Learning and Attention Mechanisms that Exploit and Preserve Sparsity in CNNs

  • Timo Hackel
  • Mikhail Usvyatsov
  • Silvano Galliani
  • Jan Dirk Wegner
  • Konrad Schindler
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11269)


While CNNs naturally lend themselves to densely sampled data, and sophisticated implementations are available, they lack the ability to efficiently process sparse data. In this work we introduce a suite of tools that exploit sparsity in both the feature maps and the filter weights, and thereby allow for significantly lower memory footprints and computation times than the conventional dense framework, when processing data with a high degree of sparsity. Our scheme provides (i) an efficient GPU implementation of a convolution layer based on direct, sparse convolution; (ii) a filter step within the convolution layer, which we call attention, that prevents fill-in, i.e., the tendency of convolution to rapidly decrease sparsity, and guarantees an upper bound on the computational resources; and (iii) an adaptation of back-propagation that makes it possible to combine our approach with standard learning frameworks, while still exploiting sparsity in the data and the model.
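The interplay of points (i) and (ii) can be illustrated with a small CPU sketch. It is not the authors' GPU implementation: the function names `sparse_conv2d` and `keep_top_k`, the coordinate-dictionary storage, and the choice of keeping the k largest-magnitude responses are all assumptions made for illustration, since the abstract does not state the selection criterion. The sketch shows how direct convolution over non-zero inputs and non-zero filter taps causes fill-in, and how a fixed budget k caps the number of non-zeros passed on to the next layer.

```python
from collections import defaultdict


def sparse_conv2d(features, weights):
    """Direct sparse convolution (cross-correlation form) on an unbounded grid.

    features: {(y, x): value} holding only the non-zero input activations.
    weights:  {(dy, dx): w} holding only the non-zero filter taps,
              with offsets relative to the filter centre.
    Computes out[y, x] = sum over (dy, dx) of w[dy, dx] * in[y + dy, x + dx],
    touching only non-zero pairs: cost is len(features) * len(weights).
    """
    out = defaultdict(float)
    for (y, x), v in features.items():
        for (dy, dx), w in weights.items():
            # Scatter each non-zero input into the outputs it contributes to.
            out[(y - dy, x - dx)] += w * v
    # Drop exact zeros so the result stays a sparse representation.
    return {pos: r for pos, r in out.items() if r != 0.0}


def keep_top_k(responses, k):
    """Hypothetical 'attention' filter: keep the k largest-magnitude responses.

    Ties are broken by coordinate so the result is deterministic.
    """
    ranked = sorted(responses.items(), key=lambda item: (-abs(item[1]), item[0]))
    return dict(ranked[:k])


# Two non-zero inputs and a cross-shaped filter with five non-zero taps.
features = {(5, 5): 2.0, (5, 6): 1.0}
weights = {(0, 0): 1.0, (0, 1): 0.5, (0, -1): 0.5, (1, 0): 0.5, (-1, 0): 0.5}

filled = sparse_conv2d(features, weights)  # fill-in: 2 non-zeros become 8
kept = keep_top_k(filled, k=2)             # budget restores 2 non-zeros
```

With a budget of k non-zeros per layer, the next convolution performs at most k times the number of non-zero filter taps multiply-adds, which is one way to read the upper bound on computational resources mentioned above.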



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Timo Hackel (1)
  • Mikhail Usvyatsov (1)
  • Silvano Galliani (1)
  • Jan Dirk Wegner (1)
  • Konrad Schindler (1)

  1. Photogrammetry and Remote Sensing, ETH Zürich, Zürich, Switzerland
