Channel Max Pooling for Image Classification

  • Lu Cheng
  • Dongliang Chang (Email author)
  • Jiyang Xie
  • Rongliang Ma
  • Chunsheng Wu
  • Zhanyu Ma
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11935)

Abstract

A problem of deep convolutional neural networks is that the number of channels in the feature maps often increases with the depth of the network. This growth can result in a dramatic increase in the number of parameters and serious over-fitting. The \(1\times 1\) convolutional layer is a popular way to decrease the number of channels in the feature maps by offering a channel-wise parametric pooling, often called feature map pooling or a projection layer. However, the \(1\times 1\) convolutional layer introduces numerous parameters that need to be learned. Inspired by the \(1\times 1\) convolutional layer, we propose a channel max pooling, which reduces the feature space by compressing multiple feature maps into one feature map via selecting the maximum values at the same locations across different feature maps. The advantages of the proposed method are twofold: first, it decreases the number of channels in the feature maps whilst retaining their salient features; second, it is non-parametric and thus adds no parameters. Experimental results on three image classification datasets show that the proposed method achieves good performance and significantly reduces the number of parameters of the neural network.
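As an illustration of the operation described above, the following NumPy sketch compresses groups of feature maps by taking the element-wise maximum at each spatial location. The exact grouping scheme (consecutive channels, a fixed pooling ratio) is an assumption for illustration, not necessarily the paper's configuration.

```python
import numpy as np

def channel_max_pool(x, ratio):
    """Channel max pooling (illustrative sketch, assuming consecutive-
    channel grouping): compress every `ratio` feature maps into one by
    taking the maximum value at each spatial location.

    x: feature maps of shape (channels, height, width);
    `channels` must be divisible by `ratio`.
    """
    c, h, w = x.shape
    assert c % ratio == 0, "channel count must be divisible by the pooling ratio"
    # Group the channel axis into (c // ratio, ratio), then take the
    # maximum within each group, leaving (c // ratio, h, w).
    return x.reshape(c // ratio, ratio, h, w).max(axis=1)

# Example: 4 channels of 2x2 features compressed to 2 channels (ratio = 2).
fmap = np.arange(16, dtype=float).reshape(4, 2, 2)
out = channel_max_pool(fmap, 2)
print(out.shape)  # (2, 2, 2)
```

Because the maximum is taken rather than a learned weighted sum (as in a \(1\times 1\) convolution), no trainable parameters are involved.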

Keywords

Machine learning · Image classification · Deep neural networks · Channel-wise pooling

Notes

Acknowledgements

This work was supported in part by the National Key Research and Development Program of China under Grant 2018YFC0807205, in part by the National Natural Science Foundation of China (NSFC) No. 61773071, 61922015, in part by the Beijing Nova Program No. Z171100001117049, in part by the Beijing Nova Program Interdisciplinary Cooperation Project No. Z181100006218137, in part by the Fundamental Research Funds for the Central University No. 2018XKJC02, in part by the scholarship from China Scholarship Council (CSC) under Grant CSC No. 201906470049, and in part by the BUPT Excellent Ph.D. Students Foundation No. CX2019109, XTCX201804.


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Lu Cheng (1)
  • Dongliang Chang (1) (Email author)
  • Jiyang Xie (1)
  • Rongliang Ma (2)
  • Chunsheng Wu (3)
  • Zhanyu Ma (1)
  1. Beijing University of Posts and Telecommunications, Beijing, China
  2. Institute of Forensic Science, Ministry of Public Security, Beijing, China
  3. Forensic Science Institution of Beijing Public Security Bureau, Beijing, China