Abstract
A problem with deep convolutional neural networks is that the number of channels in the feature maps often increases with the depth of the network, which can lead to a dramatic growth in the number of parameters and serious over-fitting. The \(1\times 1\) convolutional layer is a popular way to reduce the number of channels: it offers a channel-wise parametric pooling, often called feature map pooling or a projection layer. However, the \(1\times 1\) convolutional layer itself has numerous parameters that must be learned. Inspired by the \(1\times 1\) convolutional layer, we propose channel max pooling, which reduces the feature space by compressing multiple feature maps into one, selecting the maximum value at each spatial location across the feature maps. The advantages of the proposed method are twofold: first, it reduces the number of channels in the feature maps while retaining their salient features; second, it is non-parametric and thus adds no parameters. Experimental results on three image classification datasets show that the proposed method achieves good performance while significantly reducing the number of parameters in the neural network.
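The operation described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract does not specify how feature maps are grouped, so the adjacent-channel grouping and the pooling ratio `r` below are assumptions.

```python
import numpy as np

def channel_max_pool(x, r):
    """Sketch of channel max pooling: compress each group of `r` adjacent
    feature maps into one by taking the element-wise maximum at every
    spatial location. `x` has shape (C, H, W); C must be divisible by r.
    The operation has no learnable parameters, unlike a 1x1 convolution."""
    c, h, w = x.shape
    assert c % r == 0, "channel count must be divisible by the pooling ratio"
    # Reshape to (C // r, r, H, W), then take the max over each group of r maps.
    return x.reshape(c // r, r, h, w).max(axis=1)

# Example: 4 feature maps of size 2x2 pooled with ratio 2 -> 2 feature maps.
x = np.arange(16, dtype=np.float32).reshape(4, 2, 2)
y = channel_max_pool(x, 2)
print(y.shape)  # (2, 2, 2)
```

Note the contrast with a \(1\times 1\) convolution, which would need \(C_{in}\times C_{out}\) learned weights to perform the same channel reduction; the max over each group is parameter-free.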
Acknowledgements
This work was supported in part by the National Key Research and Development Program of China under Grant 2018YFC0807205, in part by the National Natural Science Foundation of China (NSFC) under Grants 61773071 and 61922015, in part by the Beijing Nova Program No. Z171100001117049, in part by the Beijing Nova Program Interdisciplinary Cooperation Project No. Z181100006218137, in part by the Fundamental Research Funds for the Central Universities No. 2018XKJC02, in part by a scholarship from the China Scholarship Council (CSC) under Grant No. 201906470049, and in part by the BUPT Excellent Ph.D. Students Foundation Nos. CX2019109 and XTCX201804.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Cheng, L., Chang, D., Xie, J., Ma, R., Wu, C., Ma, Z. (2019). Channel Max Pooling for Image Classification. In: Cui, Z., Pan, J., Zhang, S., Xiao, L., Yang, J. (eds) Intelligence Science and Big Data Engineering. Visual Data Engineering. IScIDE 2019. Lecture Notes in Computer Science, vol 11935. Springer, Cham. https://doi.org/10.1007/978-3-030-36189-1_23
Print ISBN: 978-3-030-36188-4
Online ISBN: 978-3-030-36189-1