Abstract
With the rise of deep neural networks, convolutional neural networks (CNNs) have shown superior performance on many computer vision recognition tasks. Convolution is one of the most efficient ways to extract detailed features from an image, whereas deconvolution is mostly used in semantic segmentation and significance detection to obtain contour information and is rarely used for image classification. In this paper, we propose a novel network named the bi-branch deconvolution-based convolutional neural network (BB-deconvNet), which is constructed mainly by stacking a simple proposed module named Zoom. The Zoom module has two branches that extract multi-scale features from the same feature map. In particular, deconvolution is introduced into one of the branches, where zooming the learned feature maps provides features distinct from those of regular convolution. To verify the effectiveness of the proposed network, we conduct several experiments on three object classification benchmarks (CIFAR-10, CIFAR-100, and SVHN). BB-deconvNet shows encouraging performance compared with other state-of-the-art deep CNNs.
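The abstract does not give the layer configuration of the Zoom module, so the following is only an illustrative sketch of how a two-branch block could keep both branches at the same spatial size while one branch passes through a deconvolution. The kernel sizes, strides, padding, and the helper names `conv_out`, `deconv_out`, and `zoom_shapes` are all assumptions made for this example and are not taken from the paper. Only the two output-size formulas are standard.

```python
def conv_out(n, k, s=1, p=0):
    """Spatial size after a regular convolution (kernel k, stride s, padding p)."""
    return (n + 2 * p - k) // s + 1


def deconv_out(n, k, s=1, p=0):
    """Spatial size after a transposed convolution (deconvolution)."""
    return (n - 1) * s - 2 * p + k


def zoom_shapes(n):
    """Return (branch_a, zoomed, branch_b) spatial sizes for an n x n input.

    Hypothetical layer choices, NOT the paper's configuration.
    """
    # Branch A (assumed): one 3x3 convolution, stride 1, padding 1.
    branch_a = conv_out(n, k=3, s=1, p=1)
    # Branch B (assumed): a 2x2 deconvolution with stride 2 enlarges the map,
    # then a 3x3 convolution with stride 2 and padding 1 shrinks it back.
    zoomed = deconv_out(n, k=2, s=2, p=0)
    branch_b = conv_out(zoomed, k=3, s=2, p=1)
    return branch_a, zoomed, branch_b


if __name__ == "__main__":
    # 32 is the side length of CIFAR-10, CIFAR-100, and SVHN images.
    print(zoom_shapes(32))
```

Under these assumed settings both branches return maps of the input size, so their outputs could be combined channel-wise, while the second branch sees the features at twice the resolution in between.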
Acknowledgments
This work is supported by the Natural Science Foundation of China (Grants 61572214 and U1536203) and the Independent Innovation Research Fund sponsored by Huazhong University of Science and Technology (Project No. 2016YXMS089).
Cite this article
Guo, J., Yuan, C., Zhao, Z. et al. Bi-branch deconvolution-based convolutional neural network for image classification. Multimed Tools Appl 77, 30233–30250 (2018). https://doi.org/10.1007/s11042-018-6130-2
Keywords
- Image classification
- Bi-branch convolutional neural network
- Deconvolution
- Multi-scale