Improved inception-residual convolutional neural network for object recognition

  • Md Zahangir Alom
  • Mahmudul Hasan
  • Chris Yakopcic
  • Tarek M. Taha
  • Vijayan K. Asari
Original Article

Abstract

Machine learning and computer vision have driven many of the greatest advances in the modeling of Deep Convolutional Neural Networks (DCNNs). Nowadays, most research is focused on improving recognition accuracy with better DCNN models and learning approaches. The recurrent convolutional approach is not applied very often, outside of a few DCNN architectures. On the other hand, Inception-v4 and Residual networks have quickly become popular in the computer vision community. In this paper, we introduce a new DCNN model called the Inception Recurrent Residual Convolutional Neural Network (IRRCNN), which utilizes the power of the Recurrent Convolutional Neural Network (RCNN), the Inception network, and the Residual network. This approach improves the recognition accuracy of the Inception-residual network with the same number of network parameters. In addition, the proposed architecture generalizes the Inception network, the RCNN, and the Residual network with significantly improved training accuracy. We have empirically evaluated the performance of the IRRCNN model on different benchmarks including CIFAR-10, CIFAR-100, TinyImageNet-200, and CU3D-100. The experimental results show higher recognition accuracy than most popular DCNN models, including the RCNN. We have also compared the IRRCNN approach against its Equivalent Inception Network (EIN) and Equivalent Inception Residual Network (EIRN) counterparts on the CIFAR-100 dataset. We report improvements of around 4.53%, 4.49%, and 3.56% in classification accuracy compared with the RCNN, EIN, and EIRN, respectively, on CIFAR-100. Furthermore, experiments on the TinyImageNet-200 and CU3D-100 datasets show that the IRRCNN provides better testing accuracy than the Inception Recurrent CNN, the EIN, the EIRN, Inception-v3, and Wide Residual Networks.
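The building block the abstract describes combines three ideas: recurrent convolutional layers (convolutions whose weights are shared across a few unrolled time steps), inception-style parallel branches, and a residual skip connection around the merged branches. The Keras sketch below is a minimal illustration of one such block, assuming two recurrent steps, two branches (1x1 and 3x3), and a particular placement of batch normalization and ReLU; these are our assumptions for illustration, not the paper's exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers


def recurrent_conv(x, filters, kernel_size, t=2):
    # Recurrent convolutional layer (RCL): a feed-forward convolution plus a
    # recurrent convolution whose weights are shared across the t unrolled steps
    # (reusing the same Conv2D instance shares its weights).
    ff = layers.Conv2D(filters, kernel_size, padding="same")(x)
    rec = layers.Conv2D(filters, kernel_size, padding="same")  # shared instance
    state = ff
    for _ in range(t):
        state = layers.add([ff, rec(state)])
        state = layers.Activation("relu")(layers.BatchNormalization()(state))
    return state


def irrcnn_block(x, branch_filters=32, t=2):
    # Inception-style parallel RCL branches, concatenated, projected back to the
    # input channel count, and added to the block input (residual connection).
    b1 = recurrent_conv(x, branch_filters, 1, t)  # 1x1 branch
    b3 = recurrent_conv(x, branch_filters, 3, t)  # 3x3 branch
    merged = layers.concatenate([b1, b3])
    merged = layers.Conv2D(x.shape[-1], 1, padding="same")(merged)
    return layers.Activation("relu")(layers.add([x, merged]))


# Toy usage on CIFAR-sized inputs:
inputs = tf.keras.Input(shape=(32, 32, 3))
x = layers.Conv2D(64, 3, padding="same", activation="relu")(inputs)
x = irrcnn_block(x)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(100, activation="softmax")(x)  # e.g. CIFAR-100
model = tf.keras.Model(inputs, outputs)
```

Stacking several such blocks between pooling or transition layers yields an IRRCNN-style network; note that the residual sum requires the merged branch output to match the block input's channel count, hence the final 1x1 projection inside the block.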

Keywords

DCNN · RCNN · Inception network · Residual network · Deep learning


Copyright information

© The Natural Computing Applications Forum 2018

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, University of Dayton, Dayton, USA
  2. Comcast Labs, Washington, USA
